From perbu at varnish-cache.org  Thu Jun  6 08:10:14 2013
From: perbu at varnish-cache.org (Per Buer)
Date: Thu, 06 Jun 2013 10:10:14 +0200
Subject: [master] 59b4902 Rewrite Vary. Use A-L instead of A-E. A-E is handled by varnish anyway and this is confusing
Message-ID:

commit 59b490254d4edc7d87a7c09d24ad7a2ac3a9e6f4
Author: Per Buer
Date:   Thu Jun 6 10:10:09 2013 +0200

    Rewrite Vary. Use A-L instead of A-E. A-E is handled by varnish anyway and this is confusing

diff --git a/doc/sphinx/users-guide/vary.rst b/doc/sphinx/users-guide/vary.rst
index 31c3bde..a171992 100644
--- a/doc/sphinx/users-guide/vary.rst
+++ b/doc/sphinx/users-guide/vary.rst
@@ -3,39 +3,43 @@
 HTTP Vary
 ---------
 
+_HTTP Vary is not a trivial concept. It is by far the most
+misunderstood HTTP header._
+
 The Vary header is sent by the web server to indicate what makes a
 HTTP object Vary. This makes a lot of sense with headers like
-Accept-Encoding. When a server issues a "Vary: Accept-Encoding" it
-tells Varnish that its needs to cache a separate version for every
-different Accept-Encoding that is coming from the clients. So, if a
-clients only accepts gzip encoding Varnish won't serve the version of
-the page encoded with the deflate encoding.
-
-The problem is that the Accept-Encoding field contains a lot of
-different encodings. If one browser sends::
-
-  Accept-Encoding: gzip,deflate
-
-And another one sends::
-
-  Accept-Encoding: deflate,gzip
-
-Varnish will keep two variants of the page requested due to the
-different Accept-Encoding headers. Normalizing the accept-encoding
-header will sure that you have as few variants as possible. The
-following VCL code will normalize the Accept-Encoding headers::
-
-    if (req.http.Accept-Encoding) {
-        if (req.url ~ "\.(jpg|png|gif|gz|tgz|bz2|tbz|mp3|ogg)$") {
-            # No point in compressing these
-            remove req.http.Accept-Encoding;
-        } elsif (req.http.Accept-Encoding ~ "gzip") {
-            set req.http.Accept-Encoding = "gzip";
-        } elsif (req.http.Accept-Encoding ~ "deflate") {
-            set req.http.Accept-Encoding = "deflate";
+Accept-Language. When a server issues a "Vary: Accept-Accept" it tells
+Varnish that its needs to cache a separate version for every different
+Accept-Language that is coming from the clients.
+
+So, if a client says it accepts the languages "en-us, en-uk" Varnish
+will serve a different version to a client that says it accepts the
+languages "da, de".
+
+Please note that the headers that Vary refer to need to match
+_exactly_ for there to be a match. So Varnish will keep two copies of
+a page if one of them was created for "en-us, en-uk" and the other for
+"en-us,en-uk".
+
+To achieve a high hitrate whilst using Vary is there therefor crucial
+to normalize the headers the backends varies on. Remember, just a
+differce in case can force different cache entries.
+
+
+The following VCL code will normalize the Accept-Language headers, to
+one of either "en","de" or "fr"::
+
+    if (req.http.Accept-Language) {
+        if (req.http.Accept-Language ~ "en") {
+            set req.http.Accept-Language = "en";
+        } elsif (req.http.Accept-Language ~ "de") {
+            set req.http.Accept-Language = "de";
+        } elsif (req.http.Accept-Language ~ "fr") {
+            set req.http.Accept-Language = "fr";
         } else {
-            # unknown algorithm
-            remove req.http.Accept-Encoding;
+            # unknown language. Remove the accept-language header and
+            # use the backend default.
+            remove req.http.Accept-Language
         }
     }

From perbu at varnish-cache.org  Thu Jun  6 08:23:19 2013
From: perbu at varnish-cache.org (Per Buer)
Date: Thu, 06 Jun 2013 10:23:19 +0200
Subject: [master] 934fd59 Language and markup polish. Feedback from dharrigan and bjorn
Message-ID:

commit 934fd591e301fbbf4e412bd8cb3fbaf50de22b5d
Author: Per Buer
Date:   Thu Jun 6 10:23:14 2013 +0200

    Language and markup polish.
    
    Feedback from dharrigan and bjorn

diff --git a/doc/sphinx/users-guide/vary.rst b/doc/sphinx/users-guide/vary.rst
index a171992..0bf5638 100644
--- a/doc/sphinx/users-guide/vary.rst
+++ b/doc/sphinx/users-guide/vary.rst
@@ -3,8 +3,8 @@
 HTTP Vary
 ---------
 
-_HTTP Vary is not a trivial concept. It is by far the most
-misunderstood HTTP header._
+*HTTP Vary is not a trivial concept. It is by far the most
+misunderstood HTTP header.*
 
 The Vary header is sent by the web server to indicate what makes a
 HTTP object Vary. This makes a lot of sense with headers like
@@ -12,19 +12,22 @@ Accept-Language. When a server issues a "Vary: Accept-Accept" it tells
 Varnish that its needs to cache a separate version for every different
 Accept-Language that is coming from the clients.
 
-So, if a client says it accepts the languages "en-us, en-uk" Varnish
-will serve a different version to a client that says it accepts the
-languages "da, de".
+If two clients say they accept the languages "en-us, en-uk" and "da, de"
+respectively, Varnish will cache and serve two different versions of
+the page.
+
+So, if a client says it accepts the languages "en-us, en-uk", Varnish
+will serve a different version to another client that says it accepts
+the languages "da, de".
 
 Please note that the headers that Vary refer to need to match
-_exactly_ for there to be a match. So Varnish will keep two copies of
+*exactly* for there to be a match. So Varnish will keep two copies of
 a page if one of them was created for "en-us, en-uk" and the other for
 "en-us,en-uk".
 
 To achieve a high hitrate whilst using Vary is there therefor crucial
 to normalize the headers the backends varies on. Remember, just a
-differce in case can force different cache entries.
-
+difference in case can force different cache entries.
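The exact-match behaviour these patches document can be illustrated outside VCL. A minimal C sketch (hypothetical helper names, not Varnish internals) of why un-normalized Accept-Language values multiply cache entries, and how the normalization from the VCL above collapses them:

```c
#include <assert.h>
#include <string.h>

/* A cached variant only matches when the stored copy of the varied
 * header is byte-for-byte identical to the incoming one, so
 * "en-us, en-uk" and "en-us,en-uk" produce two cache entries. */
static int
vary_match(const char *stored, const char *incoming)
{
	return (strcmp(stored, incoming) == 0);
}

/* Mirror of the VCL normalization: collapse Accept-Language to one
 * canonical token so at most four variants can exist per object.
 * Returns NULL for "unknown language: drop the header". */
static const char *
normalize_lang(const char *al)
{
	if (strstr(al, "en") != NULL)
		return ("en");
	if (strstr(al, "de") != NULL)
		return ("de");
	if (strstr(al, "fr") != NULL)
		return ("fr");
	return (NULL);
}
```

After normalization both "en-us, en-uk" and "en-us,en-uk" map to "en" and therefore share one cache entry.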
 
 The following VCL code will normalize the Accept-Language headers, to
 one of either "en","de" or "fr"::

From phk at varnish-cache.org  Thu Jun  6 08:35:11 2013
From: phk at varnish-cache.org (Poul-Henning Kamp)
Date: Thu, 06 Jun 2013 10:35:11 +0200
Subject: [master] f3eedb2 Subtly change this test-case.
Message-ID:

commit f3eedb220246c5e6e1768ae752af52c0387084a1
Author: Poul-Henning Kamp
Date:   Thu Jun 6 08:34:08 2013 +0000

    Subtly change this test-case.
    
    We may have to retire it, it is quite flakey overall for reasons
    that are not worth the bother to work around.

diff --git a/bin/varnishtest/tests/r01030.vtc b/bin/varnishtest/tests/r01030.vtc
index 4f95d70..8b7d403 100644
--- a/bin/varnishtest/tests/r01030.vtc
+++ b/bin/varnishtest/tests/r01030.vtc
@@ -39,7 +39,7 @@ client c1 {
 	expect resp.status == 201
 } -run
-delay 0.1
+#delay 0.1
 varnish v1 -expect bans_tests_tested == 0
 delay 1.0
@@ -57,7 +57,7 @@ client c2 {
 	expect resp.status == 201
 } -run
-delay 0.1
+#delay 0.1
 varnish v1 -expect bans_tests_tested == 1
 delay 1.1

From phk at varnish-cache.org  Thu Jun  6 08:35:28 2013
From: phk at varnish-cache.org (Poul-Henning Kamp)
Date: Thu, 06 Jun 2013 10:35:28 +0200
Subject: [master] fb2be31 Send the fetch-work to a different thread.
Message-ID:

commit fb2be3120750b2ad2a9ac77b0ea16dd56ca2b140
Author: Poul-Henning Kamp
Date:   Thu Jun 6 08:35:16 2013 +0000

    Send the fetch-work to a different thread.
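The commit re-enables handing the fetch off to a worker pool, keeping the direct call as a fallback for when queueing fails. That queue-or-run-inline pattern can be sketched like this (hypothetical names loosely modelled on `bo->fetch_task`; in the real code, `Pool_Task()` returns non-zero when it cannot queue the task):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical task shape: a function pointer plus its argument. */
typedef void task_func(void *priv);

struct task {
	task_func	*func;
	void		*priv;
};

static int demo_runs;

static void
demo_fetch(void *priv)
{
	(void)priv;
	demo_runs++;	/* record that the fetch work actually ran */
}

/* Stand-in for Pool_Task(): non-zero means the pool could not take
 * the task (e.g. no idle worker thread). */
static int
pool_task(struct task *t)
{
	(void)t;
	return (-1);	/* simulate a saturated pool */
}

/* The pattern from the commit: try to queue the work to another
 * thread; if that fails, run it in the calling thread instead. */
static void
dispatch(struct task *t)
{
	if (pool_task(t))
		t->func(t->priv);
}
```

Either way the work runs exactly once; the fallback only changes which thread executes it.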
diff --git a/bin/varnishd/cache/cache_fetch.c b/bin/varnishd/cache/cache_fetch.c
index 7358c43..9acea11 100644
--- a/bin/varnishd/cache/cache_fetch.c
+++ b/bin/varnishd/cache/cache_fetch.c
@@ -1034,7 +1034,7 @@ VBF_Fetch(struct worker *wrk, struct req *req)
 	bo->fetch_task.priv = &vsh;
 	bo->fetch_task.func = vbf_fetch_thread;
 
-	// if (Pool_Task(wrk->pool, &bo->fetch_task, POOL_QUEUE_FRONT))
+	if (Pool_Task(wrk->pool, &bo->fetch_task, POOL_QUEUE_FRONT))
 		vbf_fetch_thread(wrk, &vsh);
 	while (req != NULL) {
 		printf("XXX\n");

From perbu at varnish-cache.org  Thu Jun  6 08:38:16 2013
From: perbu at varnish-cache.org (Per Buer)
Date: Thu, 06 Jun 2013 10:38:16 +0200
Subject: [master] 42ee660 More polish. Removed a duplicated sentence
Message-ID:

commit 42ee660365d70a862f51ad10cbc58555966c4fc4
Author: Per Buer
Date:   Thu Jun 6 10:30:28 2013 +0200

    More polish. Removed a duplicated sentence

diff --git a/doc/sphinx/users-guide/vary.rst b/doc/sphinx/users-guide/vary.rst
index 0bf5638..9242c22 100644
--- a/doc/sphinx/users-guide/vary.rst
+++ b/doc/sphinx/users-guide/vary.rst
@@ -16,14 +16,11 @@ If two clients say they accept the languages "en-us, en-uk" and "da, de"
 respectively, Varnish will cache and serve two different versions of
 the page.
 
-So, if a client says it accepts the languages "en-us, en-uk", Varnish
-will serve a different version to another client that says it accepts
-the languages "da, de".
-
 Please note that the headers that Vary refer to need to match
 *exactly* for there to be a match. So Varnish will keep two copies of
 a page if one of them was created for "en-us, en-uk" and the other for
-"en-us,en-uk".
+"en-us,en-uk". Just the lack of space will force Varnish to cache
+another version.
 
 To achieve a high hitrate whilst using Vary is there therefor crucial
 to normalize the headers the backends varies on. Remember, just a

From perbu at varnish-cache.org  Thu Jun  6 08:38:16 2013
From: perbu at varnish-cache.org (Per Buer)
Date: Thu, 06 Jun 2013 10:38:16 +0200
Subject: [master] 7cfa03c Accept-accept
Message-ID:

commit 7cfa03ce19dbfc2151c2a31c788ff87dc7c3dc00
Author: Per Buer
Date:   Thu Jun 6 10:32:22 2013 +0200

    Accept-accept

diff --git a/doc/sphinx/users-guide/vary.rst b/doc/sphinx/users-guide/vary.rst
index 9242c22..40b891b 100644
--- a/doc/sphinx/users-guide/vary.rst
+++ b/doc/sphinx/users-guide/vary.rst
@@ -8,9 +8,10 @@ misunderstood HTTP header.*
 
 The Vary header is sent by the web server to indicate what makes a
 HTTP object Vary. This makes a lot of sense with headers like
-Accept-Language. When a server issues a "Vary: Accept-Accept" it tells
-Varnish that its needs to cache a separate version for every different
-Accept-Language that is coming from the clients.
+Accept-Language. When a backend server issues a "Vary:
+Accept-Language" it tells Varnish that its needs to cache a separate
+version for every different Accept-Language that is coming from the
+clients.
 
 If two clients say they accept the languages "en-us, en-uk" and "da, de"
 respectively, Varnish will cache and serve two different versions of

From perbu at varnish-cache.org  Thu Jun  6 08:38:16 2013
From: perbu at varnish-cache.org (Per Buer)
Date: Thu, 06 Jun 2013 10:38:16 +0200
Subject: [master] ef88646 Add intro
Message-ID:

commit ef88646a7fc1973700f09b0f0892053b45a0083e
Author: Per Buer
Date:   Thu Jun 6 10:37:20 2013 +0200

    Add intro

diff --git a/doc/sphinx/users-guide/vary.rst b/doc/sphinx/users-guide/vary.rst
index 40b891b..00c946c 100644
--- a/doc/sphinx/users-guide/vary.rst
+++ b/doc/sphinx/users-guide/vary.rst
@@ -6,16 +6,22 @@ HTTP Vary
 *HTTP Vary is not a trivial concept. It is by far the most
 misunderstood HTTP header.*
 
-The Vary header is sent by the web server to indicate what makes a
-HTTP object Vary. This makes a lot of sense with headers like
-Accept-Language. When a backend server issues a "Vary:
-Accept-Language" it tells Varnish that its needs to cache a separate
-version for every different Accept-Language that is coming from the
-clients.
-
-If two clients say they accept the languages "en-us, en-uk" and "da, de"
-respectively, Varnish will cache and serve two different versions of
-the page.
+A lot of the response headers tell the client something about the HTTP
+object being delivered. Clients can request different variants a an
+HTTP object, based on their preference. Their preferences might cover
+stuff like encoding or language. When a client prefers UK English this
+is indicated through "Accept-Language: en-uk". Caches need to keep
+these different variants apart and this is done through the HTTP
+response header "Vary".
+
+When a backend server issues a "Vary: Accept-Language" it tells
+Varnish that its needs to cache a separate version for every different
+Accept-Language that is coming from the clients.
+
+If two clients say they accept the languages "en-us, en-uk" and "da,
+de" respectively, Varnish will cache and serve two different versions
+of the page if the backend indicated that Varnish needs to vary on the
+Accept-Language header.
 
 Please note that the headers that Vary refer to need to match
 *exactly* for there to be a match. So Varnish will keep two copies of

From perbu at varnish-cache.org  Thu Jun  6 13:12:08 2013
From: perbu at varnish-cache.org (Per Buer)
Date: Thu, 06 Jun 2013 15:12:08 +0200
Subject: [master] 51b1f51 Reference cleanup.
Message-ID:

commit 51b1f512efc2abeb3300e8f0f3713554f127fd56
Author: Per Buer
Date:   Thu Jun 6 12:45:28 2013 +0200

    Reference cleanup.

diff --git a/doc/sphinx/users-guide/cookies.rst b/doc/sphinx/users-guide/cookies.rst
index 1df1851..c57478b 100644
--- a/doc/sphinx/users-guide/cookies.rst
+++ b/doc/sphinx/users-guide/cookies.rst
@@ -90,6 +90,7 @@ Cookies coming from the backend
 
 If your backend server sets a cookie using the Set-Cookie header
 Varnish will not cache the page in the default configuration. A
-hit-for-pass object (see :ref:`users-guide-vcl_fetch_actions`) is created.
+hit-for-pass object (see :ref:`user-guide-vcl_actions`) is created.
 
 So, if the backend server acts silly and sets unwanted cookies just
 unset the Set-Cookie header and all should be fine.
+
diff --git a/doc/sphinx/users-guide/vcl-built-in-subs.rst b/doc/sphinx/users-guide/vcl-built-in-subs.rst
index 99c1641..d1c31e6 100644
--- a/doc/sphinx/users-guide/vcl-built-in-subs.rst
+++ b/doc/sphinx/users-guide/vcl-built-in-subs.rst
@@ -1,5 +1,5 @@
 
-.. vcl-built-in-subs_
+.. _vcl-built-in-subs:
 
 Built in subroutines
 --------------------
diff --git a/doc/sphinx/users-guide/vcl-example-manipulating-responses.rst b/doc/sphinx/users-guide/vcl-example-manipulating-responses.rst
index 7362789..8afdb50 100644
--- a/doc/sphinx/users-guide/vcl-example-manipulating-responses.rst
+++ b/doc/sphinx/users-guide/vcl-example-manipulating-responses.rst
@@ -13,7 +13,7 @@ matches certain criteria::
     }
 }
 
-.. XXX ref hit-for-pass
+
 We also remove any Set-Cookie headers in order to avoid a hit-for-pass
-object to be created.
+object to be created. See :ref:`user-guide-vcl_actions`.
diff --git a/doc/sphinx/users-guide/vcl-syntax.rst b/doc/sphinx/users-guide/vcl-syntax.rst
index 95cb6c6..8277033 100644
--- a/doc/sphinx/users-guide/vcl-syntax.rst
+++ b/doc/sphinx/users-guide/vcl-syntax.rst
@@ -88,4 +88,5 @@ To call a subroutine, use the call keyword followed by the subroutine's name:
 
     call pipe_if_local;
 
-Varnish has quite a few built in subroutines that are called for each transaction as it flows through Varnish. See :ref:`vcl-built-in-subs`.
+Varnish has quite a few built in subroutines that are called for each
+transaction as it flows through Varnish. See :ref:`vcl-built-in-subs`.

From perbu at varnish-cache.org  Thu Jun  6 13:12:08 2013
From: perbu at varnish-cache.org (Per Buer)
Date: Thu, 06 Jun 2013 15:12:08 +0200
Subject: [master] d2e880f How to debug crashing child.
Message-ID:

commit d2e880fff39c7673652ed83043a8e7361a4ebb5a
Author: Per Buer
Date:   Thu Jun 6 12:45:51 2013 +0200

    How to debug crashing child.

diff --git a/doc/sphinx/users-guide/troubleshooting.rst b/doc/sphinx/users-guide/troubleshooting.rst
index 51797d8..1735c2b 100644
--- a/doc/sphinx/users-guide/troubleshooting.rst
+++ b/doc/sphinx/users-guide/troubleshooting.rst
@@ -67,8 +67,20 @@ errors will be logged in syslog. It might look like this::
 
 In this situation the mother process assumes that the cache died and
 killed it off.
 
-XXX: Describe crashing child process and crashing mother process here too.
-XXX: panic.show
+In certain situation the child process might crash itself. This might
+happen because internal integrity checks fail as a result of a bug.
+
+In these situations the child will start back up again right away but
+the cache will be cleared. A panic is logged with the mother
+process. You can inspect the stack trace with the CLI command
+panic.show.
+
+Some of these situations might be caused by bugs, other by
+misconfigations. Often we see varnish running out of session
+workspace, which will result in the child aborting its execution.
+
+In a rare event you might also see a segmentation fault or bus
+error. These are either bugs, kernel- or hardware failures.
 
 Varnish gives me Guru meditation
 --------------------------------

From perbu at varnish-cache.org  Thu Jun  6 13:12:08 2013
From: perbu at varnish-cache.org (Per Buer)
Date: Thu, 06 Jun 2013 15:12:08 +0200
Subject: [master] 80e416a objects in VCL. Correct what was wrong and describe obj.
Message-ID:

commit 80e416aa8e367ed8f3451abe683d73f71adb2652
Author: Per Buer
Date:   Thu Jun 6 12:46:41 2013 +0200

    objects in VCL. Correct what was wrong and describe obj.

diff --git a/doc/sphinx/users-guide/vcl-variables.rst b/doc/sphinx/users-guide/vcl-variables.rst
index 7064bc0..978f5e4 100644
--- a/doc/sphinx/users-guide/vcl-variables.rst
+++ b/doc/sphinx/users-guide/vcl-variables.rst
@@ -2,13 +2,8 @@ Requests, responses and objects
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
-In VCL, there are three important data structures. The request, coming
-from the client, the response coming from the backend server and the
-object, stored in cache.
+In VCL, there several important objects.
 
-In VCL you should know the following structures.
-
-.. XXX: Needs verification
 *req*
   The request object. When Varnish has received the request the req object is
@@ -24,5 +19,7 @@ In VCL you should know the following structures.
   do on the beresp object.
 
 *resp*
-  The cached object. Mostly a read only object that resides in memory.
-  resp.ttl is writable, the rest is read only.
+  The HTTP response right before it is delivered to the client.
+
+*obj*
+  The object as it is stored in cache. Mostly read only.

From perbu at varnish-cache.org  Thu Jun  6 13:12:08 2013
From: perbu at varnish-cache.org (Per Buer)
Date: Thu, 06 Jun 2013 15:12:08 +0200
Subject: [master] 0acece7 Ruben added reference to VMOD directory.
Message-ID:

commit 0acece742dfd2fcd86f1f9f162618e4b2977ebe6
Author: Per Buer
Date:   Thu Jun 6 15:11:56 2013 +0200

    Ruben added reference to VMOD directory.

diff --git a/doc/sphinx/reference/vmod.rst b/doc/sphinx/reference/vmod.rst
index 88ff800..6d5d0ca 100644
--- a/doc/sphinx/reference/vmod.rst
+++ b/doc/sphinx/reference/vmod.rst
@@ -25,9 +25,17 @@ function shown above.
 
 The full contents of the "std" module is documented in vmod_std(7).
 
 This part of the manual is about how you go about writing your own
-VMOD, how the language interface between C and VCC works etc. This
-explanation will use the "std" VMOD as example, having a varnish
-source tree handy may be a good idea.
+VMOD, how the language interface between C and VCC works, where you
+can find contributed VMODs etc. This explanation will use the "std"
+VMOD as example, having a varnish source tree handy may be a good
+idea.
+
+VMOD Directory
+==============
+
+The VMOD directory is an up-to-date compilation of maintained
+extensions written for Varnish Cache:
+
+    https://www.varnish-cache.org/vmods
 
 The vmod.vcc file
 =================

From phk at varnish-cache.org  Tue Jun 11 10:19:52 2013
From: phk at varnish-cache.org (Poul-Henning Kamp)
Date: Tue, 11 Jun 2013 12:19:52 +0200
Subject: [master] bcd514d Fix two bugs in ACL compile code.
Message-ID:

commit bcd514d3ffdf24ed3fd1253679deca62ce2cf1aa
Author: Poul-Henning Kamp
Date:   Tue Jun 11 10:19:09 2013 +0000

    Fix two bugs in ACL compile code.
    
    Fixes #1312
    See Also: CVE-2013-4090

diff --git a/bin/varnishtest/tests/r01312.vtc b/bin/varnishtest/tests/r01312.vtc
new file mode 100644
index 0000000..05003ea
--- /dev/null
+++ b/bin/varnishtest/tests/r01312.vtc
@@ -0,0 +1,28 @@
+varnishtest "acl miscompile"
+
+server s1 {
+	rxreq
+	txresp
+} -start
+
+varnish v1 -vcl+backend {
+	acl foo {
+		"127.0.0.2";
+		"127.0.1"/19;
+	}
+	acl bar {
+		"127.0.1.2";
+		"127.0.1"/19;
+	}
+	sub vcl_deliver {
+		set resp.http.ACLfoo = client.ip ~ foo;
+		set resp.http.ACLbar = client.ip ~ bar;
+	}
+} -start
+
+client c1 {
+	txreq
+	rxresp
+	expect resp.http.aclfoo == true
+	expect resp.http.aclbar == true
+} -run
diff --git a/lib/libvcl/vcc_acl.c b/lib/libvcl/vcc_acl.c
index 9c9e117..eb3bace 100644
--- a/lib/libvcl/vcc_acl.c
+++ b/lib/libvcl/vcc_acl.c
@@ -381,7 +381,7 @@ vcc_acl_emit(const struct vcc *tl, const char *acln, int anon)
 	VTAILQ_FOREACH(ae, &tl->acl, list) {
 
 		/* Find how much common prefix we have */
-		for (l = 0; l <= depth && l * 8 < ae->mask; l++) {
+		for (l = 0; l <= depth && l * 8 < ae->mask - 7; l++) {
 			assert(l >= 0);
 			if (ae->data[l] != at[l])
 				break;
@@ -392,11 +392,11 @@ vcc_acl_emit(const struct vcc *tl, const char *acln, int anon)
 		while (l <= depth) {
 			Fh(tl, 0, "\t%*s}\n", -depth, "");
 			depth--;
-			oc = "else ";
 		}
 
 		m = ae->mask;
 		m -= l * 8;
+		assert(m >= 0);
 
 		/* Do whole byte compares */
 		for (i = l; m >= 8; m -= 8, i++) {

From tfheen at varnish-cache.org  Tue Jun 11 10:42:25 2013
From: tfheen at varnish-cache.org (Tollef Fog Heen)
Date: Tue, 11 Jun 2013 12:42:25 +0200
Subject: [3.0] 987ba4c Fix two bugs in ACL compile code.
Message-ID:

commit 987ba4c49facf31978d0e6e395a33f053536feed
Author: Poul-Henning Kamp
Date:   Tue Jun 11 10:19:09 2013 +0000

    Fix two bugs in ACL compile code.
    
    Fixes #1312
    See Also: CVE-2013-4090

diff --git a/bin/varnishtest/tests/r01312.vtc b/bin/varnishtest/tests/r01312.vtc
new file mode 100644
index 0000000..05003ea
--- /dev/null
+++ b/bin/varnishtest/tests/r01312.vtc
@@ -0,0 +1,28 @@
+varnishtest "acl miscompile"
+
+server s1 {
+	rxreq
+	txresp
+} -start
+
+varnish v1 -vcl+backend {
+	acl foo {
+		"127.0.0.2";
+		"127.0.1"/19;
+	}
+	acl bar {
+		"127.0.1.2";
+		"127.0.1"/19;
+	}
+	sub vcl_deliver {
+		set resp.http.ACLfoo = client.ip ~ foo;
+		set resp.http.ACLbar = client.ip ~ bar;
+	}
+} -start
+
+client c1 {
+	txreq
+	rxresp
+	expect resp.http.aclfoo == true
+	expect resp.http.aclbar == true
+} -run
diff --git a/lib/libvcl/vcc_acl.c b/lib/libvcl/vcc_acl.c
index 3e5ac6c..fa78dab 100644
--- a/lib/libvcl/vcc_acl.c
+++ b/lib/libvcl/vcc_acl.c
@@ -383,7 +383,7 @@ vcc_acl_emit(const struct vcc *tl, const char *acln, int anon)
 	VTAILQ_FOREACH(ae, &tl->acl, list) {
 
 		/* Find how much common prefix we have */
-		for (l = 0; l <= depth && l * 8 < ae->mask; l++) {
+		for (l = 0; l <= depth && l * 8 < ae->mask - 7; l++) {
 			assert(l >= 0);
 			if (ae->data[l] != at[l])
 				break;
@@ -394,11 +394,11 @@ vcc_acl_emit(const struct vcc *tl, const char *acln, int anon)
 		while (l <= depth) {
 			Fh(tl, 0, "\t%*s}\n", -depth, "");
 			depth--;
-			oc = "else ";
 		}
 
 		m = ae->mask;
 		m -= l * 8;
+		assert(m >= 0);
 
 		/* Do whole byte compares */
 		for (i = l; m >= 8; m -= 8, i++) {

From tfheen at varnish-cache.org  Wed Jun 12 08:19:23 2013
From: tfheen at varnish-cache.org (Tollef Fog Heen)
Date: Wed, 12 Jun 2013 10:19:23 +0200
Subject: [3.0] 2e0cd51 Return an error on duplicated Host headers
Message-ID:

commit 2e0cd51d8bd1fab963bc8e57d9011fb3537674f1
Author: Tollef Fog Heen
Date:   Wed Jun 12 10:18:59 2013 +0200

    Return an error on duplicated Host headers

diff --git a/bin/varnishd/cache.h b/bin/varnishd/cache.h
index d7f6ab8..be319df 100644
--- a/bin/varnishd/cache.h
+++ b/bin/varnishd/cache.h
@@ -769,6 +769,7 @@ double http_GetHdrQ(const struct http *hp, const char *hdr, const char *field);
 uint16_t http_GetStatus(const struct http *hp);
 const char *http_GetReq(const struct http *hp);
 int http_HdrIs(const struct http *hp, const char *hdr, const char *val);
+int http_IsHdr(const txt *hh, const char *hdr);
 uint16_t http_DissectRequest(struct sess *sp);
 uint16_t http_DissectResponse(struct worker *w, const struct http_conn *htc,
     struct http *sp);
diff --git a/bin/varnishd/cache_http.c b/bin/varnishd/cache_http.c
index 76b3f86..8753acc 100644
--- a/bin/varnishd/cache_http.c
+++ b/bin/varnishd/cache_http.c
@@ -156,7 +156,7 @@ http_Setup(struct http *hp, struct ws *ws)
 
 /*--------------------------------------------------------------------*/
 
-static int
+int
 http_IsHdr(const txt *hh, const char *hdr)
 {
 	unsigned l;
@@ -638,6 +638,28 @@ http_splitline(struct worker *w, int fd, struct http *hp,
 
 /*--------------------------------------------------------------------*/
 
+static int
+htc_request_check_host_hdr(struct http *hp)
+{
+	int u;
+	int seen_host = 0;
+	for (u = HTTP_HDR_FIRST; u < hp->nhd; u++) {
+		if (hp->hd[u].b == NULL)
+			continue;
+		AN(hp->hd[u].b);
+		AN(hp->hd[u].e);
+		if (http_IsHdr(&hp->hd[u], H_Host)) {
+			if (seen_host) {
+				return (400);
+			}
+			seen_host = 1;
+		}
+	}
+	return (0);
+}
+
+/*--------------------------------------------------------------------*/
+
 static void
 http_ProtoVer(struct http *hp)
 {
@@ -675,6 +697,12 @@ http_DissectRequest(struct sess *sp)
 		return (retval);
 	}
 	http_ProtoVer(hp);
+
+	retval = htc_request_check_host_hdr(hp);
+	if (retval != 0) {
+		WSP(sp, SLT_Error, "Duplicated Host header");
+		return (retval);
+	}
 	return (retval);
 }
diff --git a/bin/varnishtest/tests/b00037.vtc b/bin/varnishtest/tests/b00037.vtc
new file mode 100644
index 0000000..42b23ab
--- /dev/null
+++ b/bin/varnishtest/tests/b00037.vtc
@@ -0,0 +1,19 @@
+varnishtest "Error on multiple Host headers"
+
+server s1 {
+	rxreq
+	txresp
+} -start
+
+varnish v1 -vcl+backend {
+} -start
+
+client c1 {
+	txreq -hdr "Host: foo" -hdr "Host: bar"
+} -run
+
+varnish v1 -expect sess_closed == 1
+varnish v1 -expect client_req == 1
+varnish v1 -expect cache_hit == 0
+varnish v1 -expect cache_hitpass == 0
+varnish v1 -expect cache_miss == 0

From tfheen at varnish-cache.org  Wed Jun 12 10:06:16 2013
From: tfheen at varnish-cache.org (Tollef Fog Heen)
Date: Wed, 12 Jun 2013 12:06:16 +0200
Subject: [master] f3b856b Drop embedded jemalloc
Message-ID:

commit f3b856b8f6644b5e0d5f2ca2af1ea8c6bc9ec365
Author: Tollef Fog Heen
Date:   Wed Jun 12 12:00:34 2013 +0200

    Drop embedded jemalloc
    
    The embedded jemalloc we ship is outdated and jemalloc is now
    generally available in distributions, so stop shipping our own copy.
diff --git a/configure.ac b/configure.ac index f49f3fe..3c9add4 100644 --- a/configure.ac +++ b/configure.ac @@ -230,7 +230,6 @@ fi CFLAGS="${save_CFLAGS}" # Use jemalloc on Linux -JEMALLOC_SUBDIR= JEMALLOC_LDADD= AC_ARG_WITH([jemalloc], [AS_HELP_STRING([--with-jemalloc], @@ -243,13 +242,10 @@ case $target in if test "x$with_jemalloc" != xno; then AC_CHECK_LIB([jemalloc], [malloc_conf], [JEMALLOC_LDADD="-ljemalloc"], - [AC_MSG_NOTICE([No system jemalloc found, using bundled version]) - JEMALLOC_SUBDIR=libjemalloc - JEMALLOC_LDADD='$(top_builddir)/lib/libjemalloc/libjemalloc_mt.la']) + [AC_MSG_WARN([No system jemalloc found, using system malloc])]) fi ;; esac -AC_SUBST(JEMALLOC_SUBDIR) AC_SUBST(JEMALLOC_LDADD) # Userland slab allocator, available only on Solaris @@ -598,7 +594,6 @@ AC_CONFIG_FILES([ lib/libvmod_debug/Makefile lib/libvmod_std/Makefile lib/libvmod_directors/Makefile - lib/libjemalloc/Makefile man/Makefile redhat/Makefile varnishapi.pc diff --git a/lib/Makefile.am b/lib/Makefile.am index 094f42e..77c5ac5 100644 --- a/lib/Makefile.am +++ b/lib/Makefile.am @@ -8,8 +8,7 @@ SUBDIRS = \ libvgz \ libvmod_debug \ libvmod_std \ - libvmod_directors \ - @JEMALLOC_SUBDIR@ + libvmod_directors DIST_SUBDIRS = \ libvarnishcompat \ @@ -19,5 +18,4 @@ DIST_SUBDIRS = \ libvgz \ libvmod_debug \ libvmod_std \ - libvmod_directors \ - libjemalloc + libvmod_directors diff --git a/lib/libjemalloc/Makefile.am b/lib/libjemalloc/Makefile.am deleted file mode 100644 index 63f5c7a..0000000 --- a/lib/libjemalloc/Makefile.am +++ /dev/null @@ -1,18 +0,0 @@ -# See source code comments to avoid memory leaks when enabling MALLOC_MAG. 
-#CPPFLAGS = -DMALLOC_PRODUCTION -DMALLOC_MAG -AM_CPPFLAGS = -DMALLOC_PRODUCTION - -#all: libjemalloc.so.0 libjemalloc_mt.so.0 - -noinst_LTLIBRARIES = libjemalloc_mt.la - -libjemalloc_mt_la_LIBADD = ${PTHREAD_LIBS} -libjemalloc_mt_la_LDFLAGS = -static -libjemalloc_mt_la_CFLAGS = -D__isthreaded=true - -libjemalloc_mt_la_SOURCES = jemalloc_linux.c \ - rb.h - -EXTRA_DIST = malloc.3 \ - malloc.c \ - README diff --git a/lib/libjemalloc/README b/lib/libjemalloc/README deleted file mode 100644 index 5b80997..0000000 --- a/lib/libjemalloc/README +++ /dev/null @@ -1,55 +0,0 @@ -This is a minimal-effort stand-alone jemalloc distribution for Linux. The main -rough spots are: - -* __isthreaded must be hard-coded, since the pthreads library really needs to - be involved in order to toggle it at run time. Therefore, this distribution - builds two separate libraries: - - + libjemalloc_mt.so.0 : Use for multi-threaded applications. - + libjemalloc.so.0 : Use for single-threaded applications. - - Both libraries link against libpthread, though with a bit more code hacking, - this dependency could be removed for the single-threaded version. - -* MALLOC_MAG (thread-specific caching, using magazines) is disabled, because - special effort is required to avoid memory leaks when it is enabled. To make - cleanup automatic, we would need help from the pthreads library. If you - enable MALLOC_MAG, be sure to call _malloc_thread_cleanup() in each thread - just before it exits. - -* The code that determines the number of CPUs is sketchy. The trouble is that - we must avoid any memory allocation during early initialization. - -In order to build: - - make - -This generates two shared libraries, which you can either link against, or -pre-load. 
- -Linking and running, where /path/to is the path to libjemalloc (-lpthread -required even for libjemalloc.so): - - gcc app.o -o app -L/path/to -ljemalloc_mt -lpthread - LD_LIBRARY_PATH=/path/to app - -Pre-loading: - - LD_PRELOAD=/path/to/libjemalloc_mt.so.0 app - -jemalloc has a lot of run-time tuning options. See the man page for details: - - nroff -man malloc.3 | less - -In particular, take a look at the B, F, and N options. If you enable -MALLOC_MAG, look at the G and R options. - -If your application is crashing, or performance seems to be lacking, enable -assertions and statistics gathering by removing MALLOC_PRODUCTION from CPPFLAGS -in the Makefile. In order to print a statistics summary at program exit, run -your application like: - - LD_PRELOAD=/path/to/libjemalloc_mt.so.0 MALLOC_OPTIONS=P app - -Please contact Jason Evans with questions, comments, bug -reports, etc. diff --git a/lib/libjemalloc/jemalloc_linux.c b/lib/libjemalloc/jemalloc_linux.c deleted file mode 100644 index 170777b..0000000 --- a/lib/libjemalloc/jemalloc_linux.c +++ /dev/null @@ -1,5696 +0,0 @@ -/*- - * Copyright (C) 2006-2008 Jason Evans . - * All rights reserved. - * - * Redistribution and use in source and binary forms, with or without - * modification, are permitted provided that the following conditions - * are met: - * 1. Redistributions of source code must retain the above copyright - * notice(s), this list of conditions and the following disclaimer as - * the first lines of this file unmodified other than the possible - * addition of one or more copyright notices. - * 2. Redistributions in binary form must reproduce the above copyright - * notice(s), this list of conditions and the following disclaimer in - * the documentation and/or other materials provided with the - * distribution. 
- * - * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDER(S) ``AS IS'' AND ANY - * EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE - * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR - * PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER(S) BE - * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR - * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF - * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR - * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, - * WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE - * OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, - * EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. - * - ******************************************************************************* - * - * This allocator implementation is designed to provide scalable performance - * for multi-threaded programs on multi-processor systems. The following - * features are included for this purpose: - * - * + Multiple arenas are used if there are multiple CPUs, which reduces lock - * contention and cache sloshing. - * - * + Thread-specific caching is used if there are multiple threads, which - * reduces the amount of locking. - * - * + Cache line sharing between arenas is avoided for internal data - * structures. - * - * + Memory is managed in chunks and runs (chunks can be split into runs), - * rather than as individual pages. This provides a constant-time - * mechanism for associating allocations with particular arenas. - * - * Allocation requests are rounded up to the nearest size class, and no record - * of the original request size is maintained. Allocations are broken into - * categories according to size class. 
Assuming runtime defaults, 4 kB pages - * and a 16 byte quantum on a 32-bit system, the size classes in each category - * are as follows: - * - * |=======================================| - * | Category | Subcategory | Size | - * |=======================================| - * | Small | Tiny | 2 | - * | | | 4 | - * | | | 8 | - * | |------------------+---------| - * | | Quantum-spaced | 16 | - * | | | 32 | - * | | | 48 | - * | | | ... | - * | | | 96 | - * | | | 112 | - * | | | 128 | - * | |------------------+---------| - * | | Cacheline-spaced | 192 | - * | | | 256 | - * | | | 320 | - * | | | 384 | - * | | | 448 | - * | | | 512 | - * | |------------------+---------| - * | | Sub-page | 760 | - * | | | 1024 | - * | | | 1280 | - * | | | ... | - * | | | 3328 | - * | | | 3584 | - * | | | 3840 | - * |=======================================| - * | Large | 4 kB | - * | | 8 kB | - * | | 12 kB | - * | | ... | - * | | 1012 kB | - * | | 1016 kB | - * | | 1020 kB | - * |=======================================| - * | Huge | 1 MB | - * | | 2 MB | - * | | 3 MB | - * | | ... | - * |=======================================| - * - * A different mechanism is used for each category: - * - * Small : Each size class is segregated into its own set of runs. Each run - * maintains a bitmap of which regions are free/allocated. - * - * Large : Each allocation is backed by a dedicated run. Metadata are stored - * in the associated arena chunk header maps. - * - * Huge : Each allocation is backed by a dedicated contiguous set of chunks. - * Metadata are stored in a separate red-black tree. - * - ******************************************************************************* - */ - -/* - * Set to false if single-threaded. Even better, rip out all of the code that - * doesn't get used if __isthreaded is false, so that libpthread isn't - * necessary. - */ -#ifndef __isthreaded -# define __isthreaded true -#endif - -/* - * MALLOC_PRODUCTION disables assertions and statistics gathering. 
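The quantum-spaced small classes described above (16, 32, 48, ...) come from rounding each request up to a multiple of the quantum. A minimal sketch of that rounding, with a hypothetical helper name and the documented 16-byte quantum hard-coded as an assumption:

```c
#include <stddef.h>

/* Assumed: the documented 16-byte quantum (QUANTUM_2POW == 4). */
#define QUANTUM      ((size_t)16)
#define QUANTUM_MASK (QUANTUM - 1)

/* Round a small request up to the nearest quantum multiple, as in the
 * "Quantum-spaced" subcategory of the size-class table above. */
static size_t
quantum_ceiling(size_t size)
{
	return ((size + QUANTUM_MASK) & ~QUANTUM_MASK);
}
```

Because no record of the original request size is kept, this rounded size is what the allocator tracks from then on.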
It also - * defaults the A and J runtime options to off. These settings are appropriate - * for production systems. - */ -/* #define MALLOC_PRODUCTION */ - -#ifndef MALLOC_PRODUCTION - /* - * MALLOC_DEBUG enables assertions and other sanity checks, and disables - * inline functions. - */ -# define MALLOC_DEBUG - - /* MALLOC_STATS enables statistics calculation. */ -# define MALLOC_STATS -#endif - -/* - * MALLOC_TINY enables support for tiny objects, which are smaller than one - * quantum. - */ -#define MALLOC_TINY - -/* - * MALLOC_MAG enables a magazine-based thread-specific caching layer for small - * objects. This makes it possible to allocate/deallocate objects without any - * locking when the cache is in the steady state. - * - * If MALLOC_MAG is enabled, make sure that _malloc_thread_cleanup() is called - * by each thread just before it exits. - */ -/* #define MALLOC_MAG */ - -/* - * MALLOC_BALANCE enables monitoring of arena lock contention and dynamically - * re-balances arena load if exponentially averaged contention exceeds a - * certain threshold. - */ -#define MALLOC_BALANCE - -/* - * MALLOC_DSS enables use of sbrk(2) to allocate chunks from the data storage - * segment (DSS). In an ideal world, this functionality would be completely - * unnecessary, but we are burdened by history and the lack of resource limits - * for anonymous mapped memory. - */ -/* #define MALLOC_DSS */ - -#define _GNU_SOURCE /* For mremap(2). */ -#define issetugid() 0 -#define __DECONST(type, var) ((type)(uintptr_t)(const void *)(var)) - -/* __FBSDID("$FreeBSD: head/lib/libc/stdlib/malloc.c 182225 2008-08-27 02:00:53Z jasone $"); */ - -#include -#include -#include -#include -#include -#include - -#include -#include -#ifndef SIZE_T_MAX -# define SIZE_T_MAX SIZE_MAX -#endif -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include - -#include "rb.h" - -/* - * Prevent gcc from complaining about unused parameters. 
Added directly - * here instead of including ansidecl.h to save a build dependency on - * binutils-dev. - */ -#if __GNUC__ >= 3 -#ifndef ATTRIBUTE_UNUSED -#define ATTRIBUTE_UNUSED __attribute__ ((__unused__)) -#endif /* ATTRIBUTE_UNUSED */ -#else -#define ATTRIBUTE_UNUSED -#endif - -#ifdef MALLOC_DEBUG - /* Disable inlining to make debugging easier. */ -# define inline -#endif - -/* Size of stack-allocated buffer passed to strerror_r(). */ -#define STRERROR_BUF 64 - -/* - * The const_size2bin table is sized according to PAGESIZE_2POW, but for - * correctness reasons, we never assume that - * (pagesize == (1U << * PAGESIZE_2POW)). - * - * Minimum alignment of allocations is 2^QUANTUM_2POW bytes. - */ -#ifdef __i386__ -# define PAGESIZE_2POW 12 -# define QUANTUM_2POW 4 -# define SIZEOF_PTR_2POW 2 -# define CPU_SPINWAIT __asm__ volatile("pause") -#endif -#ifdef __ia64__ -# define PAGESIZE_2POW 12 -# define QUANTUM_2POW 4 -# define SIZEOF_PTR_2POW 3 -#endif -#ifdef __alpha__ -# define PAGESIZE_2POW 13 -# define QUANTUM_2POW 4 -# define SIZEOF_PTR_2POW 3 -# define NO_TLS -#endif -#ifdef __sparc__ -# define PAGESIZE_2POW 13 -# define QUANTUM_2POW 4 -# define SIZEOF_PTR_2POW 3 -# define NO_TLS -#endif -#ifdef __amd64__ -# define PAGESIZE_2POW 12 -# define QUANTUM_2POW 4 -# define SIZEOF_PTR_2POW 3 -# define CPU_SPINWAIT __asm__ volatile("pause") -#endif -#ifdef __arm__ -# define PAGESIZE_2POW 12 -# define QUANTUM_2POW 3 -# define SIZEOF_PTR_2POW 2 -# define NO_TLS -#endif -#ifdef __mips__ -# define PAGESIZE_2POW 12 -# define QUANTUM_2POW 3 -# define SIZEOF_PTR_2POW 2 -# define NO_TLS -#endif -#ifdef __powerpc__ -# define PAGESIZE_2POW 12 -# define QUANTUM_2POW 4 -# define SIZEOF_PTR_2POW 2 -#endif -#ifdef __s390__ -# define PAGESIZE_2POW 12 -# define QUANTUM_2POW 4 -# define SIZEOF_PTR_2POW 2 -#endif -#ifdef __s390x__ -# define PAGESIZE_2POW 12 -# define QUANTUM_2POW 4 -# define SIZEOF_PTR_2POW 3 -#endif -#ifdef __sh__ -# define PAGESIZE_2POW 12 -# define QUANTUM_2POW 3 
-# define SIZEOF_PTR_2POW 2 -# define NO_TLS -#endif - -#define QUANTUM ((size_t)(1U << QUANTUM_2POW)) -#define QUANTUM_MASK (QUANTUM - 1) - -#define SIZEOF_PTR (1U << SIZEOF_PTR_2POW) - -/* sizeof(int) == (1U << SIZEOF_INT_2POW). */ -#ifndef SIZEOF_INT_2POW -# define SIZEOF_INT_2POW 2 -#endif - -/* We can't use TLS in non-PIC programs, since TLS relies on loader magic. */ -#if (!defined(PIC) && !defined(NO_TLS)) -# define NO_TLS -#endif - -#ifdef NO_TLS - /* MALLOC_MAG requires TLS. */ -# ifdef MALLOC_MAG -# undef MALLOC_MAG -# endif - /* MALLOC_BALANCE requires TLS. */ -# ifdef MALLOC_BALANCE -# undef MALLOC_BALANCE -# endif -#endif - -/* - * Size and alignment of memory chunks that are allocated by the OS's virtual - * memory system. - */ -#define CHUNK_2POW_DEFAULT 20 - -/* Maximum number of dirty pages per arena. */ -#define DIRTY_MAX_DEFAULT (1U << 9) - -/* - * Maximum size of L1 cache line. This is used to avoid cache line aliasing. - * In addition, this controls the spacing of cacheline-spaced size classes. - */ -#define CACHELINE_2POW 6 -#define CACHELINE ((size_t)(1U << CACHELINE_2POW)) -#define CACHELINE_MASK (CACHELINE - 1) - -/* - * Subpages are an artificially designated partitioning of pages. Their only - * purpose is to support subpage-spaced size classes. - * - * There must be at least 4 subpages per page, due to the way size classes are - * handled. - */ -#define SUBPAGE_2POW 8 -#define SUBPAGE ((size_t)(1U << SUBPAGE_2POW)) -#define SUBPAGE_MASK (SUBPAGE - 1) - -#ifdef MALLOC_TINY - /* Smallest size class to support. */ -# define TINY_MIN_2POW 1 -#endif - -/* - * Maximum size class that is a multiple of the quantum, but not (necessarily) - * a power of 2. Above this size, allocations are rounded up to the nearest - * power of 2. - */ -#define QSPACE_MAX_2POW_DEFAULT 7 - -/* - * Maximum size class that is a multiple of the cacheline, but not (necessarily) - * a power of 2. Above this size, allocations are rounded up to the nearest - * power of 2. 
- */ -#define CSPACE_MAX_2POW_DEFAULT 9 - -/* - * RUN_MAX_OVRHD indicates maximum desired run header overhead. Runs are sized - * as small as possible such that this setting is still honored, without - * violating other constraints. The goal is to make runs as small as possible - * without exceeding a per run external fragmentation threshold. - * - * We use binary fixed point math for overhead computations, where the binary - * point is implicitly RUN_BFP bits to the left. - * - * Note that it is possible to set RUN_MAX_OVRHD low enough that it cannot be - * honored for some/all object sizes, since there is one bit of header overhead - * per object (plus a constant). This constraint is relaxed (ignored) for runs - * that are so small that the per-region overhead is greater than: - * - * (RUN_MAX_OVRHD / (reg_size << (3+RUN_BFP)) - */ -#define RUN_BFP 12 -/* \/ Implicit binary fixed point. */ -#define RUN_MAX_OVRHD 0x0000003dU -#define RUN_MAX_OVRHD_RELAX 0x00001800U - -/* Put a cap on small object run size. This overrides RUN_MAX_OVRHD. */ -#define RUN_MAX_SMALL (12 * pagesize) - -/* - * Hyper-threaded CPUs may need a special instruction inside spin loops in - * order to yield to another virtual CPU. If no such instruction is defined - * above, make CPU_SPINWAIT a no-op. - */ -#ifndef CPU_SPINWAIT -# define CPU_SPINWAIT -#endif - -/* - * Adaptive spinning must eventually switch to blocking, in order to avoid the - * potential for priority inversion deadlock. Backing off past a certain point - * can actually waste time. - */ -#define SPIN_LIMIT_2POW 11 - -/* - * Conversion from spinning to blocking is expensive; we use (1U << - * BLOCK_COST_2POW) to estimate how many more times costly blocking is than - * worst-case spinning. - */ -#define BLOCK_COST_2POW 4 - -#ifdef MALLOC_MAG - /* - * Default magazine size, in bytes. max_rounds is calculated to make - * optimal use of the space, leaving just enough room for the magazine - * header. 
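The RUN_MAX_OVRHD budget above is a fixed-point ratio with the binary point RUN_BFP bits to the left, so 0x3d over a 12-bit point is roughly 1.5% overhead. A simplified sketch of the comparison (hypothetical helper; not the full run-size calculation in this file):

```c
#include <stdbool.h>
#include <stddef.h>

#define RUN_BFP       12
#define RUN_MAX_OVRHD 0x0000003dU  /* 61/4096, i.e. ~1.49% overhead budget */

/* Return true if a run's header overhead stays within the external
 * fragmentation budget: (header/run_size) <= RUN_MAX_OVRHD in fixed
 * point, with the binary point implicitly RUN_BFP bits to the left. */
static bool
run_overhead_ok(size_t header_size, size_t run_size)
{
	return (((header_size << RUN_BFP) / run_size) <= RUN_MAX_OVRHD);
}
```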
- */ -# define MAG_SIZE_2POW_DEFAULT 9 -#endif - -#ifdef MALLOC_BALANCE - /* - * We use an exponential moving average to track recent lock contention, - * where the size of the history window is N, and alpha=2/(N+1). - * - * Due to integer math rounding, very small values here can cause - * substantial degradation in accuracy, thus making the moving average decay - * faster than it would with precise calculation. - */ -# define BALANCE_ALPHA_INV_2POW 9 - - /* - * Threshold value for the exponential moving contention average at which to - * re-assign a thread. - */ -# define BALANCE_THRESHOLD_DEFAULT (1U << (SPIN_LIMIT_2POW-4)) -#endif - -/******************************************************************************/ - -typedef pthread_mutex_t malloc_mutex_t; -typedef pthread_mutex_t malloc_spinlock_t; - -/* Set to true once the allocator has been initialized. */ -static bool malloc_initialized = false; - -/* Used to avoid initialization races. */ -static malloc_mutex_t init_lock = PTHREAD_ADAPTIVE_MUTEX_INITIALIZER_NP; - -/******************************************************************************/ -/* - * Statistics data structures. - */ - -#ifdef MALLOC_STATS - -typedef struct malloc_bin_stats_s malloc_bin_stats_t; -struct malloc_bin_stats_s { - /* - * Number of allocation requests that corresponded to the size of this - * bin. - */ - uint64_t nrequests; - -#ifdef MALLOC_MAG - /* Number of magazine reloads from this bin. */ - uint64_t nmags; -#endif - - /* Total number of runs created for this bin's size class. */ - uint64_t nruns; - - /* - * Total number of runs reused by extracting them from the runs tree for - * this bin's size class. - */ - uint64_t reruns; - - /* High-water mark for this bin. */ - unsigned long highruns; - - /* Current number of runs in this bin. */ - unsigned long curruns; -}; - -typedef struct arena_stats_s arena_stats_t; -struct arena_stats_s { - /* Number of bytes currently mapped. 
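The contention average above is an exponential moving average with alpha = 2/(N+1), approximated in integer math by a power-of-two divisor. A sketch of one update step (hypothetical helper; the allocator's actual bookkeeping is more involved):

```c
#include <stdint.h>

#define BALANCE_ALPHA_INV_2POW 9  /* alpha ~= 1/512 */

/* One EMA step: avg += alpha * (sample - avg), in integer arithmetic.
 * Truncating division is where the rounding error noted in the comment
 * above creeps in: small deltas decay faster than the exact formula. */
static uint32_t
ema_update(uint32_t avg, uint32_t sample)
{
	int64_t delta = (int64_t)sample - (int64_t)avg;
	return ((uint32_t)((int64_t)avg +
	    delta / (1 << BALANCE_ALPHA_INV_2POW)));
}
```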
*/ - size_t mapped; - - /* - * Total number of purge sweeps, total number of madvise calls made, - * and total pages purged in order to keep dirty unused memory under - * control. - */ - uint64_t npurge; - uint64_t nmadvise; - uint64_t purged; - - /* Per-size-category statistics. */ - size_t allocated_small; - uint64_t nmalloc_small; - uint64_t ndalloc_small; - - size_t allocated_large; - uint64_t nmalloc_large; - uint64_t ndalloc_large; - -#ifdef MALLOC_BALANCE - /* Number of times this arena reassigned a thread due to contention. */ - uint64_t nbalance; -#endif -}; - -typedef struct chunk_stats_s chunk_stats_t; -struct chunk_stats_s { - /* Number of chunks that were allocated. */ - uint64_t nchunks; - - /* High-water mark for number of chunks allocated. */ - unsigned long highchunks; - - /* - * Current number of chunks allocated. This value isn't maintained for - * any other purpose, so keep track of it in order to be able to set - * highchunks. - */ - unsigned long curchunks; -}; - -#endif /* #ifdef MALLOC_STATS */ - -/******************************************************************************/ -/* - * Extent data structures. - */ - -/* Tree of extents. */ -typedef struct extent_node_s extent_node_t; -struct extent_node_s { -#ifdef MALLOC_DSS - /* Linkage for the size/address-ordered tree. */ - rb_node(extent_node_t) link_szad; -#endif - - /* Linkage for the address-ordered tree. */ - rb_node(extent_node_t) link_ad; - - /* Pointer to the extent that this tree node is responsible for. */ - void *addr; - - /* Total region size. */ - size_t size; -}; -typedef rb_tree(extent_node_t) extent_tree_t; - -/******************************************************************************/ -/* - * Arena data structures. - */ - -typedef struct arena_s arena_t; -typedef struct arena_bin_s arena_bin_t; - -/* Each element of the chunk map corresponds to one page within the chunk. 
*/ -typedef struct arena_chunk_map_s arena_chunk_map_t; -struct arena_chunk_map_s { - /* - * Linkage for run trees. There are two disjoint uses: - * - * 1) arena_t's runs_avail tree. - * 2) arena_run_t conceptually uses this linkage for in-use non-full - * runs, rather than directly embedding linkage. - */ - rb_node(arena_chunk_map_t) link; - - /* - * Run address (or size) and various flags are stored together. The bit - * layout looks like (assuming 32-bit system): - * - * ???????? ???????? ????---- ---kdzla - * - * ? : Unallocated: Run address for first/last pages, unset for internal - * pages. - * Small: Run address. - * Large: Run size for first page, unset for trailing pages. - * - : Unused. - * k : key? - * d : dirty? - * z : zeroed? - * l : large? - * a : allocated? - * - * Following are example bit patterns for the three types of runs. - * - * r : run address - * s : run size - * x : don't care - * - : 0 - * [dzla] : bit set - * - * Unallocated: - * ssssssss ssssssss ssss---- -------- - * xxxxxxxx xxxxxxxx xxxx---- ----d--- - * ssssssss ssssssss ssss---- -----z-- - * - * Small: - * rrrrrrrr rrrrrrrr rrrr---- -------a - * rrrrrrrr rrrrrrrr rrrr---- -------a - * rrrrrrrr rrrrrrrr rrrr---- -------a - * - * Large: - * ssssssss ssssssss ssss---- ------la - * -------- -------- -------- ------la - * -------- -------- -------- ------la - */ - size_t bits; -#define CHUNK_MAP_KEY ((size_t)0x10U) -#define CHUNK_MAP_DIRTY ((size_t)0x08U) -#define CHUNK_MAP_ZEROED ((size_t)0x04U) -#define CHUNK_MAP_LARGE ((size_t)0x02U) -#define CHUNK_MAP_ALLOCATED ((size_t)0x01U) -}; -typedef rb_tree(arena_chunk_map_t) arena_avail_tree_t; -typedef rb_tree(arena_chunk_map_t) arena_run_tree_t; - -/* Arena chunk header. */ -typedef struct arena_chunk_s arena_chunk_t; -struct arena_chunk_s { - /* Arena that owns the chunk. */ - arena_t *arena; - - /* Linkage for the arena's chunks_dirty tree. */ - rb_node(arena_chunk_t) link_dirty; - - /* Number of dirty pages. 
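The packed `bits` layout described above keeps the run address/size in the high bits and the flags in the low bits, so both unpack with simple masks. A decoding sketch (flag masks copied from this file; the page mask assumes the default 4 kB pages):

```c
#include <stdbool.h>
#include <stddef.h>

#define CHUNK_MAP_DIRTY     ((size_t)0x08U)
#define CHUNK_MAP_LARGE     ((size_t)0x02U)
#define CHUNK_MAP_ALLOCATED ((size_t)0x01U)

#define PAGESIZE_MASK ((size_t)0xfffU)  /* assumed: 4 kB pages */

/* For a large run's first page, the page-aligned high bits are the run
 * size; the low bits carry the l/a flags from the layout above. */
static size_t map_run_size(size_t bits) { return (bits & ~PAGESIZE_MASK); }
static bool   map_is_large(size_t bits) { return ((bits & CHUNK_MAP_LARGE) != 0); }
static bool   map_is_alloc(size_t bits) { return ((bits & CHUNK_MAP_ALLOCATED) != 0); }
```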
*/ - size_t ndirty; - - /* Map of pages within chunk that keeps track of free/large/small. */ - arena_chunk_map_t map[1]; /* Dynamically sized. */ -}; -typedef rb_tree(arena_chunk_t) arena_chunk_tree_t; - -typedef struct arena_run_s arena_run_t; -struct arena_run_s { -#ifdef MALLOC_DEBUG - uint32_t magic; -# define ARENA_RUN_MAGIC 0x384adf93 -#endif - - /* Bin this run is associated with. */ - arena_bin_t *bin; - - /* Index of first element that might have a free region. */ - unsigned regs_minelm; - - /* Number of free regions in run. */ - unsigned nfree; - - /* Bitmask of in-use regions (0: in use, 1: free). */ - unsigned regs_mask[1]; /* Dynamically sized. */ -}; - -struct arena_bin_s { - /* - * Current run being used to service allocations of this bin's size - * class. - */ - arena_run_t *runcur; - - /* - * Tree of non-full runs. This tree is used when looking for an - * existing run when runcur is no longer usable. We choose the - * non-full run that is lowest in memory; this policy tends to keep - * objects packed well, and it can also help reduce the number of - * almost-empty chunks. - */ - arena_run_tree_t runs; - - /* Size of regions in a run for this bin's size class. */ - size_t reg_size; - - /* Total size of a run for this bin's size class. */ - size_t run_size; - - /* Total number of regions in a run for this bin's size class. */ - uint32_t nregs; - - /* Number of elements in a run's regs_mask for this bin's size class. */ - uint32_t regs_mask_nelms; - - /* Offset of first region in a run for this bin's size class. */ - uint32_t reg0_offset; - -#ifdef MALLOC_STATS - /* Bin statistics. */ - malloc_bin_stats_t stats; -#endif -}; - -struct arena_s { -#ifdef MALLOC_DEBUG - uint32_t magic; -# define ARENA_MAGIC 0x947d3d24 -#endif - - /* All operations on this arena require that lock be locked. */ - pthread_mutex_t lock; - -#ifdef MALLOC_STATS - arena_stats_t stats; -#endif - - /* Tree of dirty-page-containing chunks this arena manages. 
*/ - arena_chunk_tree_t chunks_dirty; - - /* - * In order to avoid rapid chunk allocation/deallocation when an arena - * oscillates right on the cusp of needing a new chunk, cache the most - * recently freed chunk. The spare is left in the arena's chunk trees - * until it is deleted. - * - * There is one spare chunk per arena, rather than one spare total, in - * order to avoid interactions between multiple threads that could make - * a single spare inadequate. - */ - arena_chunk_t *spare; - - /* - * Current count of pages within unused runs that are potentially - * dirty, and for which madvise(... MADV_DONTNEED) has not been called. - * By tracking this, we can institute a limit on how much dirty unused - * memory is mapped for each arena. - */ - size_t ndirty; - - /* - * Size/address-ordered tree of this arena's available runs. This tree - * is used for first-best-fit run allocation. - */ - arena_avail_tree_t runs_avail; - -#ifdef MALLOC_BALANCE - /* - * The arena load balancing machinery needs to keep track of how much - * lock contention there is. This value is exponentially averaged. - */ - uint32_t contention; -#endif - - /* - * bins is used to store rings of free regions of the following sizes, - * assuming a 16-byte quantum, 4kB pagesize, and default MALLOC_OPTIONS. - * - * bins[i] | size | - * --------+------+ - * 0 | 2 | - * 1 | 4 | - * 2 | 8 | - * --------+------+ - * 3 | 16 | - * 4 | 32 | - * 5 | 48 | - * 6 | 64 | - * : : - * : : - * 33 | 496 | - * 34 | 512 | - * --------+------+ - * 35 | 1024 | - * 36 | 2048 | - * --------+------+ - */ - arena_bin_t bins[1]; /* Dynamically sized. */ -}; - -/******************************************************************************/ -/* - * Magazine data structures. - */ - -#ifdef MALLOC_MAG -typedef struct mag_s mag_t; -struct mag_s { - size_t binind; /* Index of associated bin. */ - size_t nrounds; - void *rounds[1]; /* Dynamically sized. 
*/ -}; - -/* - * Magazines are lazily allocated, but once created, they remain until the - * associated mag_rack is destroyed. - */ -typedef struct bin_mags_s bin_mags_t; -struct bin_mags_s { - mag_t *curmag; - mag_t *sparemag; -}; - -typedef struct mag_rack_s mag_rack_t; -struct mag_rack_s { - bin_mags_t bin_mags[1]; /* Dynamically sized. */ -}; -#endif - -/******************************************************************************/ -/* - * Data. - */ - -/* Number of CPUs. */ -static unsigned ncpus; - -/* VM page size. */ -static size_t pagesize; -static size_t pagesize_mask; -static size_t pagesize_2pow; - -/* Various bin-related settings. */ -#ifdef MALLOC_TINY /* Number of (2^n)-spaced tiny bins. */ -# define ntbins ((unsigned)(QUANTUM_2POW - TINY_MIN_2POW)) -#else -# define ntbins 0 -#endif -static unsigned nqbins; /* Number of quantum-spaced bins. */ -static unsigned ncbins; /* Number of cacheline-spaced bins. */ -static unsigned nsbins; /* Number of subpage-spaced bins. */ -static unsigned nbins; -#ifdef MALLOC_TINY -# define tspace_max ((size_t)(QUANTUM >> 1)) -#endif -#define qspace_min QUANTUM -static size_t qspace_max; -static size_t cspace_min; -static size_t cspace_max; -static size_t sspace_min; -static size_t sspace_max; -#define bin_maxclass sspace_max - -static uint8_t const *size2bin; -/* - * const_size2bin is a static constant lookup table that in the common case can - * be used as-is for size2bin. For dynamically linked programs, this avoids - * a page of memory overhead per process. 
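The const_size2bin table relies on doubling macros, each emitting its argument twice as many times as the previous one, so a run of sizes mapping to the same bin costs a single source line. A toy-sized sketch of the same idiom (hypothetical two-bin table, not the real one):

```c
#include <stdint.h>

/* Each S2B_n macro emits its argument n times. */
#define S2B_1(i) i,
#define S2B_2(i) S2B_1(i) S2B_1(i)
#define S2B_4(i) S2B_2(i) S2B_2(i)
#define S2B_8(i) S2B_4(i) S2B_4(i)

/* Toy lookup: sizes 1..8 -> bin 0, 9..16 -> bin 1; index 0 is invalid. */
static const uint8_t toy_size2bin[17] = {
	S2B_1(0xffU)	/* 0 */
	S2B_8(0)	/* 1..8 */
	S2B_8(1)	/* 9..16 */
};
```

A lookup is then just `toy_size2bin[size]`, constant time with no branching.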
- */ -#define S2B_1(i) i, -#define S2B_2(i) S2B_1(i) S2B_1(i) -#define S2B_4(i) S2B_2(i) S2B_2(i) -#define S2B_8(i) S2B_4(i) S2B_4(i) -#define S2B_16(i) S2B_8(i) S2B_8(i) -#define S2B_32(i) S2B_16(i) S2B_16(i) -#define S2B_64(i) S2B_32(i) S2B_32(i) -#define S2B_128(i) S2B_64(i) S2B_64(i) -#define S2B_256(i) S2B_128(i) S2B_128(i) -static const uint8_t const_size2bin[(1U << PAGESIZE_2POW) - 255] = { - S2B_1(0xffU) /* 0 */ -#if (QUANTUM_2POW == 4) -/* 64-bit system ************************/ -# ifdef MALLOC_TINY - S2B_2(0) /* 2 */ - S2B_2(1) /* 4 */ - S2B_4(2) /* 8 */ - S2B_8(3) /* 16 */ -# define S2B_QMIN 3 -# else - S2B_16(0) /* 16 */ -# define S2B_QMIN 0 -# endif - S2B_16(S2B_QMIN + 1) /* 32 */ - S2B_16(S2B_QMIN + 2) /* 48 */ - S2B_16(S2B_QMIN + 3) /* 64 */ - S2B_16(S2B_QMIN + 4) /* 80 */ - S2B_16(S2B_QMIN + 5) /* 96 */ - S2B_16(S2B_QMIN + 6) /* 112 */ - S2B_16(S2B_QMIN + 7) /* 128 */ -# define S2B_CMIN (S2B_QMIN + 8) -#else -/* 32-bit system ************************/ -# ifdef MALLOC_TINY - S2B_2(0) /* 2 */ - S2B_2(1) /* 4 */ - S2B_4(2) /* 8 */ -# define S2B_QMIN 2 -# else - S2B_8(0) /* 8 */ -# define S2B_QMIN 0 -# endif - S2B_8(S2B_QMIN + 1) /* 16 */ - S2B_8(S2B_QMIN + 2) /* 24 */ - S2B_8(S2B_QMIN + 3) /* 32 */ - S2B_8(S2B_QMIN + 4) /* 40 */ - S2B_8(S2B_QMIN + 5) /* 48 */ - S2B_8(S2B_QMIN + 6) /* 56 */ - S2B_8(S2B_QMIN + 7) /* 64 */ - S2B_8(S2B_QMIN + 8) /* 72 */ - S2B_8(S2B_QMIN + 9) /* 80 */ - S2B_8(S2B_QMIN + 10) /* 88 */ - S2B_8(S2B_QMIN + 11) /* 96 */ - S2B_8(S2B_QMIN + 12) /* 104 */ - S2B_8(S2B_QMIN + 13) /* 112 */ - S2B_8(S2B_QMIN + 14) /* 120 */ - S2B_8(S2B_QMIN + 15) /* 128 */ -# define S2B_CMIN (S2B_QMIN + 16) -#endif -/****************************************/ - S2B_64(S2B_CMIN + 0) /* 192 */ - S2B_64(S2B_CMIN + 1) /* 256 */ - S2B_64(S2B_CMIN + 2) /* 320 */ - S2B_64(S2B_CMIN + 3) /* 384 */ - S2B_64(S2B_CMIN + 4) /* 448 */ - S2B_64(S2B_CMIN + 5) /* 512 */ -# define S2B_SMIN (S2B_CMIN + 6) - S2B_256(S2B_SMIN + 0) /* 768 */ - S2B_256(S2B_SMIN + 1) /* 1024 
*/ - S2B_256(S2B_SMIN + 2) /* 1280 */ - S2B_256(S2B_SMIN + 3) /* 1536 */ - S2B_256(S2B_SMIN + 4) /* 1792 */ - S2B_256(S2B_SMIN + 5) /* 2048 */ - S2B_256(S2B_SMIN + 6) /* 2304 */ - S2B_256(S2B_SMIN + 7) /* 2560 */ - S2B_256(S2B_SMIN + 8) /* 2816 */ - S2B_256(S2B_SMIN + 9) /* 3072 */ - S2B_256(S2B_SMIN + 10) /* 3328 */ - S2B_256(S2B_SMIN + 11) /* 3584 */ - S2B_256(S2B_SMIN + 12) /* 3840 */ -#if (PAGESIZE_2POW == 13) - S2B_256(S2B_SMIN + 13) /* 4096 */ - S2B_256(S2B_SMIN + 14) /* 4352 */ - S2B_256(S2B_SMIN + 15) /* 4608 */ - S2B_256(S2B_SMIN + 16) /* 4864 */ - S2B_256(S2B_SMIN + 17) /* 5120 */ - S2B_256(S2B_SMIN + 18) /* 5376 */ - S2B_256(S2B_SMIN + 19) /* 5632 */ - S2B_256(S2B_SMIN + 20) /* 5888 */ - S2B_256(S2B_SMIN + 21) /* 6144 */ - S2B_256(S2B_SMIN + 22) /* 6400 */ - S2B_256(S2B_SMIN + 23) /* 6656 */ - S2B_256(S2B_SMIN + 24) /* 6912 */ - S2B_256(S2B_SMIN + 25) /* 7168 */ - S2B_256(S2B_SMIN + 26) /* 7424 */ - S2B_256(S2B_SMIN + 27) /* 7680 */ - S2B_256(S2B_SMIN + 28) /* 7936 */ -#endif -}; -#undef S2B_1 -#undef S2B_2 -#undef S2B_4 -#undef S2B_8 -#undef S2B_16 -#undef S2B_32 -#undef S2B_64 -#undef S2B_128 -#undef S2B_256 -#undef S2B_QMIN -#undef S2B_CMIN -#undef S2B_SMIN - -#ifdef MALLOC_MAG -static size_t max_rounds; -#endif - -/* Various chunk-related settings. */ -static size_t chunksize; -static size_t chunksize_mask; /* (chunksize - 1). */ -static size_t chunk_npages; -static size_t arena_chunk_header_npages; -static size_t arena_maxclass; /* Max size class for arenas. */ - -/********/ -/* - * Chunks. - */ - -/* Protects chunk-related data structures. */ -static malloc_mutex_t huge_mtx; - -/* Tree of chunks that are stand-alone huge allocations. */ -static extent_tree_t huge; - -#ifdef MALLOC_DSS -/* - * Protects sbrk() calls. This avoids malloc races among threads, though it - * does not protect against races with threads that call sbrk() directly. - */ -static malloc_mutex_t dss_mtx; -/* Base address of the DSS. 
*/ -static void *dss_base; -/* Current end of the DSS, or ((void *)-1) if the DSS is exhausted. */ -static void *dss_prev; -/* Current upper limit on DSS addresses. */ -static void *dss_max; - -/* - * Trees of chunks that were previously allocated (trees differ only in node - * ordering). These are used when allocating chunks, in an attempt to re-use - * address space. Depending on function, different tree orderings are needed, - * which is why there are two trees with the same contents. - */ -static extent_tree_t dss_chunks_szad; -static extent_tree_t dss_chunks_ad; -#endif - -#ifdef MALLOC_STATS -/* Huge allocation statistics. */ -static uint64_t huge_nmalloc; -static uint64_t huge_ndalloc; -static size_t huge_allocated; -#endif - -/****************************/ -/* - * base (internal allocation). - */ - -/* - * Current pages that are being used for internal memory allocations. These - * pages are carved up in cacheline-size quanta, so that there is no chance of - * false cache line sharing. - */ -static void *base_pages; -static void *base_next_addr; -static void *base_past_addr; /* Addr immediately past base_pages. */ -static extent_node_t *base_nodes; -static malloc_mutex_t base_mtx; -#ifdef MALLOC_STATS -static size_t base_mapped; -#endif - -/********/ -/* - * Arenas. - */ - -/* - * Arenas that are used to service external requests. Not all elements of the - * arenas array are necessarily used; arenas are created lazily as needed. - */ -static arena_t **arenas; -static unsigned narenas; -#ifndef NO_TLS -# ifdef MALLOC_BALANCE -static unsigned narenas_2pow; -# else -static unsigned next_arena; -# endif -#endif -static pthread_mutex_t arenas_lock; /* Protects arenas initialization. */ - -#ifndef NO_TLS -/* - * Map of pthread_self() --> arenas[???], used for selecting an arena to use - * for allocations. 
- */ -static __thread arena_t *arenas_map; -#endif - -#ifdef MALLOC_MAG -/* - * Map of thread-specific magazine racks, used for thread-specific object - * caching. - */ -static __thread mag_rack_t *mag_rack; -#endif - -#ifdef MALLOC_STATS -/* Chunk statistics. */ -static chunk_stats_t stats_chunks; -#endif - -/*******************************/ -/* - * Runtime configuration options. - */ -const char *_malloc_options; - -#ifndef MALLOC_PRODUCTION -static bool opt_abort = true; -static bool opt_junk = true; -#else -static bool opt_abort = false; -static bool opt_junk = false; -#endif -#ifdef MALLOC_DSS -static bool opt_dss = true; -static bool opt_mmap = true; -#endif -#ifdef MALLOC_MAG -static bool opt_mag = true; -static size_t opt_mag_size_2pow = MAG_SIZE_2POW_DEFAULT; -#endif -static size_t opt_dirty_max = DIRTY_MAX_DEFAULT; -#ifdef MALLOC_BALANCE -static uint64_t opt_balance_threshold = BALANCE_THRESHOLD_DEFAULT; -#endif -static bool opt_print_stats = false; -static size_t opt_qspace_max_2pow = QSPACE_MAX_2POW_DEFAULT; -static size_t opt_cspace_max_2pow = CSPACE_MAX_2POW_DEFAULT; -static size_t opt_chunk_2pow = CHUNK_2POW_DEFAULT; -static bool opt_utrace = false; -static bool opt_sysv = false; -static bool opt_xmalloc = false; -static bool opt_zero = false; -static int opt_narenas_lshift = 0; - -typedef struct { - void *p; - size_t s; - void *r; -} malloc_utrace_t; - -#ifdef MALLOC_STATS -#define UTRACE(a, b, c) \ - if (opt_utrace) { \ - malloc_utrace_t ut; \ - ut.p = (a); \ - ut.s = (b); \ - ut.r = (c); \ - utrace(&ut, sizeof(ut)); \ - } -#else -#define UTRACE(a, b, c) -#endif - -/******************************************************************************/ -/* - * Begin function prototypes for non-inline static functions. 
- */ - -static bool malloc_mutex_init(malloc_mutex_t *mutex); -static bool malloc_spin_init(pthread_mutex_t *lock); -static void wrtmessage(const char *p1, const char *p2, const char *p3, - const char *p4); -#ifdef MALLOC_STATS -static void malloc_printf(const char *format, ...); -#endif -static char *umax2s(uintmax_t x, char *s); -#ifdef MALLOC_DSS -static bool base_pages_alloc_dss(size_t minsize); -#endif -static bool base_pages_alloc_mmap(size_t minsize); -static bool base_pages_alloc(size_t minsize); -static void *base_alloc(size_t size); -static extent_node_t *base_node_alloc(void); -static void base_node_dealloc(extent_node_t *node); -#ifdef MALLOC_STATS -static void stats_print(arena_t *arena); -#endif -static void *pages_map(void *addr, size_t size); -static void pages_unmap(void *addr, size_t size); -#ifdef MALLOC_DSS -static void *chunk_alloc_dss(size_t size); -static void *chunk_recycle_dss(size_t size, bool zero); -#endif -static void *chunk_alloc_mmap(size_t size); -static void *chunk_alloc(size_t size, bool zero); -#ifdef MALLOC_DSS -static extent_node_t *chunk_dealloc_dss_record(void *chunk, size_t size); -static bool chunk_dealloc_dss(void *chunk, size_t size); -#endif -static void chunk_dealloc_mmap(void *chunk, size_t size); -static void chunk_dealloc(void *chunk, size_t size); -#ifndef NO_TLS -static arena_t *choose_arena_hard(void); -#endif -static void arena_run_split(arena_t *arena, arena_run_t *run, size_t size, - bool large, bool zero); -static arena_chunk_t *arena_chunk_alloc(arena_t *arena); -static void arena_chunk_dealloc(arena_t *arena, arena_chunk_t *chunk); -static arena_run_t *arena_run_alloc(arena_t *arena, size_t size, bool large, - bool zero); -static void arena_purge(arena_t *arena); -static void arena_run_dalloc(arena_t *arena, arena_run_t *run, bool dirty); -static void arena_run_trim_head(arena_t *arena, arena_chunk_t *chunk, - arena_run_t *run, size_t oldsize, size_t newsize); -static void arena_run_trim_tail(arena_t *arena, 
arena_chunk_t *chunk, - arena_run_t *run, size_t oldsize, size_t newsize, bool dirty); -static arena_run_t *arena_bin_nonfull_run_get(arena_t *arena, arena_bin_t *bin); -static void *arena_bin_malloc_hard(arena_t *arena, arena_bin_t *bin); -static size_t arena_bin_run_size_calc(arena_bin_t *bin, size_t min_run_size); -#ifdef MALLOC_BALANCE -static void arena_lock_balance_hard(arena_t *arena); -#endif -#ifdef MALLOC_MAG -static void mag_load(mag_t *mag); -#endif -static void *arena_malloc_large(arena_t *arena, size_t size, bool zero); -static void *arena_palloc(arena_t *arena, size_t alignment, size_t size, - size_t alloc_size); -static size_t arena_salloc(const void *ptr); -#ifdef MALLOC_MAG -static void mag_unload(mag_t *mag); -#endif -static void arena_dalloc_large(arena_t *arena, arena_chunk_t *chunk, - void *ptr); -static void arena_ralloc_large_shrink(arena_t *arena, arena_chunk_t *chunk, - void *ptr, size_t size, size_t oldsize); -static bool arena_ralloc_large_grow(arena_t *arena, arena_chunk_t *chunk, - void *ptr, size_t size, size_t oldsize); -static bool arena_ralloc_large(void *ptr, size_t size, size_t oldsize); -static void *arena_ralloc(void *ptr, size_t size, size_t oldsize); -static bool arena_new(arena_t *arena); -static arena_t *arenas_extend(unsigned ind); -#ifdef MALLOC_MAG -static mag_t *mag_create(arena_t *arena, size_t binind); -static void mag_destroy(mag_t *mag); -static mag_rack_t *mag_rack_create(arena_t *arena); -static void mag_rack_destroy(mag_rack_t *rack); -#endif -static void *huge_malloc(size_t size, bool zero); -static void *huge_palloc(size_t alignment, size_t size); -static void *huge_ralloc(void *ptr, size_t size, size_t oldsize); -static void huge_dalloc(void *ptr); -static void malloc_print_stats(void); -#ifdef MALLOC_DEBUG -static void size2bin_validate(void); -#endif -static bool size2bin_init(void); -static bool size2bin_init_hard(void); -static unsigned malloc_ncpus(void); -static bool malloc_init_hard(void); -void 
_malloc_prefork(void); -void _malloc_postfork(void); - -/* - * End function prototypes. - */ -/******************************************************************************/ - -/* - * Functions missing prototypes which caused -Werror to fail. - * Not sure if it has any side effects. - * */ -size_t malloc_usable_size(const void *ptr); -void _malloc_thread_cleanup(void); - - -static void -wrtmessage(const char *p1, const char *p2, const char *p3, const char *p4) -{ - - write(STDERR_FILENO, p1, strlen(p1)); - write(STDERR_FILENO, p2, strlen(p2)); - write(STDERR_FILENO, p3, strlen(p3)); - write(STDERR_FILENO, p4, strlen(p4)); -} - -#define _malloc_message malloc_message -void (*_malloc_message)(const char *p1, const char *p2, const char *p3, - const char *p4) = wrtmessage; - -/* - * We don't want to depend on vsnprintf() for production builds, since that can - * cause unnecessary bloat for static binaries. umax2s() provides minimal - * integer printing functionality, so that malloc_printf() use can be limited to - * MALLOC_STATS code. - */ -#define UMAX2S_BUFSIZE 21 -static char * -umax2s(uintmax_t x, char *s) -{ - unsigned i; - - i = UMAX2S_BUFSIZE - 1; - s[i] = '\0'; - do { - i--; - s[i] = "0123456789"[x % 10]; - x /= 10; - } while (x > 0); - - return (&s[i]); -} - -/* - * Define a custom assert() in order to reduce the chances of deadlock during - * assertion failure. 
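The `umax2s()` helper above exists so that `malloc_printf()` (and thus `vsnprintf()`) can be confined to `MALLOC_STATS` builds: it formats an integer into a caller-supplied buffer with no allocation at all. A standalone sketch of the same digit-peeling technique, renamed `umax2s_demo` to mark it as an illustration rather than the original symbol:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define UMAX2S_BUFSIZE 21	/* 20 digits for 2^64-1, plus the NUL */

/*
 * Fill the buffer from the end so the digits come out in the right
 * order without a reversal pass; the returned pointer aims into s.
 */
static char *
umax2s_demo(uintmax_t x, char *s)
{
	unsigned i = UMAX2S_BUFSIZE - 1;

	s[i] = '\0';
	do {
		i--;
		s[i] = "0123456789"[x % 10];
		x /= 10;
	} while (x > 0);

	return (&s[i]);
}
```

Because the caller owns the buffer, the routine is safe to use from inside the allocator itself, which is exactly why the real code prefers it over `vsnprintf()` in assertion messages.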
- */ -#ifdef MALLOC_DEBUG -# define assert(e) do { \ - if (!(e)) { \ - char line_buf[UMAX2S_BUFSIZE]; \ - _malloc_message(__FILE__, ":", umax2s(__LINE__, \ - line_buf), ": Failed assertion: "); \ - _malloc_message("\"", #e, "\"\n", ""); \ - abort(); \ - } \ -} while (0) -#else -#define assert(e) -#endif - -#ifdef MALLOC_STATS -static int -utrace(const void *addr, size_t len) -{ - malloc_utrace_t *ut = (malloc_utrace_t *)addr; - - assert(len == sizeof(malloc_utrace_t)); - - if (ut->p == NULL && ut->s == 0 && ut->r == NULL) - malloc_printf("%d x USER malloc_init()\n", getpid()); - else if (ut->p == NULL && ut->r != NULL) { - malloc_printf("%d x USER %p = malloc(%zu)\n", getpid(), ut->r, - ut->s); - } else if (ut->p != NULL && ut->r != NULL) { - malloc_printf("%d x USER %p = realloc(%p, %zu)\n", getpid(), - ut->r, ut->p, ut->s); - } else - malloc_printf("%d x USER free(%p)\n", getpid(), ut->p); - - return (0); -} -#endif - -static inline const char * -_getprogname(void) -{ - - return (""); -} - -#ifdef MALLOC_STATS -/* - * Print to stderr in such a way as to (hopefully) avoid memory allocation. - */ -static void -malloc_printf(const char *format, ...) -{ - char buf[4096]; - va_list ap; - - va_start(ap, format); - vsnprintf(buf, sizeof(buf), format, ap); - va_end(ap); - _malloc_message(buf, "", "", ""); -} -#endif - -/******************************************************************************/ -/* - * Begin mutex. 
- */ - -static bool -malloc_mutex_init(malloc_mutex_t *mutex) -{ - pthread_mutexattr_t attr; - - if (pthread_mutexattr_init(&attr) != 0) - return (true); - pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ADAPTIVE_NP); - if (pthread_mutex_init(mutex, &attr) != 0) { - pthread_mutexattr_destroy(&attr); - return (true); - } - pthread_mutexattr_destroy(&attr); - - return (false); -} - -static inline void -malloc_mutex_lock(malloc_mutex_t *mutex) -{ - - if (__isthreaded) - pthread_mutex_lock(mutex); -} - -static inline void -malloc_mutex_unlock(malloc_mutex_t *mutex) -{ - - if (__isthreaded) - pthread_mutex_unlock(mutex); -} - -/* - * End mutex. - */ -/******************************************************************************/ -/* - * Begin spin lock. Spin locks here are actually adaptive mutexes that block - * after a period of spinning, because unbounded spinning would allow for - * priority inversion. - */ - -static bool -malloc_spin_init(pthread_mutex_t *lock) -{ - - if (pthread_mutex_init(lock, NULL) != 0) - return (true); - - return (false); -} - -static inline unsigned -malloc_spin_lock(pthread_mutex_t *lock) -{ - unsigned ret = 0; - - if (__isthreaded) { - if (pthread_mutex_trylock(lock) != 0) { - unsigned i; - volatile unsigned j; - - /* Exponentially back off. */ - for (i = 1; i <= SPIN_LIMIT_2POW; i++) { - for (j = 0; j < (1U << i); j++) { - ret++; - CPU_SPINWAIT; - } - - if (pthread_mutex_trylock(lock) == 0) - return (ret); - } - - /* - * Spinning failed. Block until the lock becomes - * available, in order to avoid indefinite priority - * inversion. - */ - pthread_mutex_lock(lock); - assert((ret << BLOCK_COST_2POW) != 0); - return (ret << BLOCK_COST_2POW); - } - } - - return (ret); -} - -static inline void -malloc_spin_unlock(pthread_mutex_t *lock) -{ - - if (__isthreaded) - pthread_mutex_unlock(lock); -} - -/* - * End spin lock. - */ -/******************************************************************************/ -/* - * Begin Utility functions/macros. 
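`malloc_spin_lock()` above spins in exponentially growing bursts (`1<<1`, `1<<2`, … up to `1<<SPIN_LIMIT_2POW` iterations of `CPU_SPINWAIT`) before falling back to a blocking `pthread_mutex_lock()`. The total busy-wait budget is therefore a geometric series. A sketch of just that accounting, with the limit taken as a parameter since `SPIN_LIMIT_2POW` is defined elsewhere in this file:

```c
#include <assert.h>

/*
 * Count the busy-wait iterations performed if every trylock attempt
 * fails: the sum of 2^i for i = 1..limit_2pow, i.e. 2^(limit_2pow+1) - 2.
 */
static unsigned
backoff_total_spins(unsigned limit_2pow)
{
	unsigned ret = 0, i, j;

	for (i = 1; i <= limit_2pow; i++) {
		for (j = 0; j < (1U << i); j++)
			ret++;	/* one CPU_SPINWAIT per iteration */
	}
	return (ret);
}
```

In the real code, once the budget is exhausted the thread blocks in the kernel and the spin count is scaled up by `BLOCK_COST_2POW`, so callers measuring contention see blocking as far more expensive than spinning.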
- */ - -/* Return the chunk address for allocation address a. */ -#define CHUNK_ADDR2BASE(a) \ - ((void *)((uintptr_t)(a) & ~chunksize_mask)) - -/* Return the chunk offset of address a. */ -#define CHUNK_ADDR2OFFSET(a) \ - ((size_t)((uintptr_t)(a) & chunksize_mask)) - -/* Return the smallest chunk multiple that is >= s. */ -#define CHUNK_CEILING(s) \ - (((s) + chunksize_mask) & ~chunksize_mask) - -/* Return the smallest quantum multiple that is >= a. */ -#define QUANTUM_CEILING(a) \ - (((a) + QUANTUM_MASK) & ~QUANTUM_MASK) - -/* Return the smallest cacheline multiple that is >= s. */ -#define CACHELINE_CEILING(s) \ - (((s) + CACHELINE_MASK) & ~CACHELINE_MASK) - -/* Return the smallest subpage multiple that is >= s. */ -#define SUBPAGE_CEILING(s) \ - (((s) + SUBPAGE_MASK) & ~SUBPAGE_MASK) - -/* Return the smallest pagesize multiple that is >= s. */ -#define PAGE_CEILING(s) \ - (((s) + pagesize_mask) & ~pagesize_mask) - -#ifdef MALLOC_TINY -/* Compute the smallest power of 2 that is >= x. */ -static inline size_t -pow2_ceil(size_t x) -{ - - x--; - x |= x >> 1; - x |= x >> 2; - x |= x >> 4; - x |= x >> 8; - x |= x >> 16; -#if (SIZEOF_PTR == 8) - x |= x >> 32; -#endif - x++; - return (x); -} -#endif - -#ifdef MALLOC_BALANCE -/* - * Use a simple linear congruential pseudo-random number generator: - * - * prn(y) = (a*x + c) % m - * - * where the following constants ensure maximal period: - * - * a == Odd number (relatively prime to 2^n), and (a-1) is a multiple of 4. - * c == Odd number (relatively prime to 2^n). - * m == 2^32 - * - * See Knuth's TAOCP 3rd Ed., Vol. 2, pg. 17 for details on these constraints. - * - * This choice of m has the disadvantage that the quality of the bits is - * proportional to bit position. For example. the lowest bit has a cycle of 2, - * the next has a cycle of 4, etc. For this reason, we prefer to use the upper - * bits. 
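`pow2_ceil()` and the `*_CEILING` macros above rest on two classic tricks: bit smearing to round up to a power of two, and mask arithmetic to round up to a power-of-two multiple. A standalone sketch, with the 64-bit guard expressed via `SIZE_MAX` rather than the file's `SIZEOF_PTR`:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Smallest power of two >= x, for x >= 1: smear the highest set bit
 * of (x - 1) all the way right, then add one.
 */
static size_t
pow2_ceil_demo(size_t x)
{
	x--;
	x |= x >> 1;
	x |= x >> 2;
	x |= x >> 4;
	x |= x >> 8;
	x |= x >> 16;
#if SIZE_MAX > 0xffffffffU
	x |= x >> 32;
#endif
	x++;
	return (x);
}

/* Smallest multiple of the power-of-two `align` that is >= s,
 * the pattern behind CHUNK_CEILING, PAGE_CEILING, etc. */
static size_t
ceiling_demo(size_t s, size_t align)
{
	size_t mask = align - 1;

	return ((s + mask) & ~mask);
}
```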
- */ -# define PRN_DEFINE(suffix, var, a, c) \ -static inline void \ -sprn_##suffix(uint32_t seed) \ -{ \ - var = seed; \ -} \ - \ -static inline uint32_t \ -prn_##suffix(uint32_t lg_range) \ -{ \ - uint32_t ret, x; \ - \ - assert(lg_range > 0); \ - assert(lg_range <= 32); \ - \ - x = (var * (a)) + (c); \ - var = x; \ - ret = x >> (32 - lg_range); \ - \ - return (ret); \ -} -# define SPRN(suffix, seed) sprn_##suffix(seed) -# define PRN(suffix, lg_range) prn_##suffix(lg_range) -#endif - -#ifdef MALLOC_BALANCE -/* Define the PRNG used for arena assignment. */ -static __thread uint32_t balance_x; -PRN_DEFINE(balance, balance_x, 1297, 1301) -#endif - -/******************************************************************************/ - -#ifdef MALLOC_DSS -static bool -base_pages_alloc_dss(size_t minsize) -{ - - /* - * Do special DSS allocation here, since base allocations don't need to - * be chunk-aligned. - */ - malloc_mutex_lock(&dss_mtx); - if (dss_prev != (void *)-1) { - intptr_t incr; - size_t csize = CHUNK_CEILING(minsize); - - do { - /* Get the current end of the DSS. */ - dss_max = sbrk(0); - - /* - * Calculate how much padding is necessary to - * chunk-align the end of the DSS. Don't worry about - * dss_max not being chunk-aligned though. - */ - incr = (intptr_t)chunksize - - (intptr_t)CHUNK_ADDR2OFFSET(dss_max); - assert(incr >= 0); - if ((size_t)incr < minsize) - incr += csize; - - dss_prev = sbrk(incr); - if (dss_prev == dss_max) { - /* Success. 
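The `PRN_DEFINE` macro expands to a per-thread linear congruential generator, and the arena-balancing instance uses a = 1297, c = 1301 as shown below it. Expanded by hand (without the `__thread` qualifier, for a single-threaded sketch):

```c
#include <assert.h>
#include <stdint.h>

static uint32_t balance_demo_x;	/* __thread in the real code */

static void
sprn_demo(uint32_t seed)
{
	balance_demo_x = seed;
}

/*
 * One LCG step mod 2^32. Return the top lg_range bits: as the
 * comment above notes, the low bits of this generator have short
 * cycles, so the upper bits are the usable ones.
 */
static uint32_t
prn_demo(uint32_t lg_range)
{
	uint32_t x;

	x = (balance_demo_x * 1297U) + 1301U;
	balance_demo_x = x;
	return (x >> (32 - lg_range));
}
```

Seeding from `pthread_self()`, as `choose_arena_hard()` does, gives each thread an independent stream without any shared PRNG state to contend on.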
*/ - dss_max = (void *)((intptr_t)dss_prev + incr); - base_pages = dss_prev; - base_next_addr = base_pages; - base_past_addr = dss_max; -#ifdef MALLOC_STATS - base_mapped += incr; -#endif - malloc_mutex_unlock(&dss_mtx); - return (false); - } - } while (dss_prev != (void *)-1); - } - malloc_mutex_unlock(&dss_mtx); - - return (true); -} -#endif - -static bool -base_pages_alloc_mmap(size_t minsize) -{ - size_t csize; - - assert(minsize != 0); - csize = PAGE_CEILING(minsize); - base_pages = pages_map(NULL, csize); - if (base_pages == NULL) - return (true); - base_next_addr = base_pages; - base_past_addr = (void *)((uintptr_t)base_pages + csize); -#ifdef MALLOC_STATS - base_mapped += csize; -#endif - - return (false); -} - -static bool -base_pages_alloc(size_t minsize) -{ - -#ifdef MALLOC_DSS - if (opt_dss) { - if (base_pages_alloc_dss(minsize) == false) - return (false); - } - - if (opt_mmap && minsize != 0) -#endif - { - if (base_pages_alloc_mmap(minsize) == false) - return (false); - } - - return (true); -} - -static void * -base_alloc(size_t size) -{ - void *ret; - size_t csize; - - /* Round size up to nearest multiple of the cacheline size. */ - csize = CACHELINE_CEILING(size); - - malloc_mutex_lock(&base_mtx); - /* Make sure there's enough space for the allocation. */ - if ((uintptr_t)base_next_addr + csize > (uintptr_t)base_past_addr) { - if (base_pages_alloc(csize)) { - malloc_mutex_unlock(&base_mtx); - return (NULL); - } - } - /* Allocate. 
*/ - ret = base_next_addr; - base_next_addr = (void *)((uintptr_t)base_next_addr + csize); - malloc_mutex_unlock(&base_mtx); - - return (ret); -} - -static extent_node_t * -base_node_alloc(void) -{ - extent_node_t *ret; - - malloc_mutex_lock(&base_mtx); - if (base_nodes != NULL) { - ret = base_nodes; - base_nodes = *(extent_node_t **)ret; - malloc_mutex_unlock(&base_mtx); - } else { - malloc_mutex_unlock(&base_mtx); - ret = (extent_node_t *)base_alloc(sizeof(extent_node_t)); - } - - return (ret); -} - -static void -base_node_dealloc(extent_node_t *node) -{ - - malloc_mutex_lock(&base_mtx); - *(extent_node_t **)node = base_nodes; - base_nodes = node; - malloc_mutex_unlock(&base_mtx); -} - -/******************************************************************************/ - -#ifdef MALLOC_STATS -static void -stats_print(arena_t *arena) -{ - unsigned i, gap_start; - - malloc_printf("dirty: %zu page%s dirty, %llu sweep%s," - " %llu madvise%s, %llu page%s purged\n", - arena->ndirty, arena->ndirty == 1 ? "" : "s", - arena->stats.npurge, arena->stats.npurge == 1 ? "" : "s", - arena->stats.nmadvise, arena->stats.nmadvise == 1 ? "" : "s", - arena->stats.purged, arena->stats.purged == 1 ? 
"" : "s"); - - malloc_printf(" allocated nmalloc ndalloc\n"); - malloc_printf("small: %12zu %12llu %12llu\n", - arena->stats.allocated_small, arena->stats.nmalloc_small, - arena->stats.ndalloc_small); - malloc_printf("large: %12zu %12llu %12llu\n", - arena->stats.allocated_large, arena->stats.nmalloc_large, - arena->stats.ndalloc_large); - malloc_printf("total: %12zu %12llu %12llu\n", - arena->stats.allocated_small + arena->stats.allocated_large, - arena->stats.nmalloc_small + arena->stats.nmalloc_large, - arena->stats.ndalloc_small + arena->stats.ndalloc_large); - malloc_printf("mapped: %12zu\n", arena->stats.mapped); - -#ifdef MALLOC_MAG - if (__isthreaded && opt_mag) { - malloc_printf("bins: bin size regs pgs mags " - "newruns reruns maxruns curruns\n"); - } else { -#endif - malloc_printf("bins: bin size regs pgs requests " - "newruns reruns maxruns curruns\n"); -#ifdef MALLOC_MAG - } -#endif - for (i = 0, gap_start = UINT_MAX; i < nbins; i++) { - if (arena->bins[i].stats.nruns == 0) { - if (gap_start == UINT_MAX) - gap_start = i; - } else { - if (gap_start != UINT_MAX) { - if (i > gap_start + 1) { - /* Gap of more than one size class. */ - malloc_printf("[%u..%u]\n", - gap_start, i - 1); - } else { - /* Gap of one size class. */ - malloc_printf("[%u]\n", gap_start); - } - gap_start = UINT_MAX; - } - malloc_printf( - "%13u %1s %4u %4u %3u %9llu %9llu" - " %9llu %7lu %7lu\n", - i, - i < ntbins ? "T" : i < ntbins + nqbins ? "Q" : - i < ntbins + nqbins + ncbins ? "C" : "S", - arena->bins[i].reg_size, - arena->bins[i].nregs, - arena->bins[i].run_size >> pagesize_2pow, -#ifdef MALLOC_MAG - (__isthreaded && opt_mag) ? - arena->bins[i].stats.nmags : -#endif - arena->bins[i].stats.nrequests, - arena->bins[i].stats.nruns, - arena->bins[i].stats.reruns, - arena->bins[i].stats.highruns, - arena->bins[i].stats.curruns); - } - } - if (gap_start != UINT_MAX) { - if (i > gap_start + 1) { - /* Gap of more than one size class. 
*/ - malloc_printf("[%u..%u]\n", gap_start, i - 1); - } else { - /* Gap of one size class. */ - malloc_printf("[%u]\n", gap_start); - } - } -} -#endif - -/* - * End Utility functions/macros. - */ -/******************************************************************************/ -/* - * Begin extent tree code. - */ - -#ifdef MALLOC_DSS -static inline int -extent_szad_comp(extent_node_t *a, extent_node_t *b) -{ - int ret; - size_t a_size = a->size; - size_t b_size = b->size; - - ret = (a_size > b_size) - (a_size < b_size); - if (ret == 0) { - uintptr_t a_addr = (uintptr_t)a->addr; - uintptr_t b_addr = (uintptr_t)b->addr; - - ret = (a_addr > b_addr) - (a_addr < b_addr); - } - - return (ret); -} - -/* Wrap red-black tree macros in functions. */ -rb_wrap(static ATTRIBUTE_UNUSED, extent_tree_szad_, extent_tree_t, extent_node_t, - link_szad, extent_szad_comp) -#endif - -static inline int -extent_ad_comp(extent_node_t *a, extent_node_t *b) -{ - uintptr_t a_addr = (uintptr_t)a->addr; - uintptr_t b_addr = (uintptr_t)b->addr; - - return ((a_addr > b_addr) - (a_addr < b_addr)); -} - -/* Wrap red-black tree macros in functions. */ -rb_wrap(static ATTRIBUTE_UNUSED, extent_tree_ad_, extent_tree_t, extent_node_t, link_ad, - extent_ad_comp) - -/* - * End extent tree code. - */ -/******************************************************************************/ -/* - * Begin chunk management functions. - */ - -static void * -pages_map(void *addr, size_t size) -{ - void *ret; - - /* - * We don't use MAP_FIXED here, because it can cause the *replacement* - * of existing mappings, and we only want to create new mappings. - */ - ret = mmap(addr, size, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANON, - -1, 0); - assert(ret != NULL); - - if (ret == MAP_FAILED) - ret = NULL; - else if (addr != NULL && ret != addr) { - /* - * We succeeded in mapping memory, but not in the right place. 
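The comparators feeding the red-black trees above (`extent_szad_comp`, `extent_ad_comp`, and the arena variants below) all use the same branch-free idiom: `(a > b) - (a < b)` yields -1, 0, or 1 without the overflow risk of a naive subtraction. Isolated:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Three-way compare without branches or overflow: each relational
 * operator evaluates to 0 or 1, so their difference is -1, 0, or 1.
 */
static int
cmp3_demo(uintptr_t a, uintptr_t b)
{
	return ((a > b) - (a < b));
}
```

A subtraction-based comparator (`return a - b;`) would wrap for operands more than half the type's range apart; this form is safe for any pair of values, which matters when the keys are arbitrary addresses.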
- */ - if (munmap(ret, size) == -1) { - char buf[STRERROR_BUF]; - - strerror_r(errno, buf, sizeof(buf)); - _malloc_message(_getprogname(), - ": (malloc) Error in munmap(): ", buf, "\n"); - if (opt_abort) - abort(); - } - ret = NULL; - } - - assert(ret == NULL || (addr == NULL && ret != addr) - || (addr != NULL && ret == addr)); - return (ret); -} - -static void -pages_unmap(void *addr, size_t size) -{ - - if (munmap(addr, size) == -1) { - char buf[STRERROR_BUF]; - - strerror_r(errno, buf, sizeof(buf)); - _malloc_message(_getprogname(), - ": (malloc) Error in munmap(): ", buf, "\n"); - if (opt_abort) - abort(); - } -} - -#ifdef MALLOC_DSS -static void * -chunk_alloc_dss(size_t size) -{ - - /* - * sbrk() uses a signed increment argument, so take care not to - * interpret a huge allocation request as a negative increment. - */ - if ((intptr_t)size < 0) - return (NULL); - - malloc_mutex_lock(&dss_mtx); - if (dss_prev != (void *)-1) { - intptr_t incr; - - /* - * The loop is necessary to recover from races with other - * threads that are using the DSS for something other than - * malloc. - */ - do { - void *ret; - - /* Get the current end of the DSS. */ - dss_max = sbrk(0); - - /* - * Calculate how much padding is necessary to - * chunk-align the end of the DSS. - */ - incr = (intptr_t)size - - (intptr_t)CHUNK_ADDR2OFFSET(dss_max); - if (incr == (intptr_t)size) - ret = dss_max; - else { - ret = (void *)((intptr_t)dss_max + incr); - incr += size; - } - - dss_prev = sbrk(incr); - if (dss_prev == dss_max) { - /* Success. 
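`chunk_alloc_dss()` above computes the `sbrk()` increment so the returned chunk lands on a chunk boundary: pad past the current break's misalignment, then extend by the request. The arithmetic can be checked in isolation; this sketch assumes a 1 MiB chunk size and treats addresses as plain integers (no actual `sbrk()` call), mirroring the increment logic of the code above:

```c
#include <assert.h>
#include <stdint.h>

#define DSS_DEMO_CHUNK	((intptr_t)1 << 20)	/* assumed chunksize */

/*
 * Given the current break `max` and a chunk-multiple request `size`,
 * return the increment that would be passed to sbrk(), storing the
 * chunk-aligned address the caller would hand back in *ret.
 */
static intptr_t
dss_incr_demo(intptr_t max, intptr_t size, intptr_t *ret)
{
	intptr_t incr = size - (max & (DSS_DEMO_CHUNK - 1));

	if (incr == size)	/* break already chunk-aligned */
		*ret = max;
	else {
		*ret = max + incr;	/* aligned start past the padding */
		incr += size;		/* padding plus the request itself */
	}
	return (incr);
}
```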
*/ - dss_max = (void *)((intptr_t)dss_prev + incr); - malloc_mutex_unlock(&dss_mtx); - return (ret); - } - } while (dss_prev != (void *)-1); - } - malloc_mutex_unlock(&dss_mtx); - - return (NULL); -} - -static void * -chunk_recycle_dss(size_t size, bool zero) -{ - extent_node_t *node, key; - - key.addr = NULL; - key.size = size; - malloc_mutex_lock(&dss_mtx); - node = extent_tree_szad_nsearch(&dss_chunks_szad, &key); - if (node != NULL) { - void *ret = node->addr; - - /* Remove node from the tree. */ - extent_tree_szad_remove(&dss_chunks_szad, node); - if (node->size == size) { - extent_tree_ad_remove(&dss_chunks_ad, node); - base_node_dealloc(node); - } else { - /* - * Insert the remainder of node's address range as a - * smaller chunk. Its position within dss_chunks_ad - * does not change. - */ - assert(node->size > size); - node->addr = (void *)((uintptr_t)node->addr + size); - node->size -= size; - extent_tree_szad_insert(&dss_chunks_szad, node); - } - malloc_mutex_unlock(&dss_mtx); - - if (zero) - memset(ret, 0, size); - return (ret); - } - malloc_mutex_unlock(&dss_mtx); - - return (NULL); -} -#endif - -static void * -chunk_alloc_mmap(size_t size) -{ - void *ret; - size_t offset; - - /* - * Ideally, there would be a way to specify alignment to mmap() (like - * NetBSD has), but in the absence of such a feature, we have to work - * hard to efficiently create aligned mappings. The reliable, but - * expensive method is to create a mapping that is over-sized, then - * trim the excess. However, that always results in at least one call - * to pages_unmap(). - * - * A more optimistic approach is to try mapping precisely the right - * amount, then try to append another mapping if alignment is off. In - * practice, this works out well as long as the application is not - * interleaving mappings via direct mmap() calls. 
If we do run into a - * situation where there is an interleaved mapping and we are unable to - * extend an unaligned mapping, our best option is to momentarily - * revert to the reliable-but-expensive method. This will tend to - * leave a gap in the memory map that is too small to cause later - * problems for the optimistic method. - */ - - ret = pages_map(NULL, size); - if (ret == NULL) - return (NULL); - - offset = CHUNK_ADDR2OFFSET(ret); - if (offset != 0) { - /* Try to extend chunk boundary. */ - if (pages_map((void *)((uintptr_t)ret + size), - chunksize - offset) == NULL) { - /* - * Extension failed. Clean up, then revert to the - * reliable-but-expensive method. - */ - pages_unmap(ret, size); - - /* Beware size_t wrap-around. */ - if (size + chunksize <= size) - return NULL; - - ret = pages_map(NULL, size + chunksize); - if (ret == NULL) - return (NULL); - - /* Clean up unneeded leading/trailing space. */ - offset = CHUNK_ADDR2OFFSET(ret); - if (offset != 0) { - /* Leading space. */ - pages_unmap(ret, chunksize - offset); - - ret = (void *)((uintptr_t)ret + - (chunksize - offset)); - - /* Trailing space. */ - pages_unmap((void *)((uintptr_t)ret + size), - offset); - } else { - /* Trailing space only. */ - pages_unmap((void *)((uintptr_t)ret + size), - chunksize); - } - } else { - /* Clean up unneeded leading space. */ - pages_unmap(ret, chunksize - offset); - ret = (void *)((uintptr_t)ret + (chunksize - offset)); - } - } - - return (ret); -} - -static void * -chunk_alloc(size_t size, bool zero) -{ - void *ret; - - (void)zero; /* XXX */ - assert(size != 0); - assert((size & chunksize_mask) == 0); - -#ifdef MALLOC_DSS - if (opt_dss) { - ret = chunk_recycle_dss(size, zero); - if (ret != NULL) { - goto RETURN; - } - - ret = chunk_alloc_dss(size); - if (ret != NULL) - goto RETURN; - } - - if (opt_mmap) -#endif - { - ret = chunk_alloc_mmap(size); - if (ret != NULL) - goto RETURN; - } - - /* All strategies for allocation failed. 
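The reliable-but-expensive fallback in `chunk_alloc_mmap()` maps `size + chunksize` bytes: wherever the kernel places that range, it must contain a chunk-aligned sub-range of the requested size, and the leading and trailing slop is unmapped. The trim arithmetic can be exercised without any mapping; this sketch fixes the chunk size at 1 MiB and treats addresses as integers:

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_CHUNKSIZE	((uintptr_t)1 << 20)	/* assumed chunksize */
#define DEMO_CHUNKMASK	(DEMO_CHUNKSIZE - 1)

/*
 * Given the start of an over-sized mapping [ret, ret + size +
 * DEMO_CHUNKSIZE), return the chunk-aligned start of the usable
 * range. The (chunksize - offset) leading bytes and the remaining
 * trailing bytes are what pages_unmap() gives back in the real code.
 */
static uintptr_t
aligned_start_demo(uintptr_t ret)
{
	uintptr_t offset = ret & DEMO_CHUNKMASK;

	if (offset == 0)
		return (ret);	/* already aligned: only trailing trim */
	return (ret + (DEMO_CHUNKSIZE - offset));
}
```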
*/ - ret = NULL; -RETURN: -#ifdef MALLOC_STATS - if (ret != NULL) { - stats_chunks.nchunks += (size / chunksize); - stats_chunks.curchunks += (size / chunksize); - } - if (stats_chunks.curchunks > stats_chunks.highchunks) - stats_chunks.highchunks = stats_chunks.curchunks; -#endif - - assert(CHUNK_ADDR2BASE(ret) == ret); - return (ret); -} - -#ifdef MALLOC_DSS -static extent_node_t * -chunk_dealloc_dss_record(void *chunk, size_t size) -{ - extent_node_t *node, *prev, key; - - key.addr = (void *)((uintptr_t)chunk + size); - node = extent_tree_ad_nsearch(&dss_chunks_ad, &key); - /* Try to coalesce forward. */ - if (node != NULL && node->addr == key.addr) { - /* - * Coalesce chunk with the following address range. This does - * not change the position within dss_chunks_ad, so only - * remove/insert from/into dss_chunks_szad. - */ - extent_tree_szad_remove(&dss_chunks_szad, node); - node->addr = chunk; - node->size += size; - extent_tree_szad_insert(&dss_chunks_szad, node); - } else { - /* - * Coalescing forward failed, so insert a new node. Drop - * dss_mtx during node allocation, since it is possible that a - * new base chunk will be allocated. - */ - malloc_mutex_unlock(&dss_mtx); - node = base_node_alloc(); - malloc_mutex_lock(&dss_mtx); - if (node == NULL) - return (NULL); - node->addr = chunk; - node->size = size; - extent_tree_ad_insert(&dss_chunks_ad, node); - extent_tree_szad_insert(&dss_chunks_szad, node); - } - - /* Try to coalesce backward. */ - prev = extent_tree_ad_prev(&dss_chunks_ad, node); - if (prev != NULL && (void *)((uintptr_t)prev->addr + prev->size) == - chunk) { - /* - * Coalesce chunk with the previous address range. This does - * not change the position within dss_chunks_ad, so only - * remove/insert node from/into dss_chunks_szad. 
- */ - extent_tree_szad_remove(&dss_chunks_szad, prev); - extent_tree_ad_remove(&dss_chunks_ad, prev); - - extent_tree_szad_remove(&dss_chunks_szad, node); - node->addr = prev->addr; - node->size += prev->size; - extent_tree_szad_insert(&dss_chunks_szad, node); - - base_node_dealloc(prev); - } - - return (node); -} - -static bool -chunk_dealloc_dss(void *chunk, size_t size) -{ - - malloc_mutex_lock(&dss_mtx); - if ((uintptr_t)chunk >= (uintptr_t)dss_base - && (uintptr_t)chunk < (uintptr_t)dss_max) { - extent_node_t *node; - - /* Try to coalesce with other unused chunks. */ - node = chunk_dealloc_dss_record(chunk, size); - if (node != NULL) { - chunk = node->addr; - size = node->size; - } - - /* Get the current end of the DSS. */ - dss_max = sbrk(0); - - /* - * Try to shrink the DSS if this chunk is at the end of the - * DSS. The sbrk() call here is subject to a race condition - * with threads that use brk(2) or sbrk(2) directly, but the - * alternative would be to leak memory for the sake of poorly - * designed multi-threaded programs. - */ - if ((void *)((uintptr_t)chunk + size) == dss_max - && (dss_prev = sbrk(-(intptr_t)size)) == dss_max) { - /* Success. 
*/ - dss_max = (void *)((intptr_t)dss_prev - (intptr_t)size); - - if (node != NULL) { - extent_tree_szad_remove(&dss_chunks_szad, node); - extent_tree_ad_remove(&dss_chunks_ad, node); - base_node_dealloc(node); - } - malloc_mutex_unlock(&dss_mtx); - } else { - malloc_mutex_unlock(&dss_mtx); - madvise(chunk, size, MADV_DONTNEED); - } - - return (false); - } - malloc_mutex_unlock(&dss_mtx); - - return (true); -} -#endif - -static void -chunk_dealloc_mmap(void *chunk, size_t size) -{ - - pages_unmap(chunk, size); -} - -static void -chunk_dealloc(void *chunk, size_t size) -{ - - assert(chunk != NULL); - assert(CHUNK_ADDR2BASE(chunk) == chunk); - assert(size != 0); - assert((size & chunksize_mask) == 0); - -#ifdef MALLOC_STATS - stats_chunks.curchunks -= (size / chunksize); -#endif - -#ifdef MALLOC_DSS - if (opt_dss) { - if (chunk_dealloc_dss(chunk, size) == false) - return; - } - - if (opt_mmap) -#endif - chunk_dealloc_mmap(chunk, size); -} - -/* - * End chunk management functions. - */ -/******************************************************************************/ -/* - * Begin arena. - */ - -/* - * Choose an arena based on a per-thread value (fast-path code, calls slow-path - * code if necessary). - */ -static inline arena_t * -choose_arena(void) -{ - arena_t *ret; - - /* - * We can only use TLS if this is a PIC library, since for the static - * library version, libc's malloc is used by TLS allocation, which - * introduces a bootstrapping issue. - */ -#ifndef NO_TLS - if (__isthreaded == false) { - /* Avoid the overhead of TLS for single-threaded operation. */ - return (arenas[0]); - } - - ret = arenas_map; - if (ret == NULL) { - ret = choose_arena_hard(); - assert(ret != NULL); - } -#else - if (__isthreaded && narenas > 1) { - unsigned long ind; - - /* - * Hash pthread_self() to one of the arenas. There is a prime - * number of arenas, so this has a reasonable chance of - * working. 
Even so, the hashing can be easily thwarted by - * inconvenient pthread_self() values. Without specific - * knowledge of how pthread_self() calculates values, we can't - * easily do much better than this. - */ - ind = (unsigned long) pthread_self() % narenas; - - /* - * Optimistially assume that arenas[ind] has been initialized. - * At worst, we find out that some other thread has already - * done so, after acquiring the lock in preparation. Note that - * this lazy locking also has the effect of lazily forcing - * cache coherency; without the lock acquisition, there's no - * guarantee that modification of arenas[ind] by another thread - * would be seen on this CPU for an arbitrary amount of time. - * - * In general, this approach to modifying a synchronized value - * isn't a good idea, but in this case we only ever modify the - * value once, so things work out well. - */ - ret = arenas[ind]; - if (ret == NULL) { - /* - * Avoid races with another thread that may have already - * initialized arenas[ind]. - */ - malloc_spin_lock(&arenas_lock); - if (arenas[ind] == NULL) - ret = arenas_extend((unsigned)ind); - else - ret = arenas[ind]; - malloc_spin_unlock(&arenas_lock); - } - } else - ret = arenas[0]; -#endif - - assert(ret != NULL); - return (ret); -} - -#ifndef NO_TLS -/* - * Choose an arena based on a per-thread value (slow-path code only, called - * only by choose_arena()). - */ -static arena_t * -choose_arena_hard(void) -{ - arena_t *ret; - - assert(__isthreaded); - -#ifdef MALLOC_BALANCE - /* Seed the PRNG used for arena load balancing. 
*/ - SPRN(balance, (uint32_t)(uintptr_t)(pthread_self())); -#endif - - if (narenas > 1) { -#ifdef MALLOC_BALANCE - unsigned ind; - - ind = PRN(balance, narenas_2pow); - if ((ret = arenas[ind]) == NULL) { - malloc_spin_lock(&arenas_lock); - if ((ret = arenas[ind]) == NULL) - ret = arenas_extend(ind); - malloc_spin_unlock(&arenas_lock); - } -#else - malloc_spin_lock(&arenas_lock); - if ((ret = arenas[next_arena]) == NULL) - ret = arenas_extend(next_arena); - next_arena = (next_arena + 1) % narenas; - malloc_spin_unlock(&arenas_lock); -#endif - } else - ret = arenas[0]; - - arenas_map = ret; - - return (ret); -} -#endif - -static inline int -arena_chunk_comp(arena_chunk_t *a, arena_chunk_t *b) -{ - uintptr_t a_chunk = (uintptr_t)a; - uintptr_t b_chunk = (uintptr_t)b; - - assert(a != NULL); - assert(b != NULL); - - return ((a_chunk > b_chunk) - (a_chunk < b_chunk)); -} - -/* Wrap red-black tree macros in functions. */ -rb_wrap(static ATTRIBUTE_UNUSED, arena_chunk_tree_dirty_, arena_chunk_tree_t, - arena_chunk_t, link_dirty, arena_chunk_comp) - -static inline int -arena_run_comp(arena_chunk_map_t *a, arena_chunk_map_t *b) -{ - uintptr_t a_mapelm = (uintptr_t)a; - uintptr_t b_mapelm = (uintptr_t)b; - - assert(a != NULL); - assert(b != NULL); - - return ((a_mapelm > b_mapelm) - (a_mapelm < b_mapelm)); -} - -/* Wrap red-black tree macros in functions. */ -rb_wrap(static ATTRIBUTE_UNUSED, arena_run_tree_, arena_run_tree_t, arena_chunk_map_t, - link, arena_run_comp) - -static inline int -arena_avail_comp(arena_chunk_map_t *a, arena_chunk_map_t *b) -{ - int ret; - size_t a_size = a->bits & ~pagesize_mask; - size_t b_size = b->bits & ~pagesize_mask; - - ret = (a_size > b_size) - (a_size < b_size); - if (ret == 0) { - uintptr_t a_mapelm, b_mapelm; - - if ((a->bits & CHUNK_MAP_KEY) == 0) - a_mapelm = (uintptr_t)a; - else { - /* - * Treat keys as though they are lower than anything - * else. 
- */ - a_mapelm = 0; - } - b_mapelm = (uintptr_t)b; - - ret = (a_mapelm > b_mapelm) - (a_mapelm < b_mapelm); - } - - return (ret); -} - -/* Wrap red-black tree macros in functions. */ -rb_wrap(static ATTRIBUTE_UNUSED, arena_avail_tree_, arena_avail_tree_t, - arena_chunk_map_t, link, arena_avail_comp) - -static inline void * -arena_run_reg_alloc(arena_run_t *run, arena_bin_t *bin) -{ - void *ret; - unsigned i, mask, bit, regind; - - assert(run->magic == ARENA_RUN_MAGIC); - assert(run->regs_minelm < bin->regs_mask_nelms); - - /* - * Move the first check outside the loop, so that run->regs_minelm can - * be updated unconditionally, without the possibility of updating it - * multiple times. - */ - i = run->regs_minelm; - mask = run->regs_mask[i]; - if (mask != 0) { - /* Usable allocation found. */ - bit = ffs((int)mask) - 1; - - regind = ((i << (SIZEOF_INT_2POW + 3)) + bit); - assert(regind < bin->nregs); - ret = (void *)(((uintptr_t)run) + bin->reg0_offset - + (bin->reg_size * regind)); - - /* Clear bit. */ - mask ^= (1U << bit); - run->regs_mask[i] = mask; - - return (ret); - } - - for (i++; i < bin->regs_mask_nelms; i++) { - mask = run->regs_mask[i]; - if (mask != 0) { - /* Usable allocation found. */ - bit = ffs((int)mask) - 1; - - regind = ((i << (SIZEOF_INT_2POW + 3)) + bit); - assert(regind < bin->nregs); - ret = (void *)(((uintptr_t)run) + bin->reg0_offset - + (bin->reg_size * regind)); - - /* Clear bit. */ - mask ^= (1U << bit); - run->regs_mask[i] = mask; - - /* - * Make a note that nothing before this element - * contains a free region. - */ - run->regs_minelm = i; /* Low payoff: + (mask == 0); */ - - return (ret); - } - } - /* Not reached. */ - assert(0); - return (NULL); -} - -static inline void -arena_run_reg_dalloc(arena_run_t *run, arena_bin_t *bin, void *ptr, size_t size) -{ - unsigned diff, regind, elm, bit; - - assert(run->magic == ARENA_RUN_MAGIC); - - /* - * Avoid doing division with a variable divisor if possible. 
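`arena_run_reg_alloc()` above locates a free region by scanning a bitmap with `ffs()`: the lowest set bit is the lowest-numbered free slot, and XOR-ing that bit clears it once the slot is claimed. The core move, isolated (`claim_lowest_demo` is an illustrative name, not a symbol from this file):

```c
#include <assert.h>
#include <strings.h>	/* POSIX ffs() */

/*
 * Claim the lowest set bit of *mask; return its index, or -1 if the
 * mask is empty (a case the real code rules out with an assertion).
 */
static int
claim_lowest_demo(unsigned *mask)
{
	int bit;

	if (*mask == 0)
		return (-1);
	bit = ffs((int)*mask) - 1;	/* ffs() is 1-based */
	*mask ^= (1U << bit);		/* mark the region allocated */
	return (bit);
}
```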
Using - * actual division here can reduce allocator throughput by over 20%! - */ - diff = (unsigned)((uintptr_t)ptr - (uintptr_t)run - bin->reg0_offset); - if ((size & (size - 1)) == 0) { - /* - * log2_table allows fast division of a power of two in the - * [1..128] range. - * - * (x / divisor) becomes (x >> log2_table[divisor - 1]). - */ - static const unsigned char log2_table[] = { - 0, 1, 0, 2, 0, 0, 0, 3, 0, 0, 0, 0, 0, 0, 0, 4, - 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 5, - 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, - 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 6, - 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, - 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, - 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, - 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 7 - }; - - if (size <= 128) - regind = (diff >> log2_table[size - 1]); - else if (size <= 32768) - regind = diff >> (8 + log2_table[(size >> 8) - 1]); - else - regind = diff / size; - } else if (size < qspace_max) { - /* - * To divide by a number D that is not a power of two we - * multiply by (2^21 / D) and then right shift by 21 positions. - * - * X / D - * - * becomes - * - * (X * qsize_invs[(D >> QUANTUM_2POW) - 3]) - * >> SIZE_INV_SHIFT - * - * We can omit the first three elements, because we never - * divide by 0, and QUANTUM and 2*QUANTUM are both powers of - * two, which are handled above. 
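For power-of-two region sizes, the routine above divides with a shift looked up in `log2_table`. A sketch that computes the shift instead of embedding the 128-entry table, to show the two forms are equivalent:

```c
#include <assert.h>

/*
 * Exponent of a power of two: the shift amount that replaces a
 * division by d, i.e. x / d == x >> log2_pow2_demo(d).
 */
static unsigned
log2_pow2_demo(unsigned d)
{
	unsigned s = 0;

	while (d >>= 1)
		s++;
	return (s);
}
```

The real code uses the table precisely to avoid this loop on the hot deallocation path; the table trades 128 bytes of rodata for a single indexed load.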
- */ -#define SIZE_INV_SHIFT 21 -#define QSIZE_INV(s) (((1U << SIZE_INV_SHIFT) / (s << QUANTUM_2POW)) + 1) - static const unsigned qsize_invs[] = { - QSIZE_INV(3), - QSIZE_INV(4), QSIZE_INV(5), QSIZE_INV(6), QSIZE_INV(7) -#if (QUANTUM_2POW < 4) - , - QSIZE_INV(8), QSIZE_INV(9), QSIZE_INV(10), QSIZE_INV(11), - QSIZE_INV(12),QSIZE_INV(13), QSIZE_INV(14), QSIZE_INV(15) -#endif - }; - assert(QUANTUM * (((sizeof(qsize_invs)) / sizeof(unsigned)) + 3) - >= (1U << QSPACE_MAX_2POW_DEFAULT)); - - if (size <= (((sizeof(qsize_invs) / sizeof(unsigned)) + 2) << - QUANTUM_2POW)) { - regind = qsize_invs[(size >> QUANTUM_2POW) - 3] * diff; - regind >>= SIZE_INV_SHIFT; - } else - regind = diff / size; -#undef QSIZE_INV - } else if (size < cspace_max) { -#define CSIZE_INV(s) (((1U << SIZE_INV_SHIFT) / (s << CACHELINE_2POW)) + 1) - static const unsigned csize_invs[] = { - CSIZE_INV(3), - CSIZE_INV(4), CSIZE_INV(5), CSIZE_INV(6), CSIZE_INV(7) - }; - assert(CACHELINE * (((sizeof(csize_invs)) / sizeof(unsigned)) + - 3) >= (1U << CSPACE_MAX_2POW_DEFAULT)); - - if (size <= (((sizeof(csize_invs) / sizeof(unsigned)) + 2) << - CACHELINE_2POW)) { - regind = csize_invs[(size >> CACHELINE_2POW) - 3] * - diff; - regind >>= SIZE_INV_SHIFT; - } else - regind = diff / size; -#undef CSIZE_INV - } else { -#define SSIZE_INV(s) (((1U << SIZE_INV_SHIFT) / (s << SUBPAGE_2POW)) + 1) - static const unsigned ssize_invs[] = { - SSIZE_INV(3), - SSIZE_INV(4), SSIZE_INV(5), SSIZE_INV(6), SSIZE_INV(7), - SSIZE_INV(8), SSIZE_INV(9), SSIZE_INV(10), SSIZE_INV(11), - SSIZE_INV(12), SSIZE_INV(13), SSIZE_INV(14), SSIZE_INV(15) -#if (PAGESIZE_2POW == 13) - , - SSIZE_INV(16), SSIZE_INV(17), SSIZE_INV(18), SSIZE_INV(19), - SSIZE_INV(20), SSIZE_INV(21), SSIZE_INV(22), SSIZE_INV(23), - SSIZE_INV(24), SSIZE_INV(25), SSIZE_INV(26), SSIZE_INV(27), - SSIZE_INV(28), SSIZE_INV(29), SSIZE_INV(29), SSIZE_INV(30) -#endif - }; - assert(SUBPAGE * (((sizeof(ssize_invs)) / sizeof(unsigned)) + 3) - >= (1U << PAGESIZE_2POW)); - - if (size 
< (((sizeof(ssize_invs) / sizeof(unsigned)) + 2) << - SUBPAGE_2POW)) { - regind = ssize_invs[(size >> SUBPAGE_2POW) - 3] * diff; - regind >>= SIZE_INV_SHIFT; - } else - regind = diff / size; -#undef SSIZE_INV - } -#undef SIZE_INV_SHIFT - assert(diff == regind * size); - assert(regind < bin->nregs); - - elm = regind >> (SIZEOF_INT_2POW + 3); - if (elm < run->regs_minelm) - run->regs_minelm = elm; - bit = regind - (elm << (SIZEOF_INT_2POW + 3)); - assert((run->regs_mask[elm] & (1U << bit)) == 0); - run->regs_mask[elm] |= (1U << bit); -} - -static void -arena_run_split(arena_t *arena, arena_run_t *run, size_t size, bool large, - bool zero) -{ - arena_chunk_t *chunk; - size_t old_ndirty, run_ind, total_pages, need_pages, rem_pages, i; - - chunk = (arena_chunk_t *)CHUNK_ADDR2BASE(run); - old_ndirty = chunk->ndirty; - run_ind = (unsigned)(((uintptr_t)run - (uintptr_t)chunk) - >> pagesize_2pow); - total_pages = (chunk->map[run_ind].bits & ~pagesize_mask) >> - pagesize_2pow; - need_pages = (size >> pagesize_2pow); - assert(need_pages > 0); - assert(need_pages <= total_pages); - rem_pages = total_pages - need_pages; - - arena_avail_tree_remove(&arena->runs_avail, &chunk->map[run_ind]); - - /* Keep track of trailing unused pages for later use. */ - if (rem_pages > 0) { - chunk->map[run_ind+need_pages].bits = (rem_pages << - pagesize_2pow) | (chunk->map[run_ind+need_pages].bits & - pagesize_mask); - chunk->map[run_ind+total_pages-1].bits = (rem_pages << - pagesize_2pow) | (chunk->map[run_ind+total_pages-1].bits & - pagesize_mask); - arena_avail_tree_insert(&arena->runs_avail, - &chunk->map[run_ind+need_pages]); - } - - for (i = 0; i < need_pages; i++) { - /* Zero if necessary. */ - if (zero) { - if ((chunk->map[run_ind + i].bits & CHUNK_MAP_ZEROED) - == 0) { - memset((void *)((uintptr_t)chunk + ((run_ind - + i) << pagesize_2pow)), 0, pagesize); - /* CHUNK_MAP_ZEROED is cleared below. */ - } - } - - /* Update dirty page accounting. 
*/ - if (chunk->map[run_ind + i].bits & CHUNK_MAP_DIRTY) { - chunk->ndirty--; - arena->ndirty--; - /* CHUNK_MAP_DIRTY is cleared below. */ - } - - /* Initialize the chunk map. */ - if (large) { - chunk->map[run_ind + i].bits = CHUNK_MAP_LARGE - | CHUNK_MAP_ALLOCATED; - } else { - chunk->map[run_ind + i].bits = (size_t)run - | CHUNK_MAP_ALLOCATED; - } - } - - /* - * Set the run size only in the first element for large runs. This is - * primarily a debugging aid, since the lack of size info for trailing - * pages only matters if the application tries to operate on an - * interior pointer. - */ - if (large) - chunk->map[run_ind].bits |= size; - - if (chunk->ndirty == 0 && old_ndirty > 0) - arena_chunk_tree_dirty_remove(&arena->chunks_dirty, chunk); -} - -static arena_chunk_t * -arena_chunk_alloc(arena_t *arena) -{ - arena_chunk_t *chunk; - size_t i; - - if (arena->spare != NULL) { - chunk = arena->spare; - arena->spare = NULL; - } else { - chunk = (arena_chunk_t *)chunk_alloc(chunksize, true); - if (chunk == NULL) - return (NULL); -#ifdef MALLOC_STATS - arena->stats.mapped += chunksize; -#endif - - chunk->arena = arena; - - /* - * Claim that no pages are in use, since the header is merely - * overhead. - */ - chunk->ndirty = 0; - - /* - * Initialize the map to contain one maximal free untouched run. - */ - for (i = 0; i < arena_chunk_header_npages; i++) - chunk->map[i].bits = 0; - chunk->map[i].bits = arena_maxclass | CHUNK_MAP_ZEROED; - for (i++; i < chunk_npages-1; i++) { - chunk->map[i].bits = CHUNK_MAP_ZEROED; - } - chunk->map[chunk_npages-1].bits = arena_maxclass | - CHUNK_MAP_ZEROED; - } - - /* Insert the run into the runs_avail tree. 
*/ - arena_avail_tree_insert(&arena->runs_avail, - &chunk->map[arena_chunk_header_npages]); - - return (chunk); -} - -static void -arena_chunk_dealloc(arena_t *arena, arena_chunk_t *chunk) -{ - - if (arena->spare != NULL) { - if (arena->spare->ndirty > 0) { - arena_chunk_tree_dirty_remove( - &chunk->arena->chunks_dirty, arena->spare); - arena->ndirty -= arena->spare->ndirty; - } - chunk_dealloc((void *)arena->spare, chunksize); -#ifdef MALLOC_STATS - arena->stats.mapped -= chunksize; -#endif - } - - /* - * Remove run from runs_avail, regardless of whether this chunk - * will be cached, so that the arena does not use it. Dirty page - * flushing only uses the chunks_dirty tree, so leaving this chunk in - * the chunks_* trees is sufficient for that purpose. - */ - arena_avail_tree_remove(&arena->runs_avail, - &chunk->map[arena_chunk_header_npages]); - - arena->spare = chunk; -} - -static arena_run_t * -arena_run_alloc(arena_t *arena, size_t size, bool large, bool zero) -{ - arena_chunk_t *chunk; - arena_run_t *run; - arena_chunk_map_t *mapelm, key; - - assert(size <= arena_maxclass); - assert((size & pagesize_mask) == 0); - - /* Search the arena's chunks for the lowest best fit. */ - key.bits = size | CHUNK_MAP_KEY; - mapelm = arena_avail_tree_nsearch(&arena->runs_avail, &key); - if (mapelm != NULL) { - arena_chunk_t *run_chunk = CHUNK_ADDR2BASE(mapelm); - size_t pageind = ((uintptr_t)mapelm - (uintptr_t)run_chunk->map) - / sizeof(arena_chunk_map_t); - - run = (arena_run_t *)((uintptr_t)run_chunk + (pageind - << pagesize_2pow)); - arena_run_split(arena, run, size, large, zero); - return (run); - } - - /* - * No usable runs. Create a new chunk from which to allocate the run. - */ - chunk = arena_chunk_alloc(arena); - if (chunk == NULL) - return (NULL); - run = (arena_run_t *)((uintptr_t)chunk + (arena_chunk_header_npages << - pagesize_2pow)); - /* Update page map. 
*/ - arena_run_split(arena, run, size, large, zero); - return (run); -} - -static void -arena_purge(arena_t *arena) -{ - arena_chunk_t *chunk; - size_t i, npages; -#ifdef MALLOC_DEBUG - size_t ndirty = 0; - - rb_foreach_begin(arena_chunk_t, link_dirty, &arena->chunks_dirty, - chunk) { - ndirty += chunk->ndirty; - } rb_foreach_end(arena_chunk_t, link_dirty, &arena->chunks_dirty, chunk) - assert(ndirty == arena->ndirty); -#endif - assert(arena->ndirty > opt_dirty_max); - -#ifdef MALLOC_STATS - arena->stats.npurge++; -#endif - - /* - * Iterate downward through chunks until enough dirty memory has been - * purged. Terminate as soon as possible in order to minimize the - * number of system calls, even if a chunk has only been partially - * purged. - */ - while (arena->ndirty > (opt_dirty_max >> 1)) { - chunk = arena_chunk_tree_dirty_last(&arena->chunks_dirty); - assert(chunk != NULL); - - for (i = chunk_npages - 1; chunk->ndirty > 0; i--) { - assert(i >= arena_chunk_header_npages); - - if (chunk->map[i].bits & CHUNK_MAP_DIRTY) { - chunk->map[i].bits ^= CHUNK_MAP_DIRTY; - /* Find adjacent dirty run(s). 
*/ - for (npages = 1; i > arena_chunk_header_npages - && (chunk->map[i - 1].bits & - CHUNK_MAP_DIRTY); npages++) { - i--; - chunk->map[i].bits ^= CHUNK_MAP_DIRTY; - } - chunk->ndirty -= npages; - arena->ndirty -= npages; - - madvise((void *)((uintptr_t)chunk + (i << - pagesize_2pow)), (npages << pagesize_2pow), - MADV_DONTNEED); -#ifdef MALLOC_STATS - arena->stats.nmadvise++; - arena->stats.purged += npages; -#endif - if (arena->ndirty <= (opt_dirty_max >> 1)) - break; - } - } - - if (chunk->ndirty == 0) { - arena_chunk_tree_dirty_remove(&arena->chunks_dirty, - chunk); - } - } -} - -static void -arena_run_dalloc(arena_t *arena, arena_run_t *run, bool dirty) -{ - arena_chunk_t *chunk; - size_t size, run_ind, run_pages; - - chunk = (arena_chunk_t *)CHUNK_ADDR2BASE(run); - run_ind = (size_t)(((uintptr_t)run - (uintptr_t)chunk) - >> pagesize_2pow); - assert(run_ind >= arena_chunk_header_npages); - assert(run_ind < chunk_npages); - if ((chunk->map[run_ind].bits & CHUNK_MAP_LARGE) != 0) - size = chunk->map[run_ind].bits & ~pagesize_mask; - else - size = run->bin->run_size; - run_pages = (size >> pagesize_2pow); - - /* Mark pages as unallocated in the chunk map. */ - if (dirty) { - size_t i; - - for (i = 0; i < run_pages; i++) { - assert((chunk->map[run_ind + i].bits & CHUNK_MAP_DIRTY) - == 0); - chunk->map[run_ind + i].bits = CHUNK_MAP_DIRTY; - } - - if (chunk->ndirty == 0) { - arena_chunk_tree_dirty_insert(&arena->chunks_dirty, - chunk); - } - chunk->ndirty += run_pages; - arena->ndirty += run_pages; - } else { - size_t i; - - for (i = 0; i < run_pages; i++) { - chunk->map[run_ind + i].bits &= ~(CHUNK_MAP_LARGE | - CHUNK_MAP_ALLOCATED); - } - } - chunk->map[run_ind].bits = size | (chunk->map[run_ind].bits & - pagesize_mask); - chunk->map[run_ind+run_pages-1].bits = size | - (chunk->map[run_ind+run_pages-1].bits & pagesize_mask); - - /* Try to coalesce forward. 
*/ - if (run_ind + run_pages < chunk_npages && - (chunk->map[run_ind+run_pages].bits & CHUNK_MAP_ALLOCATED) == 0) { - size_t nrun_size = chunk->map[run_ind+run_pages].bits & - ~pagesize_mask; - - /* - * Remove successor from runs_avail; the coalesced run is - * inserted later. - */ - arena_avail_tree_remove(&arena->runs_avail, - &chunk->map[run_ind+run_pages]); - - size += nrun_size; - run_pages = size >> pagesize_2pow; - - assert((chunk->map[run_ind+run_pages-1].bits & ~pagesize_mask) - == nrun_size); - chunk->map[run_ind].bits = size | (chunk->map[run_ind].bits & - pagesize_mask); - chunk->map[run_ind+run_pages-1].bits = size | - (chunk->map[run_ind+run_pages-1].bits & pagesize_mask); - } - - /* Try to coalesce backward. */ - if (run_ind > arena_chunk_header_npages && (chunk->map[run_ind-1].bits & - CHUNK_MAP_ALLOCATED) == 0) { - size_t prun_size = chunk->map[run_ind-1].bits & ~pagesize_mask; - - run_ind -= prun_size >> pagesize_2pow; - - /* - * Remove predecessor from runs_avail; the coalesced run is - * inserted later. - */ - arena_avail_tree_remove(&arena->runs_avail, - &chunk->map[run_ind]); - - size += prun_size; - run_pages = size >> pagesize_2pow; - - assert((chunk->map[run_ind].bits & ~pagesize_mask) == - prun_size); - chunk->map[run_ind].bits = size | (chunk->map[run_ind].bits & - pagesize_mask); - chunk->map[run_ind+run_pages-1].bits = size | - (chunk->map[run_ind+run_pages-1].bits & pagesize_mask); - } - - /* Insert into runs_avail, now that coalescing is complete. */ - arena_avail_tree_insert(&arena->runs_avail, &chunk->map[run_ind]); - - /* Deallocate chunk if it is now completely unused. */ - if ((chunk->map[arena_chunk_header_npages].bits & (~pagesize_mask | - CHUNK_MAP_ALLOCATED)) == arena_maxclass) - arena_chunk_dealloc(arena, chunk); - - /* Enforce opt_dirty_max. 
*/ - if (arena->ndirty > opt_dirty_max) - arena_purge(arena); -} - -static void -arena_run_trim_head(arena_t *arena, arena_chunk_t *chunk, arena_run_t *run, - size_t oldsize, size_t newsize) -{ - size_t pageind = ((uintptr_t)run - (uintptr_t)chunk) >> pagesize_2pow; - size_t head_npages = (oldsize - newsize) >> pagesize_2pow; - - assert(oldsize > newsize); - - /* - * Update the chunk map so that arena_run_dalloc() can treat the - * leading run as separately allocated. - */ - chunk->map[pageind].bits = (oldsize - newsize) | CHUNK_MAP_LARGE | - CHUNK_MAP_ALLOCATED; - chunk->map[pageind+head_npages].bits = newsize | CHUNK_MAP_LARGE | - CHUNK_MAP_ALLOCATED; - - arena_run_dalloc(arena, run, false); -} - -static void -arena_run_trim_tail(arena_t *arena, arena_chunk_t *chunk, arena_run_t *run, - size_t oldsize, size_t newsize, bool dirty) -{ - size_t pageind = ((uintptr_t)run - (uintptr_t)chunk) >> pagesize_2pow; - size_t npages = newsize >> pagesize_2pow; - - assert(oldsize > newsize); - - /* - * Update the chunk map so that arena_run_dalloc() can treat the - * trailing run as separately allocated. - */ - chunk->map[pageind].bits = newsize | CHUNK_MAP_LARGE | - CHUNK_MAP_ALLOCATED; - chunk->map[pageind+npages].bits = (oldsize - newsize) | CHUNK_MAP_LARGE - | CHUNK_MAP_ALLOCATED; - - arena_run_dalloc(arena, (arena_run_t *)((uintptr_t)run + newsize), - dirty); -} - -static arena_run_t * -arena_bin_nonfull_run_get(arena_t *arena, arena_bin_t *bin) -{ - arena_chunk_map_t *mapelm; - arena_run_t *run; - unsigned i, remainder; - - /* Look for a usable run. */ - mapelm = arena_run_tree_first(&bin->runs); - if (mapelm != NULL) { - /* run is guaranteed to have available space. */ - arena_run_tree_remove(&bin->runs, mapelm); - run = (arena_run_t *)(mapelm->bits & ~pagesize_mask); -#ifdef MALLOC_STATS - bin->stats.reruns++; -#endif - return (run); - } - /* No existing runs have any space available. */ - - /* Allocate a new run. 
*/ - run = arena_run_alloc(arena, bin->run_size, false, false); - if (run == NULL) - return (NULL); - - /* Initialize run internals. */ - run->bin = bin; - - for (i = 0; i < bin->regs_mask_nelms - 1; i++) - run->regs_mask[i] = UINT_MAX; - remainder = bin->nregs & ((1U << (SIZEOF_INT_2POW + 3)) - 1); - if (remainder == 0) - run->regs_mask[i] = UINT_MAX; - else { - /* The last element has spare bits that need to be unset. */ - run->regs_mask[i] = (UINT_MAX >> ((1U << (SIZEOF_INT_2POW + 3)) - - remainder)); - } - - run->regs_minelm = 0; - - run->nfree = bin->nregs; -#ifdef MALLOC_DEBUG - run->magic = ARENA_RUN_MAGIC; -#endif - -#ifdef MALLOC_STATS - bin->stats.nruns++; - bin->stats.curruns++; - if (bin->stats.curruns > bin->stats.highruns) - bin->stats.highruns = bin->stats.curruns; -#endif - return (run); -} - -/* bin->runcur must have space available before this function is called. */ -static inline void * -arena_bin_malloc_easy(arena_t *arena, arena_bin_t *bin, arena_run_t *run) -{ - void *ret; - - (void)arena; /* XXX */ - assert(run->magic == ARENA_RUN_MAGIC); - assert(run->nfree > 0); - - ret = arena_run_reg_alloc(run, bin); - assert(ret != NULL); - run->nfree--; - - return (ret); -} - -/* Re-fill bin->runcur, then call arena_bin_malloc_easy(). */ -static void * -arena_bin_malloc_hard(arena_t *arena, arena_bin_t *bin) -{ - - bin->runcur = arena_bin_nonfull_run_get(arena, bin); - if (bin->runcur == NULL) - return (NULL); - assert(bin->runcur->magic == ARENA_RUN_MAGIC); - assert(bin->runcur->nfree > 0); - - return (arena_bin_malloc_easy(arena, bin, bin->runcur)); -} - -/* - * Calculate bin->run_size such that it meets the following constraints: - * - * *) bin->run_size >= min_run_size - * *) bin->run_size <= arena_maxclass - * *) bin->run_size <= RUN_MAX_SMALL - * *) run header overhead <= RUN_MAX_OVRHD (or header overhead relaxed). 
- * - * bin->nregs, bin->regs_mask_nelms, and bin->reg0_offset are - * also calculated here, since these settings are all interdependent. - */ -static size_t -arena_bin_run_size_calc(arena_bin_t *bin, size_t min_run_size) -{ - size_t try_run_size, good_run_size; - unsigned good_nregs, good_mask_nelms, good_reg0_offset; - unsigned try_nregs, try_mask_nelms, try_reg0_offset; - - assert(min_run_size >= pagesize); - assert(min_run_size <= arena_maxclass); - assert(min_run_size <= RUN_MAX_SMALL); - - /* - * Calculate known-valid settings before entering the run_size - * expansion loop, so that the first part of the loop always copies - * valid settings. - * - * The do..while loop iteratively reduces the number of regions until - * the run header and the regions no longer overlap. A closed formula - * would be quite messy, since there is an interdependency between the - * header's mask length and the number of regions. - */ - try_run_size = min_run_size; - try_nregs = ((try_run_size - sizeof(arena_run_t)) / bin->reg_size) - + 1; /* Counter-act try_nregs-- in loop. */ - do { - try_nregs--; - try_mask_nelms = (try_nregs >> (SIZEOF_INT_2POW + 3)) + - ((try_nregs & ((1U << (SIZEOF_INT_2POW + 3)) - 1)) ? 1 : 0); - try_reg0_offset = try_run_size - (try_nregs * bin->reg_size); - } while (sizeof(arena_run_t) + (sizeof(unsigned) * (try_mask_nelms - 1)) - > try_reg0_offset); - - /* run_size expansion loop. */ - do { - /* - * Copy valid settings before trying more aggressive settings. - */ - good_run_size = try_run_size; - good_nregs = try_nregs; - good_mask_nelms = try_mask_nelms; - good_reg0_offset = try_reg0_offset; - - /* Try more aggressive settings. */ - try_run_size += pagesize; - try_nregs = ((try_run_size - sizeof(arena_run_t)) / - bin->reg_size) + 1; /* Counter-act try_nregs-- in loop. */ - do { - try_nregs--; - try_mask_nelms = (try_nregs >> (SIZEOF_INT_2POW + 3)) + - ((try_nregs & ((1U << (SIZEOF_INT_2POW + 3)) - 1)) ? 
- 1 : 0); - try_reg0_offset = try_run_size - (try_nregs * - bin->reg_size); - } while (sizeof(arena_run_t) + (sizeof(unsigned) * - (try_mask_nelms - 1)) > try_reg0_offset); - } while (try_run_size <= arena_maxclass && try_run_size <= RUN_MAX_SMALL - && RUN_MAX_OVRHD * (bin->reg_size << 3) > RUN_MAX_OVRHD_RELAX - && (try_reg0_offset << RUN_BFP) > RUN_MAX_OVRHD * try_run_size); - - assert(sizeof(arena_run_t) + (sizeof(unsigned) * (good_mask_nelms - 1)) - <= good_reg0_offset); - assert((good_mask_nelms << (SIZEOF_INT_2POW + 3)) >= good_nregs); - - /* Copy final settings. */ - bin->run_size = good_run_size; - bin->nregs = good_nregs; - bin->regs_mask_nelms = good_mask_nelms; - bin->reg0_offset = good_reg0_offset; - - return (good_run_size); -} - -#ifdef MALLOC_BALANCE -static inline void -arena_lock_balance(arena_t *arena) -{ - unsigned contention; - - contention = malloc_spin_lock(&arena->lock); - if (narenas > 1) { - /* - * Calculate the exponentially averaged contention for this - * arena. Due to integer math always rounding down, this value - * decays somewhat faster than normal. 
- */ - arena->contention = (((uint64_t)arena->contention - * (uint64_t)((1U << BALANCE_ALPHA_INV_2POW)-1)) - + (uint64_t)contention) >> BALANCE_ALPHA_INV_2POW; - if (arena->contention >= opt_balance_threshold) - arena_lock_balance_hard(arena); - } -} - -static void -arena_lock_balance_hard(arena_t *arena) -{ - uint32_t ind; - - arena->contention = 0; -#ifdef MALLOC_STATS - arena->stats.nbalance++; -#endif - ind = PRN(balance, narenas_2pow); - if (arenas[ind] != NULL) - arenas_map = arenas[ind]; - else { - malloc_spin_lock(&arenas_lock); - if (arenas[ind] != NULL) - arenas_map = arenas[ind]; - else - arenas_map = arenas_extend(ind); - malloc_spin_unlock(&arenas_lock); - } -} -#endif - -#ifdef MALLOC_MAG -static inline void * -mag_alloc(mag_t *mag) -{ - - if (mag->nrounds == 0) - return (NULL); - mag->nrounds--; - - return (mag->rounds[mag->nrounds]); -} - -static void -mag_load(mag_t *mag) -{ - arena_t *arena; - arena_bin_t *bin; - arena_run_t *run; - void *round; - size_t i; - - arena = choose_arena(); - bin = &arena->bins[mag->binind]; -#ifdef MALLOC_BALANCE - arena_lock_balance(arena); -#else - malloc_spin_lock(&arena->lock); -#endif - for (i = mag->nrounds; i < max_rounds; i++) { - if ((run = bin->runcur) != NULL && run->nfree > 0) - round = arena_bin_malloc_easy(arena, bin, run); - else - round = arena_bin_malloc_hard(arena, bin); - if (round == NULL) - break; - mag->rounds[i] = round; - } -#ifdef MALLOC_STATS - bin->stats.nmags++; - arena->stats.nmalloc_small += (i - mag->nrounds); - arena->stats.allocated_small += (i - mag->nrounds) * bin->reg_size; -#endif - malloc_spin_unlock(&arena->lock); - mag->nrounds = i; -} - -static inline void * -mag_rack_alloc(mag_rack_t *rack, size_t size, bool zero) -{ - void *ret; - bin_mags_t *bin_mags; - mag_t *mag; - size_t binind; - - binind = size2bin[size]; - assert(binind < nbins); - bin_mags = &rack->bin_mags[binind]; - - mag = bin_mags->curmag; - if (mag == NULL) { - /* Create an initial magazine for this size class. 
*/ - assert(bin_mags->sparemag == NULL); - mag = mag_create(choose_arena(), binind); - if (mag == NULL) - return (NULL); - bin_mags->curmag = mag; - mag_load(mag); - } - - ret = mag_alloc(mag); - if (ret == NULL) { - if (bin_mags->sparemag != NULL) { - if (bin_mags->sparemag->nrounds > 0) { - /* Swap magazines. */ - bin_mags->curmag = bin_mags->sparemag; - bin_mags->sparemag = mag; - mag = bin_mags->curmag; - } else { - /* Reload the current magazine. */ - mag_load(mag); - } - } else { - /* Create a second magazine. */ - mag = mag_create(choose_arena(), binind); - if (mag == NULL) - return (NULL); - mag_load(mag); - bin_mags->sparemag = bin_mags->curmag; - bin_mags->curmag = mag; - } - ret = mag_alloc(mag); - if (ret == NULL) - return (NULL); - } - - if (zero == false) { - if (opt_junk) - memset(ret, 0xa5, size); - else if (opt_zero) - memset(ret, 0, size); - } else - memset(ret, 0, size); - - return (ret); -} -#endif - -static inline void * -arena_malloc_small(arena_t *arena, size_t size, bool zero) -{ - void *ret; - arena_bin_t *bin; - arena_run_t *run; - size_t binind; - - binind = size2bin[size]; - assert(binind < nbins); - bin = &arena->bins[binind]; - size = bin->reg_size; - -#ifdef MALLOC_BALANCE - arena_lock_balance(arena); -#else - malloc_spin_lock(&arena->lock); -#endif - if ((run = bin->runcur) != NULL && run->nfree > 0) - ret = arena_bin_malloc_easy(arena, bin, run); - else - ret = arena_bin_malloc_hard(arena, bin); - - if (ret == NULL) { - malloc_spin_unlock(&arena->lock); - return (NULL); - } - -#ifdef MALLOC_STATS - bin->stats.nrequests++; - arena->stats.nmalloc_small++; - arena->stats.allocated_small += size; -#endif - malloc_spin_unlock(&arena->lock); - - if (zero == false) { - if (opt_junk) - memset(ret, 0xa5, size); - else if (opt_zero) - memset(ret, 0, size); - } else - memset(ret, 0, size); - - return (ret); -} - -static void * -arena_malloc_large(arena_t *arena, size_t size, bool zero) -{ - void *ret; - - /* Large allocation. 
*/ - size = PAGE_CEILING(size); -#ifdef MALLOC_BALANCE - arena_lock_balance(arena); -#else - malloc_spin_lock(&arena->lock); -#endif - ret = (void *)arena_run_alloc(arena, size, true, zero); - if (ret == NULL) { - malloc_spin_unlock(&arena->lock); - return (NULL); - } -#ifdef MALLOC_STATS - arena->stats.nmalloc_large++; - arena->stats.allocated_large += size; -#endif - malloc_spin_unlock(&arena->lock); - - if (zero == false) { - if (opt_junk) - memset(ret, 0xa5, size); - else if (opt_zero) - memset(ret, 0, size); - } - - return (ret); -} - -static inline void * -arena_malloc(arena_t *arena, size_t size, bool zero) -{ - - assert(arena != NULL); - assert(arena->magic == ARENA_MAGIC); - assert(size != 0); - assert(QUANTUM_CEILING(size) <= arena_maxclass); - - if (size <= bin_maxclass) { -#ifdef MALLOC_MAG - if (__isthreaded && opt_mag) { - mag_rack_t *rack = mag_rack; - if (rack == NULL) { - rack = mag_rack_create(arena); - if (rack == NULL) - return (NULL); - mag_rack = rack; - } - return (mag_rack_alloc(rack, size, zero)); - } else -#endif - return (arena_malloc_small(arena, size, zero)); - } else - return (arena_malloc_large(arena, size, zero)); -} - -static inline void * -imalloc(size_t size) -{ - - assert(size != 0); - - if (size <= arena_maxclass) - return (arena_malloc(choose_arena(), size, false)); - else - return (huge_malloc(size, false)); -} - -static inline void * -icalloc(size_t size) -{ - - if (size <= arena_maxclass) - return (arena_malloc(choose_arena(), size, true)); - else - return (huge_malloc(size, true)); -} - -/* Only handles large allocations that require more than page alignment. 
*/ -static void * -arena_palloc(arena_t *arena, size_t alignment, size_t size, size_t alloc_size) -{ - void *ret; - size_t offset; - arena_chunk_t *chunk; - - assert((size & pagesize_mask) == 0); - assert((alignment & pagesize_mask) == 0); - -#ifdef MALLOC_BALANCE - arena_lock_balance(arena); -#else - malloc_spin_lock(&arena->lock); -#endif - ret = (void *)arena_run_alloc(arena, alloc_size, true, false); - if (ret == NULL) { - malloc_spin_unlock(&arena->lock); - return (NULL); - } - - chunk = (arena_chunk_t *)CHUNK_ADDR2BASE(ret); - - offset = (uintptr_t)ret & (alignment - 1); - assert((offset & pagesize_mask) == 0); - assert(offset < alloc_size); - if (offset == 0) - arena_run_trim_tail(arena, chunk, ret, alloc_size, size, false); - else { - size_t leadsize, trailsize; - - leadsize = alignment - offset; - if (leadsize > 0) { - arena_run_trim_head(arena, chunk, ret, alloc_size, - alloc_size - leadsize); - ret = (void *)((uintptr_t)ret + leadsize); - } - - trailsize = alloc_size - leadsize - size; - if (trailsize != 0) { - /* Trim trailing space. */ - assert(trailsize < alloc_size); - arena_run_trim_tail(arena, chunk, ret, size + trailsize, - size, false); - } - } - -#ifdef MALLOC_STATS - arena->stats.nmalloc_large++; - arena->stats.allocated_large += size; -#endif - malloc_spin_unlock(&arena->lock); - - if (opt_junk) - memset(ret, 0xa5, size); - else if (opt_zero) - memset(ret, 0, size); - return (ret); -} - -static inline void * -ipalloc(size_t alignment, size_t size) -{ - void *ret; - size_t ceil_size; - - /* - * Round size up to the nearest multiple of alignment. - * - * This done, we can take advantage of the fact that for each small - * size class, every object is aligned at the smallest power of two - * that is non-zero in the base two representation of the size. 
For - * example: - * - * Size | Base 2 | Minimum alignment - * -----+----------+------------------ - * 96 | 1100000 | 32 - * 144 | 10100000 | 32 - * 192 | 11000000 | 64 - * - * Depending on runtime settings, it is possible that arena_malloc() - * will further round up to a power of two, but that never causes - * correctness issues. - */ - ceil_size = (size + (alignment - 1)) & (-alignment); - /* - * (ceil_size < size) protects against the combination of maximal - * alignment and size greater than maximal alignment. - */ - if (ceil_size < size) { - /* size_t overflow. */ - return (NULL); - } - - if (ceil_size <= pagesize || (alignment <= pagesize - && ceil_size <= arena_maxclass)) - ret = arena_malloc(choose_arena(), ceil_size, false); - else { - size_t run_size; - - /* - * We can't achieve subpage alignment, so round up alignment - * permanently; it makes later calculations simpler. - */ - alignment = PAGE_CEILING(alignment); - ceil_size = PAGE_CEILING(size); - /* - * (ceil_size < size) protects against very large sizes within - * pagesize of SIZE_T_MAX. - * - * (ceil_size + alignment < ceil_size) protects against the - * combination of maximal alignment and ceil_size large enough - * to cause overflow. This is similar to the first overflow - * check above, but it needs to be repeated due to the new - * ceil_size value, which may now be *equal* to maximal - * alignment, whereas before we only detected overflow if the - * original size was *greater* than maximal alignment. - */ - if (ceil_size < size || ceil_size + alignment < ceil_size) { - /* size_t overflow. */ - return (NULL); - } - - /* - * Calculate the size of the over-size run that arena_palloc() - * would need to allocate in order to guarantee the alignment. 
- */ - if (ceil_size >= alignment) - run_size = ceil_size + alignment - pagesize; - else { - /* - * It is possible that (alignment << 1) will cause - * overflow, but it doesn't matter because we also - * subtract pagesize, which in the case of overflow - * leaves us with a very large run_size. That causes - * the first conditional below to fail, which means - * that the bogus run_size value never gets used for - * anything important. - */ - run_size = (alignment << 1) - pagesize; - } - - if (run_size <= arena_maxclass) { - ret = arena_palloc(choose_arena(), alignment, ceil_size, - run_size); - } else if (alignment <= chunksize) - ret = huge_malloc(ceil_size, false); - else - ret = huge_palloc(alignment, ceil_size); - } - - assert(((uintptr_t)ret & (alignment - 1)) == 0); - return (ret); -} - -/* Return the size of the allocation pointed to by ptr. */ -static size_t -arena_salloc(const void *ptr) -{ - size_t ret; - arena_chunk_t *chunk; - size_t pageind, mapbits; - - assert(ptr != NULL); - assert(CHUNK_ADDR2BASE(ptr) != ptr); - - chunk = (arena_chunk_t *)CHUNK_ADDR2BASE(ptr); - pageind = (((uintptr_t)ptr - (uintptr_t)chunk) >> pagesize_2pow); - mapbits = chunk->map[pageind].bits; - assert((mapbits & CHUNK_MAP_ALLOCATED) != 0); - if ((mapbits & CHUNK_MAP_LARGE) == 0) { - arena_run_t *run = (arena_run_t *)(mapbits & ~pagesize_mask); - assert(run->magic == ARENA_RUN_MAGIC); - ret = run->bin->reg_size; - } else { - ret = mapbits & ~pagesize_mask; - assert(ret != 0); - } - - return (ret); -} - -static inline size_t -isalloc(const void *ptr) -{ - size_t ret; - arena_chunk_t *chunk; - - assert(ptr != NULL); - - chunk = (arena_chunk_t *)CHUNK_ADDR2BASE(ptr); - if (chunk != ptr) { - /* Region. */ - assert(chunk->arena->magic == ARENA_MAGIC); - - ret = arena_salloc(ptr); - } else { - extent_node_t *node, key; - - /* Chunk (huge allocation). */ - - malloc_mutex_lock(&huge_mtx); - - /* Extract from tree of huge allocations. 
*/ - key.addr = __DECONST(void *, ptr); - node = extent_tree_ad_search(&huge, &key); - assert(node != NULL); - - ret = node->size; - - malloc_mutex_unlock(&huge_mtx); - } - - return (ret); -} - -static inline void -arena_dalloc_small(arena_t *arena, arena_chunk_t *chunk, void *ptr, - arena_chunk_map_t *mapelm) -{ - arena_run_t *run; - arena_bin_t *bin; - size_t size; - - run = (arena_run_t *)(mapelm->bits & ~pagesize_mask); - assert(run->magic == ARENA_RUN_MAGIC); - bin = run->bin; - size = bin->reg_size; - - if (opt_junk) - memset(ptr, 0x5a, size); - - arena_run_reg_dalloc(run, bin, ptr, size); - run->nfree++; - - if (run->nfree == bin->nregs) { - /* Deallocate run. */ - if (run == bin->runcur) - bin->runcur = NULL; - else if (bin->nregs != 1) { - size_t run_pageind = (((uintptr_t)run - - (uintptr_t)chunk)) >> pagesize_2pow; - arena_chunk_map_t *run_mapelm = - &chunk->map[run_pageind]; - /* - * This block's conditional is necessary because if the - * run only contains one region, then it never gets - * inserted into the non-full runs tree. - */ - arena_run_tree_remove(&bin->runs, run_mapelm); - } -#ifdef MALLOC_DEBUG - run->magic = 0; -#endif - arena_run_dalloc(arena, run, true); -#ifdef MALLOC_STATS - bin->stats.curruns--; -#endif - } else if (run->nfree == 1 && run != bin->runcur) { - /* - * Make sure that bin->runcur always refers to the lowest - * non-full run, if one exists. - */ - if (bin->runcur == NULL) - bin->runcur = run; - else if ((uintptr_t)run < (uintptr_t)bin->runcur) { - /* Switch runcur. */ - if (bin->runcur->nfree > 0) { - arena_chunk_t *runcur_chunk = - CHUNK_ADDR2BASE(bin->runcur); - size_t runcur_pageind = - (((uintptr_t)bin->runcur - - (uintptr_t)runcur_chunk)) >> pagesize_2pow; - arena_chunk_map_t *runcur_mapelm = - &runcur_chunk->map[runcur_pageind]; - - /* Insert runcur. 
*/ - arena_run_tree_insert(&bin->runs, - runcur_mapelm); - } - bin->runcur = run; - } else { - size_t run_pageind = (((uintptr_t)run - - (uintptr_t)chunk)) >> pagesize_2pow; - arena_chunk_map_t *run_mapelm = - &chunk->map[run_pageind]; - - assert(arena_run_tree_search(&bin->runs, run_mapelm) == - NULL); - arena_run_tree_insert(&bin->runs, run_mapelm); - } - } -#ifdef MALLOC_STATS - arena->stats.allocated_small -= size; - arena->stats.ndalloc_small++; -#endif -} - -#ifdef MALLOC_MAG -static void -mag_unload(mag_t *mag) -{ - arena_chunk_t *chunk; - arena_t *arena; - void *round; - size_t i, ndeferred, nrounds; - - for (ndeferred = mag->nrounds; ndeferred > 0;) { - nrounds = ndeferred; - /* Lock the arena associated with the first round. */ - chunk = (arena_chunk_t *)CHUNK_ADDR2BASE(mag->rounds[0]); - arena = chunk->arena; -#ifdef MALLOC_BALANCE - arena_lock_balance(arena); -#else - malloc_spin_lock(&arena->lock); -#endif - /* Deallocate every round that belongs to the locked arena. */ - for (i = ndeferred = 0; i < nrounds; i++) { - round = mag->rounds[i]; - chunk = (arena_chunk_t *)CHUNK_ADDR2BASE(round); - if (chunk->arena == arena) { - size_t pageind = (((uintptr_t)round - - (uintptr_t)chunk) >> pagesize_2pow); - arena_chunk_map_t *mapelm = - &chunk->map[pageind]; - arena_dalloc_small(arena, chunk, round, mapelm); - } else { - /* - * This round was allocated via a different - * arena than the one that is currently locked. - * Stash the round, so that it can be handled - * in a future pass. 
- */ - mag->rounds[ndeferred] = round; - ndeferred++; - } - } - malloc_spin_unlock(&arena->lock); - } - - mag->nrounds = 0; -} - -static inline void -mag_rack_dalloc(mag_rack_t *rack, void *ptr) -{ - arena_t *arena; - arena_chunk_t *chunk; - arena_run_t *run; - arena_bin_t *bin; - bin_mags_t *bin_mags; - mag_t *mag; - size_t pageind, binind; - arena_chunk_map_t *mapelm; - - chunk = (arena_chunk_t *)CHUNK_ADDR2BASE(ptr); - arena = chunk->arena; - pageind = (((uintptr_t)ptr - (uintptr_t)chunk) >> pagesize_2pow); - mapelm = &chunk->map[pageind]; - run = (arena_run_t *)(mapelm->bits & ~pagesize_mask); - assert(run->magic == ARENA_RUN_MAGIC); - bin = run->bin; - binind = ((uintptr_t)bin - (uintptr_t)&arena->bins) / - sizeof(arena_bin_t); - assert(binind < nbins); - - if (opt_junk) - memset(ptr, 0x5a, arena->bins[binind].reg_size); - - bin_mags = &rack->bin_mags[binind]; - mag = bin_mags->curmag; - if (mag == NULL) { - /* Create an initial magazine for this size class. */ - assert(bin_mags->sparemag == NULL); - mag = mag_create(choose_arena(), binind); - if (mag == NULL) { - malloc_spin_lock(&arena->lock); - arena_dalloc_small(arena, chunk, ptr, mapelm); - malloc_spin_unlock(&arena->lock); - return; - } - bin_mags->curmag = mag; - } - - if (mag->nrounds == max_rounds) { - if (bin_mags->sparemag != NULL) { - if (bin_mags->sparemag->nrounds < max_rounds) { - /* Swap magazines. */ - bin_mags->curmag = bin_mags->sparemag; - bin_mags->sparemag = mag; - mag = bin_mags->curmag; - } else { - /* Unload the current magazine. */ - mag_unload(mag); - } - } else { - /* Create a second magazine. 
*/ - mag = mag_create(choose_arena(), binind); - if (mag == NULL) { - mag = rack->bin_mags[binind].curmag; - mag_unload(mag); - } else { - bin_mags->sparemag = bin_mags->curmag; - bin_mags->curmag = mag; - } - } - assert(mag->nrounds < max_rounds); - } - mag->rounds[mag->nrounds] = ptr; - mag->nrounds++; -} -#endif - -static void -arena_dalloc_large(arena_t *arena, arena_chunk_t *chunk, void *ptr) -{ - /* Large allocation. */ - malloc_spin_lock(&arena->lock); - -#ifndef MALLOC_STATS - if (opt_junk) -#endif - { - size_t pageind = ((uintptr_t)ptr - (uintptr_t)chunk) >> - pagesize_2pow; - size_t size = chunk->map[pageind].bits & ~pagesize_mask; - -#ifdef MALLOC_STATS - if (opt_junk) -#endif - memset(ptr, 0x5a, size); -#ifdef MALLOC_STATS - arena->stats.allocated_large -= size; -#endif - } -#ifdef MALLOC_STATS - arena->stats.ndalloc_large++; -#endif - - arena_run_dalloc(arena, (arena_run_t *)ptr, true); - malloc_spin_unlock(&arena->lock); -} - -static inline void -arena_dalloc(arena_t *arena, arena_chunk_t *chunk, void *ptr) -{ - size_t pageind; - arena_chunk_map_t *mapelm; - - assert(arena != NULL); - assert(arena->magic == ARENA_MAGIC); - assert(chunk->arena == arena); - assert(ptr != NULL); - assert(CHUNK_ADDR2BASE(ptr) != ptr); - - pageind = (((uintptr_t)ptr - (uintptr_t)chunk) >> pagesize_2pow); - mapelm = &chunk->map[pageind]; - assert((mapelm->bits & CHUNK_MAP_ALLOCATED) != 0); - if ((mapelm->bits & CHUNK_MAP_LARGE) == 0) { - /* Small allocation. 
*/ -#ifdef MALLOC_MAG - if (__isthreaded && opt_mag) { - mag_rack_t *rack = mag_rack; - if (rack == NULL) { - rack = mag_rack_create(arena); - if (rack == NULL) { - malloc_spin_lock(&arena->lock); - arena_dalloc_small(arena, chunk, ptr, - mapelm); - malloc_spin_unlock(&arena->lock); - } - mag_rack = rack; - } - mag_rack_dalloc(rack, ptr); - } else { -#endif - malloc_spin_lock(&arena->lock); - arena_dalloc_small(arena, chunk, ptr, mapelm); - malloc_spin_unlock(&arena->lock); -#ifdef MALLOC_MAG - } -#endif - } else - arena_dalloc_large(arena, chunk, ptr); -} - -static inline void -idalloc(void *ptr) -{ - arena_chunk_t *chunk; - - assert(ptr != NULL); - - chunk = (arena_chunk_t *)CHUNK_ADDR2BASE(ptr); - if (chunk != ptr) - arena_dalloc(chunk->arena, chunk, ptr); - else - huge_dalloc(ptr); -} - -static void -arena_ralloc_large_shrink(arena_t *arena, arena_chunk_t *chunk, void *ptr, - size_t size, size_t oldsize) -{ - - assert(size < oldsize); - - /* - * Shrink the run, and make trailing pages available for other - * allocations. - */ -#ifdef MALLOC_BALANCE - arena_lock_balance(arena); -#else - malloc_spin_lock(&arena->lock); -#endif - arena_run_trim_tail(arena, chunk, (arena_run_t *)ptr, oldsize, size, - true); -#ifdef MALLOC_STATS - arena->stats.allocated_large -= oldsize - size; -#endif - malloc_spin_unlock(&arena->lock); -} - -static bool -arena_ralloc_large_grow(arena_t *arena, arena_chunk_t *chunk, void *ptr, - size_t size, size_t oldsize) -{ - size_t pageind = ((uintptr_t)ptr - (uintptr_t)chunk) >> pagesize_2pow; - size_t npages = oldsize >> pagesize_2pow; - - assert(oldsize == (chunk->map[pageind].bits & ~pagesize_mask)); - - /* Try to extend the run. 
*/ - assert(size > oldsize); -#ifdef MALLOC_BALANCE - arena_lock_balance(arena); -#else - malloc_spin_lock(&arena->lock); -#endif - if (pageind + npages < chunk_npages && (chunk->map[pageind+npages].bits - & CHUNK_MAP_ALLOCATED) == 0 && (chunk->map[pageind+npages].bits & - ~pagesize_mask) >= size - oldsize) { - /* - * The next run is available and sufficiently large. Split the - * following run, then merge the first part with the existing - * allocation. - */ - arena_run_split(arena, (arena_run_t *)((uintptr_t)chunk + - ((pageind+npages) << pagesize_2pow)), size - oldsize, true, - false); - - chunk->map[pageind].bits = size | CHUNK_MAP_LARGE | - CHUNK_MAP_ALLOCATED; - chunk->map[pageind+npages].bits = CHUNK_MAP_LARGE | - CHUNK_MAP_ALLOCATED; - -#ifdef MALLOC_STATS - arena->stats.allocated_large += size - oldsize; -#endif - malloc_spin_unlock(&arena->lock); - return (false); - } - malloc_spin_unlock(&arena->lock); - - return (true); -} - -/* - * Try to resize a large allocation, in order to avoid copying. This will - * always fail if growing an object, and the following run is already in use. - */ -static bool -arena_ralloc_large(void *ptr, size_t size, size_t oldsize) -{ - size_t psize; - - psize = PAGE_CEILING(size); - if (psize == oldsize) { - /* Same size class. */ - if (opt_junk && size < oldsize) { - memset((void *)((uintptr_t)ptr + size), 0x5a, oldsize - - size); - } - return (false); - } else { - arena_chunk_t *chunk; - arena_t *arena; - - chunk = (arena_chunk_t *)CHUNK_ADDR2BASE(ptr); - arena = chunk->arena; - assert(arena->magic == ARENA_MAGIC); - - if (psize < oldsize) { - /* Fill before shrinking in order avoid a race. 
*/ - if (opt_junk) { - memset((void *)((uintptr_t)ptr + size), 0x5a, - oldsize - size); - } - arena_ralloc_large_shrink(arena, chunk, ptr, psize, - oldsize); - return (false); - } else { - bool ret = arena_ralloc_large_grow(arena, chunk, ptr, - psize, oldsize); - if (ret == false && opt_zero) { - memset((void *)((uintptr_t)ptr + oldsize), 0, - size - oldsize); - } - return (ret); - } - } -} - -static void * -arena_ralloc(void *ptr, size_t size, size_t oldsize) -{ - void *ret; - size_t copysize; - - /* Try to avoid moving the allocation. */ - if (size <= bin_maxclass) { - if (oldsize <= bin_maxclass && size2bin[size] == - size2bin[oldsize]) - goto IN_PLACE; - } else { - if (oldsize > bin_maxclass && oldsize <= arena_maxclass) { - assert(size > bin_maxclass); - if (arena_ralloc_large(ptr, size, oldsize) == false) - return (ptr); - } - } - - /* - * If we get here, then size and oldsize are different enough that we - * need to move the object. In that case, fall back to allocating new - * space and copying. - */ - ret = arena_malloc(choose_arena(), size, false); - if (ret == NULL) - return (NULL); - - /* Junk/zero-filling were already done by arena_malloc(). */ - copysize = (size < oldsize) ? 
size : oldsize; - memcpy(ret, ptr, copysize); - idalloc(ptr); - return (ret); -IN_PLACE: - if (opt_junk && size < oldsize) - memset((void *)((uintptr_t)ptr + size), 0x5a, oldsize - size); - else if (opt_zero && size > oldsize) - memset((void *)((uintptr_t)ptr + oldsize), 0, size - oldsize); - return (ptr); -} - -static inline void * -iralloc(void *ptr, size_t size) -{ - size_t oldsize; - - assert(ptr != NULL); - assert(size != 0); - - oldsize = isalloc(ptr); - - if (size <= arena_maxclass) - return (arena_ralloc(ptr, size, oldsize)); - else - return (huge_ralloc(ptr, size, oldsize)); -} - -static bool -arena_new(arena_t *arena) -{ - unsigned i; - arena_bin_t *bin; - size_t prev_run_size; - - if (malloc_spin_init(&arena->lock)) - return (true); - -#ifdef MALLOC_STATS - memset(&arena->stats, 0, sizeof(arena_stats_t)); -#endif - - /* Initialize chunks. */ - arena_chunk_tree_dirty_new(&arena->chunks_dirty); - arena->spare = NULL; - - arena->ndirty = 0; - - arena_avail_tree_new(&arena->runs_avail); - -#ifdef MALLOC_BALANCE - arena->contention = 0; -#endif - - /* Initialize bins. */ - prev_run_size = pagesize; - - i = 0; -#ifdef MALLOC_TINY - /* (2^n)-spaced tiny bins. */ - for (; i < ntbins; i++) { - bin = &arena->bins[i]; - bin->runcur = NULL; - arena_run_tree_new(&bin->runs); - - bin->reg_size = (1U << (TINY_MIN_2POW + i)); - - prev_run_size = arena_bin_run_size_calc(bin, prev_run_size); - -#ifdef MALLOC_STATS - memset(&bin->stats, 0, sizeof(malloc_bin_stats_t)); -#endif - } -#endif - - /* Quantum-spaced bins. */ - for (; i < ntbins + nqbins; i++) { - bin = &arena->bins[i]; - bin->runcur = NULL; - arena_run_tree_new(&bin->runs); - - bin->reg_size = (i - ntbins + 1) << QUANTUM_2POW; - - prev_run_size = arena_bin_run_size_calc(bin, prev_run_size); - -#ifdef MALLOC_STATS - memset(&bin->stats, 0, sizeof(malloc_bin_stats_t)); -#endif - } - - /* Cacheline-spaced bins. 
*/ - for (; i < ntbins + nqbins + ncbins; i++) { - bin = &arena->bins[i]; - bin->runcur = NULL; - arena_run_tree_new(&bin->runs); - - bin->reg_size = cspace_min + ((i - (ntbins + nqbins)) << - CACHELINE_2POW); - - prev_run_size = arena_bin_run_size_calc(bin, prev_run_size); - -#ifdef MALLOC_STATS - memset(&bin->stats, 0, sizeof(malloc_bin_stats_t)); -#endif - } - - /* Subpage-spaced bins. */ - for (; i < nbins; i++) { - bin = &arena->bins[i]; - bin->runcur = NULL; - arena_run_tree_new(&bin->runs); - - bin->reg_size = sspace_min + ((i - (ntbins + nqbins + ncbins)) - << SUBPAGE_2POW); - - prev_run_size = arena_bin_run_size_calc(bin, prev_run_size); - -#ifdef MALLOC_STATS - memset(&bin->stats, 0, sizeof(malloc_bin_stats_t)); -#endif - } - -#ifdef MALLOC_DEBUG - arena->magic = ARENA_MAGIC; -#endif - - return (false); -} - -/* Create a new arena and insert it into the arenas array at index ind. */ -static arena_t * -arenas_extend(unsigned ind) -{ - arena_t *ret; - - /* Allocate enough space for trailing bins. */ - ret = (arena_t *)base_alloc(sizeof(arena_t) - + (sizeof(arena_bin_t) * (nbins - 1))); - if (ret != NULL && arena_new(ret) == false) { - arenas[ind] = ret; - return (ret); - } - /* Only reached if there is an OOM error. */ - - /* - * OOM here is quite inconvenient to propagate, since dealing with it - * would require a check for failure in the fast path. Instead, punt - * by using arenas[0]. In practice, this is an extremely unlikely - * failure. 
- */ - _malloc_message(_getprogname(), - ": (malloc) Error initializing arena\n", "", ""); - if (opt_abort) - abort(); - - return (arenas[0]); -} - -#ifdef MALLOC_MAG -static mag_t * -mag_create(arena_t *arena, size_t binind) -{ - mag_t *ret; - - if (sizeof(mag_t) + (sizeof(void *) * (max_rounds - 1)) <= - bin_maxclass) { - ret = arena_malloc_small(arena, sizeof(mag_t) + (sizeof(void *) - * (max_rounds - 1)), false); - } else { - ret = imalloc(sizeof(mag_t) + (sizeof(void *) * (max_rounds - - 1))); - } - if (ret == NULL) - return (NULL); - ret->binind = binind; - ret->nrounds = 0; - - return (ret); -} - -static void -mag_destroy(mag_t *mag) -{ - arena_t *arena; - arena_chunk_t *chunk; - size_t pageind; - arena_chunk_map_t *mapelm; - - chunk = CHUNK_ADDR2BASE(mag); - arena = chunk->arena; - pageind = (((uintptr_t)mag - (uintptr_t)chunk) >> pagesize_2pow); - mapelm = &chunk->map[pageind]; - - assert(mag->nrounds == 0); - if (sizeof(mag_t) + (sizeof(void *) * (max_rounds - 1)) <= - bin_maxclass) { - malloc_spin_lock(&arena->lock); - arena_dalloc_small(arena, chunk, mag, mapelm); - malloc_spin_unlock(&arena->lock); - } else - idalloc(mag); -} - -static mag_rack_t * -mag_rack_create(arena_t *arena) -{ - - assert(sizeof(mag_rack_t) + (sizeof(bin_mags_t *) * (nbins - 1)) <= - bin_maxclass); - return (arena_malloc_small(arena, sizeof(mag_rack_t) + - (sizeof(bin_mags_t) * (nbins - 1)), true)); -} - -static void -mag_rack_destroy(mag_rack_t *rack) -{ - arena_t *arena; - arena_chunk_t *chunk; - bin_mags_t *bin_mags; - size_t i, pageind; - arena_chunk_map_t *mapelm; - - for (i = 0; i < nbins; i++) { - bin_mags = &rack->bin_mags[i]; - if (bin_mags->curmag != NULL) { - assert(bin_mags->curmag->binind == i); - mag_unload(bin_mags->curmag); - mag_destroy(bin_mags->curmag); - } - if (bin_mags->sparemag != NULL) { - assert(bin_mags->sparemag->binind == i); - mag_unload(bin_mags->sparemag); - mag_destroy(bin_mags->sparemag); - } - } - - chunk = CHUNK_ADDR2BASE(rack); - arena = 
chunk->arena; - pageind = (((uintptr_t)rack - (uintptr_t)chunk) >> pagesize_2pow); - mapelm = &chunk->map[pageind]; - - malloc_spin_lock(&arena->lock); - arena_dalloc_small(arena, chunk, rack, mapelm); - malloc_spin_unlock(&arena->lock); -} -#endif - -/* - * End arena. - */ -/******************************************************************************/ -/* - * Begin general internal functions. - */ - -static void * -huge_malloc(size_t size, bool zero) -{ - void *ret; - size_t csize; - extent_node_t *node; - - /* Allocate one or more contiguous chunks for this request. */ - - csize = CHUNK_CEILING(size); - if (csize == 0) { - /* size is large enough to cause size_t wrap-around. */ - return (NULL); - } - - /* Allocate an extent node with which to track the chunk. */ - node = base_node_alloc(); - if (node == NULL) - return (NULL); - - ret = chunk_alloc(csize, zero); - if (ret == NULL) { - base_node_dealloc(node); - return (NULL); - } - - /* Insert node into huge. */ - node->addr = ret; - node->size = csize; - - malloc_mutex_lock(&huge_mtx); - extent_tree_ad_insert(&huge, node); -#ifdef MALLOC_STATS - huge_nmalloc++; - huge_allocated += csize; -#endif - malloc_mutex_unlock(&huge_mtx); - - if (zero == false) { - if (opt_junk) - memset(ret, 0xa5, csize); - else if (opt_zero) - memset(ret, 0, csize); - } - - return (ret); -} - -/* Only handles large allocations that require more than chunk alignment. */ -static void * -huge_palloc(size_t alignment, size_t size) -{ - void *ret; - size_t alloc_size, chunk_size, offset; - extent_node_t *node; - - /* - * This allocation requires alignment that is even larger than chunk - * alignment. This means that huge_malloc() isn't good enough. - * - * Allocate almost twice as many chunks as are demanded by the size or - * alignment, in order to assure the alignment can be achieved, then - * unmap leading and trailing chunks. 
- */ - assert(alignment >= chunksize); - - chunk_size = CHUNK_CEILING(size); - - if (size >= alignment) - alloc_size = chunk_size + alignment - chunksize; - else - alloc_size = (alignment << 1) - chunksize; - - /* Allocate an extent node with which to track the chunk. */ - node = base_node_alloc(); - if (node == NULL) - return (NULL); - - ret = chunk_alloc(alloc_size, false); - if (ret == NULL) { - base_node_dealloc(node); - return (NULL); - } - - offset = (uintptr_t)ret & (alignment - 1); - assert((offset & chunksize_mask) == 0); - assert(offset < alloc_size); - if (offset == 0) { - /* Trim trailing space. */ - chunk_dealloc((void *)((uintptr_t)ret + chunk_size), alloc_size - - chunk_size); - } else { - size_t trailsize; - - /* Trim leading space. */ - chunk_dealloc(ret, alignment - offset); - - ret = (void *)((uintptr_t)ret + (alignment - offset)); - - trailsize = alloc_size - (alignment - offset) - chunk_size; - if (trailsize != 0) { - /* Trim trailing space. */ - assert(trailsize < alloc_size); - chunk_dealloc((void *)((uintptr_t)ret + chunk_size), - trailsize); - } - } - - /* Insert node into huge. */ - node->addr = ret; - node->size = chunk_size; - - malloc_mutex_lock(&huge_mtx); - extent_tree_ad_insert(&huge, node); -#ifdef MALLOC_STATS - huge_nmalloc++; - huge_allocated += chunk_size; -#endif - malloc_mutex_unlock(&huge_mtx); - - if (opt_junk) - memset(ret, 0xa5, chunk_size); - else if (opt_zero) - memset(ret, 0, chunk_size); - - return (ret); -} - -static void * -huge_ralloc(void *ptr, size_t size, size_t oldsize) -{ - void *ret; - size_t copysize; - - /* Avoid moving the allocation if the size class would not change. 
*/ - if (oldsize > arena_maxclass && - CHUNK_CEILING(size) == CHUNK_CEILING(oldsize)) { - if (opt_junk && size < oldsize) { - memset((void *)((uintptr_t)ptr + size), 0x5a, oldsize - - size); - } else if (opt_zero && size > oldsize) { - memset((void *)((uintptr_t)ptr + oldsize), 0, size - - oldsize); - } - return (ptr); - } - - /* - * If we get here, then size and oldsize are different enough that we - * need to use a different size class. In that case, fall back to - * allocating new space and copying. - */ - ret = huge_malloc(size, false); - if (ret == NULL) - return (NULL); - - copysize = (size < oldsize) ? size : oldsize; - memcpy(ret, ptr, copysize); - idalloc(ptr); - return (ret); -} - -static void -huge_dalloc(void *ptr) -{ - extent_node_t *node, key; - - malloc_mutex_lock(&huge_mtx); - - /* Extract from tree of huge allocations. */ - key.addr = ptr; - node = extent_tree_ad_search(&huge, &key); - assert(node != NULL); - assert(node->addr == ptr); - extent_tree_ad_remove(&huge, node); - -#ifdef MALLOC_STATS - huge_ndalloc++; - huge_allocated -= node->size; -#endif - - malloc_mutex_unlock(&huge_mtx); - - /* Unmap chunk. */ -#ifdef MALLOC_DSS - if (opt_dss && opt_junk) - memset(node->addr, 0x5a, node->size); -#endif - chunk_dealloc(node->addr, node->size); - - base_node_dealloc(node); -} - -static void -malloc_print_stats(void) -{ - - if (opt_print_stats) { - char s[UMAX2S_BUFSIZE]; - _malloc_message("___ Begin malloc statistics ___\n", "", "", - ""); - _malloc_message("Assertions ", -#ifdef NDEBUG - "disabled", -#else - "enabled", -#endif - "\n", ""); - _malloc_message("Boolean MALLOC_OPTIONS: ", - opt_abort ? "A" : "a", "", ""); -#ifdef MALLOC_DSS - _malloc_message(opt_dss ? "D" : "d", "", "", ""); -#endif -#ifdef MALLOC_MAG - _malloc_message(opt_mag ? "G" : "g", "", "", ""); -#endif - _malloc_message(opt_junk ? "J" : "j", "", "", ""); -#ifdef MALLOC_DSS - _malloc_message(opt_mmap ? "M" : "m", "", "", ""); -#endif - _malloc_message(opt_utrace ? 
"PU" : "Pu", - opt_sysv ? "V" : "v", - opt_xmalloc ? "X" : "x", - opt_zero ? "Z\n" : "z\n"); - - _malloc_message("CPUs: ", umax2s(ncpus, s), "\n", ""); - _malloc_message("Max arenas: ", umax2s(narenas, s), "\n", ""); -#ifdef MALLOC_BALANCE - _malloc_message("Arena balance threshold: ", - umax2s(opt_balance_threshold, s), "\n", ""); -#endif - _malloc_message("Pointer size: ", umax2s(sizeof(void *), s), - "\n", ""); - _malloc_message("Quantum size: ", umax2s(QUANTUM, s), "\n", ""); - _malloc_message("Cacheline size (assumed): ", umax2s(CACHELINE, - s), "\n", ""); -#ifdef MALLOC_TINY - _malloc_message("Tiny 2^n-spaced sizes: [", umax2s((1U << - TINY_MIN_2POW), s), "..", ""); - _malloc_message(umax2s((qspace_min >> 1), s), "]\n", "", ""); -#endif - _malloc_message("Quantum-spaced sizes: [", umax2s(qspace_min, - s), "..", ""); - _malloc_message(umax2s(qspace_max, s), "]\n", "", ""); - _malloc_message("Cacheline-spaced sizes: [", umax2s(cspace_min, - s), "..", ""); - _malloc_message(umax2s(cspace_max, s), "]\n", "", ""); - _malloc_message("Subpage-spaced sizes: [", umax2s(sspace_min, - s), "..", ""); - _malloc_message(umax2s(sspace_max, s), "]\n", "", ""); -#ifdef MALLOC_MAG - _malloc_message("Rounds per magazine: ", umax2s(max_rounds, s), - "\n", ""); -#endif - _malloc_message("Max dirty pages per arena: ", - umax2s(opt_dirty_max, s), "\n", ""); - - _malloc_message("Chunk size: ", umax2s(chunksize, s), "", ""); - _malloc_message(" (2^", umax2s(opt_chunk_2pow, s), ")\n", ""); - -#ifdef MALLOC_STATS - { - size_t allocated, mapped; -#ifdef MALLOC_BALANCE - uint64_t nbalance = 0; -#endif - unsigned i; - arena_t *arena; - - /* Calculate and print allocated/mapped stats. */ - - /* arenas. 
*/ - for (i = 0, allocated = 0; i < narenas; i++) { - if (arenas[i] != NULL) { - malloc_spin_lock(&arenas[i]->lock); - allocated += - arenas[i]->stats.allocated_small; - allocated += - arenas[i]->stats.allocated_large; -#ifdef MALLOC_BALANCE - nbalance += arenas[i]->stats.nbalance; -#endif - malloc_spin_unlock(&arenas[i]->lock); - } - } - - /* huge/base. */ - malloc_mutex_lock(&huge_mtx); - allocated += huge_allocated; - mapped = stats_chunks.curchunks * chunksize; - malloc_mutex_unlock(&huge_mtx); - - malloc_mutex_lock(&base_mtx); - mapped += base_mapped; - malloc_mutex_unlock(&base_mtx); - - malloc_printf("Allocated: %zu, mapped: %zu\n", - allocated, mapped); - -#ifdef MALLOC_BALANCE - malloc_printf("Arena balance reassignments: %llu\n", - nbalance); -#endif - - /* Print chunk stats. */ - { - chunk_stats_t chunks_stats; - - malloc_mutex_lock(&huge_mtx); - chunks_stats = stats_chunks; - malloc_mutex_unlock(&huge_mtx); - - malloc_printf("chunks: nchunks " - "highchunks curchunks\n"); - malloc_printf(" %13llu%13lu%13lu\n", - chunks_stats.nchunks, - chunks_stats.highchunks, - chunks_stats.curchunks); - } - - /* Print chunk stats. */ - malloc_printf( - "huge: nmalloc ndalloc allocated\n"); - malloc_printf(" %12llu %12llu %12zu\n", - huge_nmalloc, huge_ndalloc, huge_allocated); - - /* Print stats for each arena. */ - for (i = 0; i < narenas; i++) { - arena = arenas[i]; - if (arena != NULL) { - malloc_printf( - "\narenas[%u]:\n", i); - malloc_spin_lock(&arena->lock); - stats_print(arena); - malloc_spin_unlock(&arena->lock); - } - } - } -#endif /* #ifdef MALLOC_STATS */ - _malloc_message("--- End malloc statistics ---\n", "", "", ""); - } -} - -#ifdef MALLOC_DEBUG -static void -size2bin_validate(void) -{ - size_t i, size, binind; - - assert(size2bin[0] == 0xffU); - i = 1; -# ifdef MALLOC_TINY - /* Tiny. 
*/ - for (; i < (1U << TINY_MIN_2POW); i++) { - size = pow2_ceil(1U << TINY_MIN_2POW); - binind = ffs((int)(size >> (TINY_MIN_2POW + 1))); - assert(size2bin[i] == binind); - } - for (; i < qspace_min; i++) { - size = pow2_ceil(i); - binind = ffs((int)(size >> (TINY_MIN_2POW + 1))); - assert(size2bin[i] == binind); - } -# endif - /* Quantum-spaced. */ - for (; i <= qspace_max; i++) { - size = QUANTUM_CEILING(i); - binind = ntbins + (size >> QUANTUM_2POW) - 1; - assert(size2bin[i] == binind); - } - /* Cacheline-spaced. */ - for (; i <= cspace_max; i++) { - size = CACHELINE_CEILING(i); - binind = ntbins + nqbins + ((size - cspace_min) >> - CACHELINE_2POW); - assert(size2bin[i] == binind); - } - /* Sub-page. */ - for (; i <= sspace_max; i++) { - size = SUBPAGE_CEILING(i); - binind = ntbins + nqbins + ncbins + ((size - sspace_min) - >> SUBPAGE_2POW); - assert(size2bin[i] == binind); - } -} -#endif - -static bool -size2bin_init(void) -{ - - if (opt_qspace_max_2pow != QSPACE_MAX_2POW_DEFAULT - || opt_cspace_max_2pow != CSPACE_MAX_2POW_DEFAULT) - return (size2bin_init_hard()); - - size2bin = const_size2bin; -#ifdef MALLOC_DEBUG - assert(sizeof(const_size2bin) == bin_maxclass + 1); - size2bin_validate(); -#endif - return (false); -} - -static bool -size2bin_init_hard(void) -{ - size_t i, size, binind; - uint8_t *custom_size2bin; - - assert(opt_qspace_max_2pow != QSPACE_MAX_2POW_DEFAULT - || opt_cspace_max_2pow != CSPACE_MAX_2POW_DEFAULT); - - custom_size2bin = (uint8_t *)base_alloc(bin_maxclass + 1); - if (custom_size2bin == NULL) - return (true); - - custom_size2bin[0] = 0xffU; - i = 1; -#ifdef MALLOC_TINY - /* Tiny. */ - for (; i < (1U << TINY_MIN_2POW); i++) { - size = pow2_ceil(1U << TINY_MIN_2POW); - binind = ffs((int)(size >> (TINY_MIN_2POW + 1))); - custom_size2bin[i] = binind; - } - for (; i < qspace_min; i++) { - size = pow2_ceil(i); - binind = ffs((int)(size >> (TINY_MIN_2POW + 1))); - custom_size2bin[i] = binind; - } -#endif - /* Quantum-spaced. 
*/ - for (; i <= qspace_max; i++) { - size = QUANTUM_CEILING(i); - binind = ntbins + (size >> QUANTUM_2POW) - 1; - custom_size2bin[i] = binind; - } - /* Cacheline-spaced. */ - for (; i <= cspace_max; i++) { - size = CACHELINE_CEILING(i); - binind = ntbins + nqbins + ((size - cspace_min) >> - CACHELINE_2POW); - custom_size2bin[i] = binind; - } - /* Sub-page. */ - for (; i <= sspace_max; i++) { - size = SUBPAGE_CEILING(i); - binind = ntbins + nqbins + ncbins + ((size - sspace_min) >> - SUBPAGE_2POW); - custom_size2bin[i] = binind; - } - - size2bin = custom_size2bin; -#ifdef MALLOC_DEBUG - size2bin_validate(); -#endif - return (false); -} - -static unsigned -malloc_ncpus(void) -{ - unsigned ret; - int fd, nread, column; - char buf[1]; - static const char matchstr[] = "processor\t:"; - - /* - * sysconf(3) would be the preferred method for determining the number - * of CPUs, but it uses malloc internally, which causes untennable - * recursion during malloc initialization. - */ - fd = open("/proc/cpuinfo", O_RDONLY); - if (fd == -1) - return (1); /* Error. */ - /* - * Count the number of occurrences of matchstr at the beginnings of - * lines. This treats hyperthreaded CPUs as multiple processors. - */ - column = 0; - ret = 0; - while (true) { - nread = read(fd, &buf, sizeof(buf)); - if (nread <= 0) - break; /* EOF or error. */ - - if (buf[0] == '\n') - column = 0; - else if (column != -1) { - if (buf[0] == matchstr[column]) { - column++; - if (column == sizeof(matchstr) - 1) { - column = -1; - ret++; - } - } else - column = -1; - } - } - if (ret == 0) - ret = 1; /* Something went wrong in the parser. */ - close(fd); - - return (ret); -} -/* - * FreeBSD's pthreads implementation calls malloc(3), so the malloc - * implementation has to take pains to avoid infinite recursion during - * initialization. 
- */ -static inline bool -malloc_init(void) -{ - - if (malloc_initialized == false) - return (malloc_init_hard()); - - return (false); -} - -static bool -malloc_init_hard(void) -{ - unsigned i; - int linklen; - char buf[PATH_MAX + 1]; - const char *opts; - - malloc_mutex_lock(&init_lock); - if (malloc_initialized) { - /* - * Another thread initialized the allocator before this one - * acquired init_lock. - */ - malloc_mutex_unlock(&init_lock); - return (false); - } - - /* Get number of CPUs. */ - ncpus = malloc_ncpus(); - - /* Get page size. */ - { - long result; - - result = sysconf(_SC_PAGESIZE); - assert(result != -1); - pagesize = (unsigned)result; - - /* - * We assume that pagesize is a power of 2 when calculating - * pagesize_mask and pagesize_2pow. - */ - assert(((result - 1) & result) == 0); - pagesize_mask = result - 1; - pagesize_2pow = ffs((int)result) - 1; - } - - for (i = 0; i < 3; i++) { - unsigned j; - - /* Get runtime configuration. */ - switch (i) { - case 0: - if ((linklen = readlink("/etc/malloc.conf", buf, - sizeof(buf) - 1)) != -1) { - /* - * Use the contents of the "/etc/malloc.conf" - * symbolic link's name. - */ - buf[linklen] = '\0'; - opts = buf; - } else { - /* No configuration specified. */ - buf[0] = '\0'; - opts = buf; - } - break; - case 1: - if (issetugid() == 0 && (opts = - getenv("MALLOC_OPTIONS")) != NULL) { - /* - * Do nothing; opts is already initialized to - * the value of the MALLOC_OPTIONS environment - * variable. - */ - } else { - /* No configuration specified. */ - buf[0] = '\0'; - opts = buf; - } - break; - case 2: - if (_malloc_options != NULL) { - /* - * Use options that were compiled into the - * program. - */ - opts = _malloc_options; - } else { - /* No configuration specified. */ - buf[0] = '\0'; - opts = buf; - } - break; - default: - /* NOTREACHED */ - assert(false); - } - - for (j = 0; opts[j] != '\0'; j++) { - unsigned k, nreps; - bool nseen; - - /* Parse repetition count, if any. 
*/ - for (nreps = 0, nseen = false;; j++, nseen = true) { - switch (opts[j]) { - case '0': case '1': case '2': case '3': - case '4': case '5': case '6': case '7': - case '8': case '9': - nreps *= 10; - nreps += opts[j] - '0'; - break; - default: - goto MALLOC_OUT; - } - } -MALLOC_OUT: - if (nseen == false) - nreps = 1; - - for (k = 0; k < nreps; k++) { - switch (opts[j]) { - case 'a': - opt_abort = false; - break; - case 'A': - opt_abort = true; - break; - case 'b': -#ifdef MALLOC_BALANCE - opt_balance_threshold >>= 1; -#endif - break; - case 'B': -#ifdef MALLOC_BALANCE - if (opt_balance_threshold == 0) - opt_balance_threshold = 1; - else if ((opt_balance_threshold << 1) - > opt_balance_threshold) - opt_balance_threshold <<= 1; -#endif - break; - case 'c': - if (opt_cspace_max_2pow - 1 > - opt_qspace_max_2pow && - opt_cspace_max_2pow > - CACHELINE_2POW) - opt_cspace_max_2pow--; - break; - case 'C': - if (opt_cspace_max_2pow < pagesize_2pow - - 1) - opt_cspace_max_2pow++; - break; - case 'd': -#ifdef MALLOC_DSS - opt_dss = false; -#endif - break; - case 'D': -#ifdef MALLOC_DSS - opt_dss = true; -#endif - break; - case 'f': - opt_dirty_max >>= 1; - break; - case 'F': - if (opt_dirty_max == 0) - opt_dirty_max = 1; - else if ((opt_dirty_max << 1) != 0) - opt_dirty_max <<= 1; - break; -#ifdef MALLOC_MAG - case 'g': - opt_mag = false; - break; - case 'G': - opt_mag = true; - break; -#endif - case 'j': - opt_junk = false; - break; - case 'J': - opt_junk = true; - break; - case 'k': - /* - * Chunks always require at least one - * header page, so chunks can never be - * smaller than two pages. 
- */ - if (opt_chunk_2pow > pagesize_2pow + 1) - opt_chunk_2pow--; - break; - case 'K': - if (opt_chunk_2pow + 1 < - (sizeof(size_t) << 3)) - opt_chunk_2pow++; - break; - case 'm': -#ifdef MALLOC_DSS - opt_mmap = false; -#endif - break; - case 'M': -#ifdef MALLOC_DSS - opt_mmap = true; -#endif - break; - case 'n': - opt_narenas_lshift--; - break; - case 'N': - opt_narenas_lshift++; - break; - case 'p': - opt_print_stats = false; - break; - case 'P': - opt_print_stats = true; - break; - case 'q': - if (opt_qspace_max_2pow > QUANTUM_2POW) - opt_qspace_max_2pow--; - break; - case 'Q': - if (opt_qspace_max_2pow + 1 < - opt_cspace_max_2pow) - opt_qspace_max_2pow++; - break; -#ifdef MALLOC_MAG - case 'R': - if (opt_mag_size_2pow + 1 < (8U << - SIZEOF_PTR_2POW)) - opt_mag_size_2pow++; - break; - case 'r': - /* - * Make sure there's always at least - * one round per magazine. - */ - if ((1U << (opt_mag_size_2pow-1)) >= - sizeof(mag_t)) - opt_mag_size_2pow--; - break; -#endif - case 'u': - opt_utrace = false; - break; - case 'U': - opt_utrace = true; - break; - case 'v': - opt_sysv = false; - break; - case 'V': - opt_sysv = true; - break; - case 'x': - opt_xmalloc = false; - break; - case 'X': - opt_xmalloc = true; - break; - case 'z': - opt_zero = false; - break; - case 'Z': - opt_zero = true; - break; - default: { - char cbuf[2]; - - cbuf[0] = opts[j]; - cbuf[1] = '\0'; - _malloc_message(_getprogname(), - ": (malloc) Unsupported character " - "in malloc options: '", cbuf, - "'\n"); - } - } - } - } - } - -#ifdef MALLOC_DSS - /* Make sure that there is some method for acquiring memory. */ - if (opt_dss == false && opt_mmap == false) - opt_mmap = true; -#endif - - /* Take care to call atexit() only once. */ - if (opt_print_stats) { - /* Print statistics at exit. */ - atexit(malloc_print_stats); - } - - /* Register fork handlers. 
*/ - pthread_atfork(_malloc_prefork, _malloc_postfork, _malloc_postfork); - -#ifdef MALLOC_MAG - /* - * Calculate the actual number of rounds per magazine, taking into - * account header overhead. - */ - max_rounds = (1LLU << (opt_mag_size_2pow - SIZEOF_PTR_2POW)) - - (sizeof(mag_t) >> SIZEOF_PTR_2POW) + 1; -#endif - - /* Set variables according to the value of opt_[qc]space_max_2pow. */ - qspace_max = (1U << opt_qspace_max_2pow); - cspace_min = CACHELINE_CEILING(qspace_max); - if (cspace_min == qspace_max) - cspace_min += CACHELINE; - cspace_max = (1U << opt_cspace_max_2pow); - sspace_min = SUBPAGE_CEILING(cspace_max); - if (sspace_min == cspace_max) - sspace_min += SUBPAGE; - assert(sspace_min < pagesize); - sspace_max = pagesize - SUBPAGE; - -#ifdef MALLOC_TINY - assert(QUANTUM_2POW >= TINY_MIN_2POW); -#endif - assert(ntbins <= QUANTUM_2POW); - nqbins = qspace_max >> QUANTUM_2POW; - ncbins = ((cspace_max - cspace_min) >> CACHELINE_2POW) + 1; - nsbins = ((sspace_max - sspace_min) >> SUBPAGE_2POW) + 1; - nbins = ntbins + nqbins + ncbins + nsbins; - - if (size2bin_init()) { - malloc_mutex_unlock(&init_lock); - return (true); - } - - /* Set variables according to the value of opt_chunk_2pow. */ - chunksize = (1LU << opt_chunk_2pow); - chunksize_mask = chunksize - 1; - chunk_npages = (chunksize >> pagesize_2pow); - { - size_t header_size; - - /* - * Compute the header size such that it is large enough to - * contain the page map. - */ - header_size = sizeof(arena_chunk_t) + - (sizeof(arena_chunk_map_t) * (chunk_npages - 1)); - arena_chunk_header_npages = (header_size >> pagesize_2pow) + - ((header_size & pagesize_mask) != 0); - } - arena_maxclass = chunksize - (arena_chunk_header_npages << - pagesize_2pow); - - UTRACE(0, 0, 0); - -#ifdef MALLOC_STATS - memset(&stats_chunks, 0, sizeof(chunk_stats_t)); -#endif - - /* Various sanity checks that regard configuration. */ - assert(chunksize >= pagesize); - - /* Initialize chunks data. 
*/ - if (malloc_mutex_init(&huge_mtx)) { - malloc_mutex_unlock(&init_lock); - return (true); - } - extent_tree_ad_new(&huge); -#ifdef MALLOC_DSS - if (malloc_mutex_init(&dss_mtx)) { - malloc_mutex_unlock(&init_lock); - return (true); - } - dss_base = sbrk(0); - dss_prev = dss_base; - dss_max = dss_base; - extent_tree_szad_new(&dss_chunks_szad); - extent_tree_ad_new(&dss_chunks_ad); -#endif -#ifdef MALLOC_STATS - huge_nmalloc = 0; - huge_ndalloc = 0; - huge_allocated = 0; -#endif - - /* Initialize base allocation data structures. */ -#ifdef MALLOC_STATS - base_mapped = 0; -#endif -#ifdef MALLOC_DSS - /* - * Allocate a base chunk here, since it doesn't actually have to be - * chunk-aligned. Doing this before allocating any other chunks allows - * the use of space that would otherwise be wasted. - */ - if (opt_dss) - base_pages_alloc(0); -#endif - base_nodes = NULL; - if (malloc_mutex_init(&base_mtx)) { - malloc_mutex_unlock(&init_lock); - return (true); - } - - if (ncpus > 1) { - /* - * For SMP systems, create twice as many arenas as there are - * CPUs by default. - */ - opt_narenas_lshift++; - } - - /* Determine how many arenas to use. */ - narenas = ncpus; - if (opt_narenas_lshift > 0) { - if ((narenas << opt_narenas_lshift) > narenas) - narenas <<= opt_narenas_lshift; - /* - * Make sure not to exceed the limits of what base_alloc() can - * handle. - */ - if (narenas * sizeof(arena_t *) > chunksize) - narenas = chunksize / sizeof(arena_t *); - } else if (opt_narenas_lshift < 0) { - if ((narenas >> -opt_narenas_lshift) < narenas) - narenas >>= -opt_narenas_lshift; - /* Make sure there is at least one arena. 
*/ - if (narenas == 0) - narenas = 1; - } -#ifdef MALLOC_BALANCE - assert(narenas != 0); - for (narenas_2pow = 0; - (narenas >> (narenas_2pow + 1)) != 0; - narenas_2pow++); -#endif - -#ifdef NO_TLS - if (narenas > 1) { - static const unsigned primes[] = {1, 3, 5, 7, 11, 13, 17, 19, - 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, - 89, 97, 101, 103, 107, 109, 113, 127, 131, 137, 139, 149, - 151, 157, 163, 167, 173, 179, 181, 191, 193, 197, 199, 211, - 223, 227, 229, 233, 239, 241, 251, 257, 263}; - unsigned nprimes, parenas; - - /* - * Pick a prime number of hash arenas that is more than narenas - * so that direct hashing of pthread_self() pointers tends to - * spread allocations evenly among the arenas. - */ - assert((narenas & 1) == 0); /* narenas must be even. */ - nprimes = (sizeof(primes) >> SIZEOF_INT_2POW); - parenas = primes[nprimes - 1]; /* In case not enough primes. */ - for (i = 1; i < nprimes; i++) { - if (primes[i] > narenas) { - parenas = primes[i]; - break; - } - } - narenas = parenas; - } -#endif - -#ifndef NO_TLS -# ifndef MALLOC_BALANCE - next_arena = 0; -# endif -#endif - - /* Allocate and initialize arenas. */ - arenas = (arena_t **)base_alloc(sizeof(arena_t *) * narenas); - if (arenas == NULL) { - malloc_mutex_unlock(&init_lock); - return (true); - } - /* - * Zero the array. In practice, this should always be pre-zeroed, - * since it was just mmap()ed, but let's be sure. - */ - memset(arenas, 0, sizeof(arena_t *) * narenas); - - /* - * Initialize one arena here. The rest are lazily created in - * choose_arena_hard(). - */ - arenas_extend(0); - if (arenas[0] == NULL) { - malloc_mutex_unlock(&init_lock); - return (true); - } -#ifndef NO_TLS - /* - * Assign the initial arena to the initial thread, in order to avoid - * spurious creation of an extra arena if the application switches to - * threaded mode. 
- */ - arenas_map = arenas[0]; -#endif - /* - * Seed here for the initial thread, since choose_arena_hard() is only - * called for other threads. The seed value doesn't really matter. - */ -#ifdef MALLOC_BALANCE - SPRN(balance, 42); -#endif - - malloc_spin_init(&arenas_lock); - - malloc_initialized = true; - malloc_mutex_unlock(&init_lock); - return (false); -} - -/* - * End general internal functions. - */ -/******************************************************************************/ -/* - * Begin malloc(3)-compatible functions. - */ - -void * -malloc(size_t size) -{ - void *ret; - - if (malloc_init()) { - ret = NULL; - goto RETURN; - } - - if (size == 0) { - if (opt_sysv == false) - size = 1; - else { - ret = NULL; - goto RETURN; - } - } - - ret = imalloc(size); - -RETURN: - if (ret == NULL) { - if (opt_xmalloc) { - _malloc_message(_getprogname(), - ": (malloc) Error in malloc(): out of memory\n", "", - ""); - abort(); - } - errno = ENOMEM; - } - - UTRACE(0, size, ret); - return (ret); -} - -int -posix_memalign(void **memptr, size_t alignment, size_t size) -{ - int ret; - void *result; - - if (malloc_init()) - result = NULL; - else { - /* Make sure that alignment is a large enough power of 2. 
*/ - if (((alignment - 1) & alignment) != 0 - || alignment < sizeof(void *)) { - if (opt_xmalloc) { - _malloc_message(_getprogname(), - ": (malloc) Error in posix_memalign(): " - "invalid alignment\n", "", ""); - abort(); - } - result = NULL; - ret = EINVAL; - goto RETURN; - } - - result = ipalloc(alignment, size); - } - - if (result == NULL) { - if (opt_xmalloc) { - _malloc_message(_getprogname(), - ": (malloc) Error in posix_memalign(): out of memory\n", - "", ""); - abort(); - } - ret = ENOMEM; - goto RETURN; - } - - *memptr = result; - ret = 0; - -RETURN: - UTRACE(0, size, result); - return (ret); -} - -void * -calloc(size_t num, size_t size) -{ - void *ret; - size_t num_size; - - if (malloc_init()) { - num_size = 0; - ret = NULL; - goto RETURN; - } - - num_size = num * size; - if (num_size == 0) { - if ((opt_sysv == false) && ((num == 0) || (size == 0))) - num_size = 1; - else { - ret = NULL; - goto RETURN; - } - /* - * Try to avoid division here. We know that it isn't possible to - * overflow during multiplication if neither operand uses any of the - * most significant half of the bits in a size_t. - */ - } else if (((num | size) & (SIZE_T_MAX << (sizeof(size_t) << 2))) - && (num_size / size != num)) { - /* size_t overflow. 
*/ - ret = NULL; - goto RETURN; - } - - ret = icalloc(num_size); - -RETURN: - if (ret == NULL) { - if (opt_xmalloc) { - _malloc_message(_getprogname(), - ": (malloc) Error in calloc(): out of memory\n", "", - ""); - abort(); - } - errno = ENOMEM; - } - - UTRACE(0, num_size, ret); - return (ret); -} - -void * -realloc(void *ptr, size_t size) -{ - void *ret; - - if (size == 0) { - if (opt_sysv == false) - size = 1; - else { - if (ptr != NULL) - idalloc(ptr); - ret = NULL; - goto RETURN; - } - } - - if (ptr != NULL) { - assert(malloc_initialized); - - ret = iralloc(ptr, size); - - if (ret == NULL) { - if (opt_xmalloc) { - _malloc_message(_getprogname(), - ": (malloc) Error in realloc(): out of " - "memory\n", "", ""); - abort(); - } - errno = ENOMEM; - } - } else { - if (malloc_init()) - ret = NULL; - else - ret = imalloc(size); - - if (ret == NULL) { - if (opt_xmalloc) { - _malloc_message(_getprogname(), - ": (malloc) Error in realloc(): out of " - "memory\n", "", ""); - abort(); - } - errno = ENOMEM; - } - } - -RETURN: - UTRACE(ptr, size, ret); - return (ret); -} - -void -free(void *ptr) -{ - - UTRACE(ptr, 0, 0); - if (ptr != NULL) { - assert(malloc_initialized); - - idalloc(ptr); - } -} - -/* - * End malloc(3)-compatible functions. - */ -/******************************************************************************/ -/* - * Begin non-standard functions. - */ - -size_t -malloc_usable_size(const void *ptr) -{ - - assert(ptr != NULL); - - return (isalloc(ptr)); -} - -/* - * End non-standard functions. - */ -/******************************************************************************/ -/* - * Begin library-private functions. - */ - -/******************************************************************************/ -/* - * Begin thread cache. - */ - -/* - * We provide an unpublished interface in order to receive notifications from - * the pthreads library whenever a thread exits. This allows us to clean up - * thread caches. 
- */ -void -_malloc_thread_cleanup(void) -{ - -#ifdef MALLOC_MAG - if (mag_rack != NULL) { - assert(mag_rack != (void *)-1); - mag_rack_destroy(mag_rack); -#ifdef MALLOC_DEBUG - mag_rack = (void *)-1; -#endif - } -#endif -} - -/* - * The following functions are used by threading libraries for protection of - * malloc during fork(). These functions are only called if the program is - * running in threaded mode, so there is no need to check whether the program - * is threaded here. - */ - -void -_malloc_prefork(void) -{ - unsigned i; - - /* Acquire all mutexes in a safe order. */ - - malloc_spin_lock(&arenas_lock); - for (i = 0; i < narenas; i++) { - if (arenas[i] != NULL) - malloc_spin_lock(&arenas[i]->lock); - } - malloc_spin_unlock(&arenas_lock); - - malloc_mutex_lock(&base_mtx); - - malloc_mutex_lock(&huge_mtx); - -#ifdef MALLOC_DSS - malloc_mutex_lock(&dss_mtx); -#endif -} - -void -_malloc_postfork(void) -{ - unsigned i; - - /* Release all mutexes, now that fork() has completed. */ - -#ifdef MALLOC_DSS - malloc_mutex_unlock(&dss_mtx); -#endif - - malloc_mutex_unlock(&huge_mtx); - - malloc_mutex_unlock(&base_mtx); - - malloc_spin_lock(&arenas_lock); - for (i = 0; i < narenas; i++) { - if (arenas[i] != NULL) - malloc_spin_unlock(&arenas[i]->lock); - } - malloc_spin_unlock(&arenas_lock); -} - -/* - * End library-private functions. - */ -/******************************************************************************/ diff --git a/lib/libjemalloc/malloc.3 b/lib/libjemalloc/malloc.3 deleted file mode 100644 index 67a52fb..0000000 --- a/lib/libjemalloc/malloc.3 +++ /dev/null @@ -1,584 +0,0 @@ -.\" Copyright (c) 1980, 1991, 1993 -.\" The Regents of the University of California. All rights reserved. -.\" -.\" This code is derived from software contributed to Berkeley by -.\" the American National Standards Committee X3, on Information -.\" Processing Systems. 
-.\" -.\" Redistribution and use in source and binary forms, with or without -.\" modification, are permitted provided that the following conditions -.\" are met: -.\" 1. Redistributions of source code must retain the above copyright -.\" notice, this list of conditions and the following disclaimer. -.\" 2. Redistributions in binary form must reproduce the above copyright -.\" notice, this list of conditions and the following disclaimer in the -.\" documentation and/or other materials provided with the distribution. -.\" 3. Neither the name of the University nor the names of its contributors -.\" may be used to endorse or promote products derived from this software -.\" without specific prior written permission. -.\" -.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND -.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE -.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE -.\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE -.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL -.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS -.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) -.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT -.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY -.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF -.\" SUCH DAMAGE. 
-.\" -.\" @(#)malloc.3 8.1 (Berkeley) 6/4/93 -.\" $FreeBSD: head/lib/libc/stdlib/malloc.3 182225 2008-08-27 02:00:53Z jasone $ -.\" -.Dd August 26, 2008 -.Dt MALLOC 3 -.Os -.Sh NAME -.Nm malloc , calloc , realloc , free , reallocf , malloc_usable_size -.Nd general purpose memory allocation functions -.Sh LIBRARY -.Lb libc -.Sh SYNOPSIS -.In stdlib.h -.Ft void * -.Fn malloc "size_t size" -.Ft void * -.Fn calloc "size_t number" "size_t size" -.Ft void * -.Fn realloc "void *ptr" "size_t size" -.Ft void * -.Fn reallocf "void *ptr" "size_t size" -.Ft void -.Fn free "void *ptr" -.Ft const char * -.Va _malloc_options ; -.Ft void -.Fo \*(lp*_malloc_message\*(rp -.Fa "const char *p1" "const char *p2" "const char *p3" "const char *p4" -.Fc -.In malloc_np.h -.Ft size_t -.Fn malloc_usable_size "const void *ptr" -.Sh DESCRIPTION -The -.Fn malloc -function allocates -.Fa size -bytes of uninitialized memory. -The allocated space is suitably aligned (after possible pointer coercion) -for storage of any type of object. -.Pp -The -.Fn calloc -function allocates space for -.Fa number -objects, -each -.Fa size -bytes in length. -The result is identical to calling -.Fn malloc -with an argument of -.Dq "number * size" , -with the exception that the allocated memory is explicitly initialized -to zero bytes. -.Pp -The -.Fn realloc -function changes the size of the previously allocated memory referenced by -.Fa ptr -to -.Fa size -bytes. -The contents of the memory are unchanged up to the lesser of the new and -old sizes. -If the new size is larger, -the contents of the newly allocated portion of the memory are undefined. -Upon success, the memory referenced by -.Fa ptr -is freed and a pointer to the newly allocated memory is returned. -Note that -.Fn realloc -and -.Fn reallocf -may move the memory allocation, resulting in a different return value than -.Fa ptr . -If -.Fa ptr -is -.Dv NULL , -the -.Fn realloc -function behaves identically to -.Fn malloc -for the specified size. 
-.Pp -The -.Fn reallocf -function is identical to the -.Fn realloc -function, except that it -will free the passed pointer when the requested memory cannot be allocated. -This is a -.Fx -specific API designed to ease the problems with traditional coding styles -for realloc causing memory leaks in libraries. -.Pp -The -.Fn free -function causes the allocated memory referenced by -.Fa ptr -to be made available for future allocations. -If -.Fa ptr -is -.Dv NULL , -no action occurs. -.Pp -The -.Fn malloc_usable_size -function returns the usable size of the allocation pointed to by -.Fa ptr . -The return value may be larger than the size that was requested during -allocation. -The -.Fn malloc_usable_size -function is not a mechanism for in-place -.Fn realloc ; -rather it is provided solely as a tool for introspection purposes. -Any discrepancy between the requested allocation size and the size reported by -.Fn malloc_usable_size -should not be depended on, since such behavior is entirely -implementation-dependent. -.Sh TUNING -Once, when the first call is made to one of these memory allocation -routines, various flags will be set or reset, which affects the -workings of this allocator implementation. -.Pp -The -.Dq name -of the file referenced by the symbolic link named -.Pa /etc/malloc.conf , -the value of the environment variable -.Ev MALLOC_OPTIONS , -and the string pointed to by the global variable -.Va _malloc_options -will be interpreted, in that order, from left to right as flags. -.Pp -Each flag is a single letter, optionally prefixed by a non-negative base 10 -integer repetition count. -For example, -.Dq 3N -is equivalent to -.Dq NNN . -Some flags control parameter magnitudes, where uppercase increases the -magnitude, and lowercase decreases the magnitude. -Other flags control boolean parameters, where uppercase indicates that a -behavior is set, or on, and lowercase means that a behavior is not set, or off. 
-.Bl -tag -width indent -.It A -All warnings (except for the warning about unknown -flags being set) become fatal. -The process will call -.Xr abort 3 -in these cases. -.It B -Double/halve the per-arena lock contention threshold at which a thread is -randomly re-assigned to an arena. -This dynamic load balancing tends to push threads away from highly contended -arenas, which avoids worst case contention scenarios in which threads -disproportionately utilize arenas. -However, due to the highly dynamic load that applications may place on the -allocator, it is impossible for the allocator to know in advance how sensitive -it should be to contention over arenas. -Therefore, some applications may benefit from increasing or decreasing this -threshold parameter. -This option is not available for some configurations (non-PIC). -.It C -Double/halve the size of the maximum size class that is a multiple of the -cacheline size (64). -Above this size, subpage spacing (256 bytes) is used for size classes. -The default value is 512 bytes. -.It D -Use -.Xr sbrk 2 -to acquire memory in the data storage segment (DSS). -This option is enabled by default. -See the -.Dq M -option for related information and interactions. -.It F -Double/halve the per-arena maximum number of dirty unused pages that are -allowed to accumulate before informing the kernel about at least half of those -pages via -.Xr madvise 2 . -This provides the kernel with sufficient information to recycle dirty pages if -physical memory becomes scarce and the pages remain unused. -The default is 512 pages per arena; -.Ev MALLOC_OPTIONS=10f -will prevent any dirty unused pages from accumulating. -.It G -When there are multiple threads, use thread-specific caching for objects that -are smaller than one page. -This option is enabled by default. -Thread-specific caching allows many allocations to be satisfied without -performing any thread synchronization, at the cost of increased memory use. 
-See the -.Dq R -option for related tuning information. -This option is not available for some configurations (non-PIC). -.It J -Each byte of new memory allocated by -.Fn malloc , -.Fn realloc -or -.Fn reallocf -will be initialized to 0xa5. -All memory returned by -.Fn free , -.Fn realloc -or -.Fn reallocf -will be initialized to 0x5a. -This is intended for debugging and will impact performance negatively. -.It K -Double/halve the virtual memory chunk size. -The default chunk size is 1 MB. -.It M -Use -.Xr mmap 2 -to acquire anonymously mapped memory. -This option is enabled by default. -If both the -.Dq D -and -.Dq M -options are enabled, the allocator prefers the DSS over anonymous mappings, -but allocation only fails if memory cannot be acquired via either method. -If neither option is enabled, then the -.Dq M -option is implicitly enabled in order to assure that there is a method for -acquiring memory. -.It N -Double/halve the number of arenas. -The default number of arenas is two times the number of CPUs, or one if there -is a single CPU. -.It P -Various statistics are printed at program exit via an -.Xr atexit 3 -function. -This has the potential to cause deadlock for a multi-threaded process that exits -while one or more threads are executing in the memory allocation functions. -Therefore, this option should only be used with care; it is primarily intended -as a performance tuning aid during application development. -.It Q -Double/halve the size of the maximum size class that is a multiple of the -quantum (8 or 16 bytes, depending on architecture). -Above this size, cacheline spacing is used for size classes. -The default value is 128 bytes. -.It R -Double/halve magazine size, which approximately doubles/halves the number of -rounds in each magazine. -Magazines are used by the thread-specific caching machinery to acquire and -release objects in bulk. -Increasing the magazine size decreases locking overhead, at the expense of -increased memory usage. 
-This option is not available for some configurations (non-PIC). -.It U -Generate -.Dq utrace -entries for -.Xr ktrace 1 , -for all operations. -Consult the source for details on this option. -.It V -Attempting to allocate zero bytes will return a -.Dv NULL -pointer instead of -a valid pointer. -(The default behavior is to make a minimal allocation and return a -pointer to it.) -This option is provided for System V compatibility. -This option is incompatible with the -.Dq X -option. -.It X -Rather than return failure for any allocation function, -display a diagnostic message on -.Dv stderr -and cause the program to drop -core (using -.Xr abort 3 ) . -This option should be set at compile time by including the following in -the source code: -.Bd -literal -offset indent -_malloc_options = "X"; -.Ed -.It Z -Each byte of new memory allocated by -.Fn malloc , -.Fn realloc -or -.Fn reallocf -will be initialized to 0. -Note that this initialization only happens once for each byte, so -.Fn realloc -and -.Fn reallocf -calls do not zero memory that was previously allocated. -This is intended for debugging and will impact performance negatively. -.El -.Pp -The -.Dq J -and -.Dq Z -options are intended for testing and debugging. -An application which changes its behavior when these options are used -is flawed. -.Sh IMPLEMENTATION NOTES -Traditionally, allocators have used -.Xr sbrk 2 -to obtain memory, which is suboptimal for several reasons, including race -conditions, increased fragmentation, and artificial limitations on maximum -usable memory. -This allocator uses both -.Xr sbrk 2 -and -.Xr mmap 2 -by default, but it can be configured at run time to use only one or the other. -If resource limits are not a primary concern, the preferred configuration is -.Ev MALLOC_OPTIONS=dM -or -.Ev MALLOC_OPTIONS=DM . -When so configured, the -.Ar datasize -resource limit has little practical effect for typical applications; use -.Ev MALLOC_OPTIONS=Dm -if that is a concern. 
-Regardless of allocator configuration, the -.Ar vmemoryuse -resource limit can be used to bound the total virtual memory used by a -process, as described in -.Xr limits 1 . -.Pp -This allocator uses multiple arenas in order to reduce lock contention for -threaded programs on multi-processor systems. -This works well with regard to threading scalability, but incurs some costs. -There is a small fixed per-arena overhead, and additionally, arenas manage -memory completely independently of each other, which means a small fixed -increase in overall memory fragmentation. -These overheads are not generally an issue, given the number of arenas normally -used. -Note that using substantially more arenas than the default is not likely to -improve performance, mainly due to reduced cache performance. -However, it may make sense to reduce the number of arenas if an application -does not make much use of the allocation functions. -.Pp -In addition to multiple arenas, this allocator supports thread-specific -caching for small objects (smaller than one page), in order to make it -possible to completely avoid synchronization for most small allocation requests. -Such caching allows very fast allocation in the common case, but it increases -memory usage and fragmentation, since a bounded number of objects can remain -allocated in each thread cache. -.Pp -Memory is conceptually broken into equal-sized chunks, where the chunk size is -a power of two that is greater than the page size. -Chunks are always aligned to multiples of the chunk size. -This alignment makes it possible to find metadata for user objects very -quickly. -.Pp -User objects are broken into three categories according to size: small, large, -and huge. -Small objects are smaller than one page. -Large objects are smaller than the chunk size. -Huge objects are a multiple of the chunk size. 
-Small and large objects are managed by arenas; huge objects are managed -separately in a single data structure that is shared by all threads. -Huge objects are used by applications infrequently enough that this single -data structure is not a scalability issue. -.Pp -Each chunk that is managed by an arena tracks its contents as runs of -contiguous pages (unused, backing a set of small objects, or backing one large -object). -The combination of chunk alignment and chunk page maps makes it possible to -determine all metadata regarding small and large allocations in constant time. -.Pp -Small objects are managed in groups by page runs. -Each run maintains a bitmap that tracks which regions are in use. -Allocation requests that are no more than half the quantum (8 or 16, depending -on architecture) are rounded up to the nearest power of two. -Allocation requests that are more than half the quantum, but no more than the -minimum cacheline-multiple size class (see the -.Dq Q -option) are rounded up to the nearest multiple of the quantum. -Allocation requests that are more than the minumum cacheline-multiple size -class, but no more than the minimum subpage-multiple size class (see the -.Dq C -option) are rounded up to the nearest multiple of the cacheline size (64). -Allocation requests that are more than the minimum subpage-multiple size class -are rounded up to the nearest multiple of the subpage size (256). -Allocation requests that are more than one page, but small enough to fit in -an arena-managed chunk (see the -.Dq K -option), are rounded up to the nearest run size. -Allocation requests that are too large to fit in an arena-managed chunk are -rounded up to the nearest multiple of the chunk size. -.Pp -Allocations are packed tightly together, which can be an issue for -multi-threaded applications. -If you need to assure that allocations do not suffer from cacheline sharing, -round your allocation requests up to the nearest multiple of the cacheline -size. 
-.Sh DEBUGGING MALLOC PROBLEMS -The first thing to do is to set the -.Dq A -option. -This option forces a coredump (if possible) at the first sign of trouble, -rather than the normal policy of trying to continue if at all possible. -.Pp -It is probably also a good idea to recompile the program with suitable -options and symbols for debugger support. -.Pp -If the program starts to give unusual results, coredump or generally behave -differently without emitting any of the messages mentioned in the next -section, it is likely because it depends on the storage being filled with -zero bytes. -Try running it with the -.Dq Z -option set; -if that improves the situation, this diagnosis has been confirmed. -If the program still misbehaves, -the likely problem is accessing memory outside the allocated area. -.Pp -Alternatively, if the symptoms are not easy to reproduce, setting the -.Dq J -option may help provoke the problem. -.Pp -In truly difficult cases, the -.Dq U -option, if supported by the kernel, can provide a detailed trace of -all calls made to these functions. -.Pp -Unfortunately this implementation does not provide much detail about -the problems it detects; the performance impact for storing such information -would be prohibitive. -There are a number of allocator implementations available on the Internet -which focus on detecting and pinpointing problems by trading performance for -extra sanity checks and detailed diagnostics. -.Sh DIAGNOSTIC MESSAGES -If any of the memory allocation/deallocation functions detect an error or -warning condition, a message will be printed to file descriptor -.Dv STDERR_FILENO . -Errors will result in the process dumping core. -If the -.Dq A -option is set, all warnings are treated as errors. -.Pp -The -.Va _malloc_message -variable allows the programmer to override the function which emits -the text strings forming the errors and warnings if for some reason -the -.Dv stderr -file descriptor is not suitable for this. 
-Please note that doing anything which tries to allocate memory in -this function is likely to result in a crash or deadlock. -.Pp -All messages are prefixed by -.Dq Ao Ar progname Ac Ns Li : (malloc) . -.Sh RETURN VALUES -The -.Fn malloc -and -.Fn calloc -functions return a pointer to the allocated memory if successful; otherwise -a -.Dv NULL -pointer is returned and -.Va errno -is set to -.Er ENOMEM . -.Pp -The -.Fn realloc -and -.Fn reallocf -functions return a pointer, possibly identical to -.Fa ptr , -to the allocated memory -if successful; otherwise a -.Dv NULL -pointer is returned, and -.Va errno -is set to -.Er ENOMEM -if the error was the result of an allocation failure. -The -.Fn realloc -function always leaves the original buffer intact -when an error occurs, whereas -.Fn reallocf -deallocates it in this case. -.Pp -The -.Fn free -function returns no value. -.Pp -The -.Fn malloc_usable_size -function returns the usable size of the allocation pointed to by -.Fa ptr . -.Sh ENVIRONMENT -The following environment variables affect the execution of the allocation -functions: -.Bl -tag -width ".Ev MALLOC_OPTIONS" -.It Ev MALLOC_OPTIONS -If the environment variable -.Ev MALLOC_OPTIONS -is set, the characters it contains will be interpreted as flags to the -allocation functions. -.El -.Sh EXAMPLES -To dump core whenever a problem occurs: -.Pp -.Bd -literal -offset indent -ln -s 'A' /etc/malloc.conf -.Ed -.Pp -To specify in the source that a program does no return value checking -on calls to these functions: -.Bd -literal -offset indent -_malloc_options = "X"; -.Ed -.Sh SEE ALSO -.Xr limits 1 , -.Xr madvise 2 , -.Xr mmap 2 , -.Xr sbrk 2 , -.Xr alloca 3 , -.Xr atexit 3 , -.Xr getpagesize 3 , -.Xr memory 3 , -.Xr posix_memalign 3 -.Sh STANDARDS -The -.Fn malloc , -.Fn calloc , -.Fn realloc -and -.Fn free -functions conform to -.St -isoC . -.Sh HISTORY -The -.Fn reallocf -function first appeared in -.Fx 3.0 . 
-.Pp -The -.Fn malloc_usable_size -function first appeared in -.Fx 7.0 . diff --git a/lib/libjemalloc/malloc.c b/lib/libjemalloc/malloc.c deleted file mode 100644 index 56d8a98..0000000 --- a/lib/libjemalloc/malloc.c +++ /dev/null @@ -1,5594 +0,0 @@ -/*- - * Copyright (C) 2006-2008 Jason Evans . - * All rights reserved. - * - * Redistribution and use in source and binary forms, with or without - * modification, are permitted provided that the following conditions - * are met: - * 1. Redistributions of source code must retain the above copyright - * notice(s), this list of conditions and the following disclaimer as - * the first lines of this file unmodified other than the possible - * addition of one or more copyright notices. - * 2. Redistributions in binary form must reproduce the above copyright - * notice(s), this list of conditions and the following disclaimer in - * the documentation and/or other materials provided with the - * distribution. - * - * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDER(S) ``AS IS'' AND ANY - * EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE - * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR - * PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER(S) BE - * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR - * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF - * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR - * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, - * WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE - * OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, - * EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. - * - ******************************************************************************* - * - * This allocator implementation is designed to provide scalable performance - * for multi-threaded programs on multi-processor systems. 
The following - * features are included for this purpose: - * - * + Multiple arenas are used if there are multiple CPUs, which reduces lock - * contention and cache sloshing. - * - * + Thread-specific caching is used if there are multiple threads, which - * reduces the amount of locking. - * - * + Cache line sharing between arenas is avoided for internal data - * structures. - * - * + Memory is managed in chunks and runs (chunks can be split into runs), - * rather than as individual pages. This provides a constant-time - * mechanism for associating allocations with particular arenas. - * - * Allocation requests are rounded up to the nearest size class, and no record - * of the original request size is maintained. Allocations are broken into - * categories according to size class. Assuming runtime defaults, 4 kB pages - * and a 16 byte quantum on a 32-bit system, the size classes in each category - * are as follows: - * - * |=======================================| - * | Category | Subcategory | Size | - * |=======================================| - * | Small | Tiny | 2 | - * | | | 4 | - * | | | 8 | - * | |------------------+---------| - * | | Quantum-spaced | 16 | - * | | | 32 | - * | | | 48 | - * | | | ... | - * | | | 96 | - * | | | 112 | - * | | | 128 | - * | |------------------+---------| - * | | Cacheline-spaced | 192 | - * | | | 256 | - * | | | 320 | - * | | | 384 | - * | | | 448 | - * | | | 512 | - * | |------------------+---------| - * | | Sub-page | 760 | - * | | | 1024 | - * | | | 1280 | - * | | | ... | - * | | | 3328 | - * | | | 3584 | - * | | | 3840 | - * |=======================================| - * | Large | 4 kB | - * | | 8 kB | - * | | 12 kB | - * | | ... | - * | | 1012 kB | - * | | 1016 kB | - * | | 1020 kB | - * |=======================================| - * | Huge | 1 MB | - * | | 2 MB | - * | | 3 MB | - * | | ... 
| - * |=======================================| - * - * A different mechanism is used for each category: - * - * Small : Each size class is segregated into its own set of runs. Each run - * maintains a bitmap of which regions are free/allocated. - * - * Large : Each allocation is backed by a dedicated run. Metadata are stored - * in the associated arena chunk header maps. - * - * Huge : Each allocation is backed by a dedicated contiguous set of chunks. - * Metadata are stored in a separate red-black tree. - * - ******************************************************************************* - */ - -/* - * MALLOC_PRODUCTION disables assertions and statistics gathering. It also - * defaults the A and J runtime options to off. These settings are appropriate - * for production systems. - */ -/* #define MALLOC_PRODUCTION */ - -#ifndef MALLOC_PRODUCTION - /* - * MALLOC_DEBUG enables assertions and other sanity checks, and disables - * inline functions. - */ -# define MALLOC_DEBUG - - /* MALLOC_STATS enables statistics calculation. */ -# define MALLOC_STATS -#endif - -/* - * MALLOC_TINY enables support for tiny objects, which are smaller than one - * quantum. - */ -#define MALLOC_TINY - -/* - * MALLOC_MAG enables a magazine-based thread-specific caching layer for small - * objects. This makes it possible to allocate/deallocate objects without any - * locking when the cache is in the steady state. - */ -#define MALLOC_MAG - -/* - * MALLOC_BALANCE enables monitoring of arena lock contention and dynamically - * re-balances arena load if exponentially averaged contention exceeds a - * certain threshold. - */ -#define MALLOC_BALANCE - -/* - * MALLOC_DSS enables use of sbrk(2) to allocate chunks from the data storage - * segment (DSS). In an ideal world, this functionality would be completely - * unnecessary, but we are burdened by history and the lack of resource limits - * for anonymous mapped memory. 
- */
-#define MALLOC_DSS
-
-#include
-__FBSDID("$FreeBSD: head/lib/libc/stdlib/malloc.c 182225 2008-08-27 02:00:53Z jasone $");
-
-#include "libc_private.h"
-#ifdef MALLOC_DEBUG
-# define _LOCK_DEBUG
-#endif
-#include "spinlock.h"
-#include "namespace.h"
-#include
-#include
-#include
-#include
-#include
-#include
-#include
-#include /* Must come after several other sys/ includes. */
-
-#include
-#include
-
-#include
-#include
-#include
-#include
-#include
-#include
-#include
-#include
-#include
-#include
-#include
-
-#include "un-namespace.h"
-
-#ifdef MALLOC_DEBUG
-# ifdef NDEBUG
-# undef NDEBUG
-# endif
-#else
-# ifndef NDEBUG
-# define NDEBUG
-# endif
-#endif
-#include
-
-#include "rb.h"
-
-#ifdef MALLOC_DEBUG
- /* Disable inlining to make debugging easier. */
-# define inline
-#endif
-
-/* Size of stack-allocated buffer passed to strerror_r(). */
-#define STRERROR_BUF 64
-
-/*
- * The const_size2bin table is sized according to PAGESIZE_2POW, but for
- * correctness reasons, we never assume that
- * (pagesize == (1U << PAGESIZE_2POW)).
- *
- * Minimum alignment of allocations is 2^QUANTUM_2POW bytes.
- */
-#ifdef __i386__
-# define PAGESIZE_2POW 12
-# define QUANTUM_2POW 4
-# define SIZEOF_PTR_2POW 2
-# define CPU_SPINWAIT __asm__ volatile("pause")
-#endif
-#ifdef __ia64__
-# define PAGESIZE_2POW 12
-# define QUANTUM_2POW 4
-# define SIZEOF_PTR_2POW 3
-#endif
-#ifdef __alpha__
-# define PAGESIZE_2POW 13
-# define QUANTUM_2POW 4
-# define SIZEOF_PTR_2POW 3
-# define NO_TLS
-#endif
-#ifdef __sparc64__
-# define PAGESIZE_2POW 13
-# define QUANTUM_2POW 4
-# define SIZEOF_PTR_2POW 3
-# define NO_TLS
-#endif
-#ifdef __amd64__
-# define PAGESIZE_2POW 12
-# define QUANTUM_2POW 4
-# define SIZEOF_PTR_2POW 3
-# define CPU_SPINWAIT __asm__ volatile("pause")
-#endif
-#ifdef __arm__
-# define PAGESIZE_2POW 12
-# define QUANTUM_2POW 3
-# define SIZEOF_PTR_2POW 2
-# define NO_TLS
-#endif
-#ifdef __mips__
-# define PAGESIZE_2POW 12
-# define QUANTUM_2POW 3
-# define SIZEOF_PTR_2POW 2
-# define NO_TLS
-#endif
-#ifdef __powerpc__
-# define PAGESIZE_2POW 12
-# define QUANTUM_2POW 4
-# define SIZEOF_PTR_2POW 2
-#endif
-#ifdef __sh__
-# define PAGESIZE_2POW 12
-# define QUANTUM_2POW 3
-# define SIZEOF_PTR_2POW 2
-# define NO_TLS
-#endif
-
-#define QUANTUM ((size_t)(1U << QUANTUM_2POW))
-#define QUANTUM_MASK (QUANTUM - 1)
-
-#define SIZEOF_PTR (1U << SIZEOF_PTR_2POW)
-
-/* sizeof(int) == (1U << SIZEOF_INT_2POW). */
-#ifndef SIZEOF_INT_2POW
-# define SIZEOF_INT_2POW 2
-#endif
-
-/* We can't use TLS in non-PIC programs, since TLS relies on loader magic. */
-#if (!defined(PIC) && !defined(NO_TLS))
-# define NO_TLS
-#endif
-
-#ifdef NO_TLS
- /* MALLOC_MAG requires TLS. */
-# ifdef MALLOC_MAG
-# undef MALLOC_MAG
-# endif
- /* MALLOC_BALANCE requires TLS. */
-# ifdef MALLOC_BALANCE
-# undef MALLOC_BALANCE
-# endif
-#endif
-
-/*
- * Size and alignment of memory chunks that are allocated by the OS's virtual
- * memory system.
- */
-#define CHUNK_2POW_DEFAULT 20
-
-/* Maximum number of dirty pages per arena. */
-#define DIRTY_MAX_DEFAULT (1U << 9)
-
-/*
- * Maximum size of L1 cache line.
This is used to avoid cache line aliasing. - * In addition, this controls the spacing of cacheline-spaced size classes. - */ -#define CACHELINE_2POW 6 -#define CACHELINE ((size_t)(1U << CACHELINE_2POW)) -#define CACHELINE_MASK (CACHELINE - 1) - -/* - * Subpages are an artificially designated partitioning of pages. Their only - * purpose is to support subpage-spaced size classes. - * - * There must be at least 4 subpages per page, due to the way size classes are - * handled. - */ -#define SUBPAGE_2POW 8 -#define SUBPAGE ((size_t)(1U << SUBPAGE_2POW)) -#define SUBPAGE_MASK (SUBPAGE - 1) - -#ifdef MALLOC_TINY - /* Smallest size class to support. */ -# define TINY_MIN_2POW 1 -#endif - -/* - * Maximum size class that is a multiple of the quantum, but not (necessarily) - * a power of 2. Above this size, allocations are rounded up to the nearest - * power of 2. - */ -#define QSPACE_MAX_2POW_DEFAULT 7 - -/* - * Maximum size class that is a multiple of the cacheline, but not (necessarily) - * a power of 2. Above this size, allocations are rounded up to the nearest - * power of 2. - */ -#define CSPACE_MAX_2POW_DEFAULT 9 - -/* - * RUN_MAX_OVRHD indicates maximum desired run header overhead. Runs are sized - * as small as possible such that this setting is still honored, without - * violating other constraints. The goal is to make runs as small as possible - * without exceeding a per run external fragmentation threshold. - * - * We use binary fixed point math for overhead computations, where the binary - * point is implicitly RUN_BFP bits to the left. - * - * Note that it is possible to set RUN_MAX_OVRHD low enough that it cannot be - * honored for some/all object sizes, since there is one bit of header overhead - * per object (plus a constant). This constraint is relaxed (ignored) for runs - * that are so small that the per-region overhead is greater than: - * - * (RUN_MAX_OVRHD / (reg_size << (3+RUN_BFP)) - */ -#define RUN_BFP 12 -/* \/ Implicit binary fixed point. 
*/ -#define RUN_MAX_OVRHD 0x0000003dU -#define RUN_MAX_OVRHD_RELAX 0x00001800U - -/* Put a cap on small object run size. This overrides RUN_MAX_OVRHD. */ -#define RUN_MAX_SMALL (12 * pagesize) - -/* - * Hyper-threaded CPUs may need a special instruction inside spin loops in - * order to yield to another virtual CPU. If no such instruction is defined - * above, make CPU_SPINWAIT a no-op. - */ -#ifndef CPU_SPINWAIT -# define CPU_SPINWAIT -#endif - -/* - * Adaptive spinning must eventually switch to blocking, in order to avoid the - * potential for priority inversion deadlock. Backing off past a certain point - * can actually waste time. - */ -#define SPIN_LIMIT_2POW 11 - -/* - * Conversion from spinning to blocking is expensive; we use (1U << - * BLOCK_COST_2POW) to estimate how many more times costly blocking is than - * worst-case spinning. - */ -#define BLOCK_COST_2POW 4 - -#ifdef MALLOC_MAG - /* - * Default magazine size, in bytes. max_rounds is calculated to make - * optimal use of the space, leaving just enough room for the magazine - * header. - */ -# define MAG_SIZE_2POW_DEFAULT 9 -#endif - -#ifdef MALLOC_BALANCE - /* - * We use an exponential moving average to track recent lock contention, - * where the size of the history window is N, and alpha=2/(N+1). - * - * Due to integer math rounding, very small values here can cause - * substantial degradation in accuracy, thus making the moving average decay - * faster than it would with precise calculation. - */ -# define BALANCE_ALPHA_INV_2POW 9 - - /* - * Threshold value for the exponential moving contention average at which to - * re-assign a thread. - */ -# define BALANCE_THRESHOLD_DEFAULT (1U << (SPIN_LIMIT_2POW-4)) -#endif - -/******************************************************************************/ - -/* - * Mutexes based on spinlocks. We can't use normal pthread spinlocks in all - * places, because they require malloc()ed memory, which causes bootstrapping - * issues in some cases. 
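The RUN_BFP fixed-point overhead check described above amounts to comparing the header-to-run-size ratio, scaled by 2^RUN_BFP, against RUN_MAX_OVRHD. An editorial sketch, not part of the patch; `run_overhead_ok` is a hypothetical name and the check direction is my reading of the comment:

```c
#include <assert.h>
#include <stdint.h>

#define RUN_BFP       12            /* binary point 12 bits from the right */
#define RUN_MAX_OVRHD 0x0000003dU   /* 61/4096, i.e. ~1.5% overhead */

/* Check whether hdr_size bytes of run header overhead stay under the
 * RUN_MAX_OVRHD external-fragmentation threshold for a run_size-byte run. */
static int run_overhead_ok(uint32_t hdr_size, uint32_t run_size) {
	/* Overhead fraction in fixed point: (hdr_size / run_size) << RUN_BFP. */
	uint32_t ovrhd = (uint32_t)(((uint64_t)hdr_size << RUN_BFP) / run_size);
	return ovrhd <= RUN_MAX_OVRHD;
}
```

For a one-page (4096-byte) run, a 48-byte header passes (48/4096 ≈ 1.2%) while a 64-byte header does not, which is why run sizes grow until the header amortizes below the threshold.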
- */ -typedef struct { - spinlock_t lock; -} malloc_mutex_t; - -/* Set to true once the allocator has been initialized. */ -static bool malloc_initialized = false; - -/* Used to avoid initialization races. */ -static malloc_mutex_t init_lock = {_SPINLOCK_INITIALIZER}; - -/******************************************************************************/ -/* - * Statistics data structures. - */ - -#ifdef MALLOC_STATS - -typedef struct malloc_bin_stats_s malloc_bin_stats_t; -struct malloc_bin_stats_s { - /* - * Number of allocation requests that corresponded to the size of this - * bin. - */ - uint64_t nrequests; - -#ifdef MALLOC_MAG - /* Number of magazine reloads from this bin. */ - uint64_t nmags; -#endif - - /* Total number of runs created for this bin's size class. */ - uint64_t nruns; - - /* - * Total number of runs reused by extracting them from the runs tree for - * this bin's size class. - */ - uint64_t reruns; - - /* High-water mark for this bin. */ - unsigned long highruns; - - /* Current number of runs in this bin. */ - unsigned long curruns; -}; - -typedef struct arena_stats_s arena_stats_t; -struct arena_stats_s { - /* Number of bytes currently mapped. */ - size_t mapped; - - /* - * Total number of purge sweeps, total number of madvise calls made, - * and total pages purged in order to keep dirty unused memory under - * control. - */ - uint64_t npurge; - uint64_t nmadvise; - uint64_t purged; - - /* Per-size-category statistics. */ - size_t allocated_small; - uint64_t nmalloc_small; - uint64_t ndalloc_small; - - size_t allocated_large; - uint64_t nmalloc_large; - uint64_t ndalloc_large; - -#ifdef MALLOC_BALANCE - /* Number of times this arena reassigned a thread due to contention. */ - uint64_t nbalance; -#endif -}; - -typedef struct chunk_stats_s chunk_stats_t; -struct chunk_stats_s { - /* Number of chunks that were allocated. */ - uint64_t nchunks; - - /* High-water mark for number of chunks allocated. 
*/ - unsigned long highchunks; - - /* - * Current number of chunks allocated. This value isn't maintained for - * any other purpose, so keep track of it in order to be able to set - * highchunks. - */ - unsigned long curchunks; -}; - -#endif /* #ifdef MALLOC_STATS */ - -/******************************************************************************/ -/* - * Extent data structures. - */ - -/* Tree of extents. */ -typedef struct extent_node_s extent_node_t; -struct extent_node_s { -#ifdef MALLOC_DSS - /* Linkage for the size/address-ordered tree. */ - rb_node(extent_node_t) link_szad; -#endif - - /* Linkage for the address-ordered tree. */ - rb_node(extent_node_t) link_ad; - - /* Pointer to the extent that this tree node is responsible for. */ - void *addr; - - /* Total region size. */ - size_t size; -}; -typedef rb_tree(extent_node_t) extent_tree_t; - -/******************************************************************************/ -/* - * Arena data structures. - */ - -typedef struct arena_s arena_t; -typedef struct arena_bin_s arena_bin_t; - -/* Each element of the chunk map corresponds to one page within the chunk. */ -typedef struct arena_chunk_map_s arena_chunk_map_t; -struct arena_chunk_map_s { - /* - * Linkage for run trees. There are two disjoint uses: - * - * 1) arena_t's runs_avail tree. - * 2) arena_run_t conceptually uses this linkage for in-use non-full - * runs, rather than directly embedding linkage. - */ - rb_node(arena_chunk_map_t) link; - - /* - * Run address (or size) and various flags are stored together. The bit - * layout looks like (assuming 32-bit system): - * - * ???????? ???????? ????---- ---kdzla - * - * ? : Unallocated: Run address for first/last pages, unset for internal - * pages. - * Small: Run address. - * Large: Run size for first page, unset for trailing pages. - * - : Unused. - * k : key? - * d : dirty? - * z : zeroed? - * l : large? - * a : allocated? - * - * Following are example bit patterns for the three types of runs. 
- * - * r : run address - * s : run size - * x : don't care - * - : 0 - * [dzla] : bit set - * - * Unallocated: - * ssssssss ssssssss ssss---- -------- - * xxxxxxxx xxxxxxxx xxxx---- ----d--- - * ssssssss ssssssss ssss---- -----z-- - * - * Small: - * rrrrrrrr rrrrrrrr rrrr---- -------a - * rrrrrrrr rrrrrrrr rrrr---- -------a - * rrrrrrrr rrrrrrrr rrrr---- -------a - * - * Large: - * ssssssss ssssssss ssss---- ------la - * -------- -------- -------- ------la - * -------- -------- -------- ------la - */ - size_t bits; -#define CHUNK_MAP_KEY ((size_t)0x10U) -#define CHUNK_MAP_DIRTY ((size_t)0x08U) -#define CHUNK_MAP_ZEROED ((size_t)0x04U) -#define CHUNK_MAP_LARGE ((size_t)0x02U) -#define CHUNK_MAP_ALLOCATED ((size_t)0x01U) -}; -typedef rb_tree(arena_chunk_map_t) arena_avail_tree_t; -typedef rb_tree(arena_chunk_map_t) arena_run_tree_t; - -/* Arena chunk header. */ -typedef struct arena_chunk_s arena_chunk_t; -struct arena_chunk_s { - /* Arena that owns the chunk. */ - arena_t *arena; - - /* Linkage for the arena's chunks_dirty tree. */ - rb_node(arena_chunk_t) link_dirty; - - /* Number of dirty pages. */ - size_t ndirty; - - /* Map of pages within chunk that keeps track of free/large/small. */ - arena_chunk_map_t map[1]; /* Dynamically sized. */ -}; -typedef rb_tree(arena_chunk_t) arena_chunk_tree_t; - -typedef struct arena_run_s arena_run_t; -struct arena_run_s { -#ifdef MALLOC_DEBUG - uint32_t magic; -# define ARENA_RUN_MAGIC 0x384adf93 -#endif - - /* Bin this run is associated with. */ - arena_bin_t *bin; - - /* Index of first element that might have a free region. */ - unsigned regs_minelm; - - /* Number of free regions in run. */ - unsigned nfree; - - /* Bitmask of in-use regions (0: in use, 1: free). */ - unsigned regs_mask[1]; /* Dynamically sized. */ -}; - -struct arena_bin_s { - /* - * Current run being used to service allocations of this bin's size - * class. - */ - arena_run_t *runcur; - - /* - * Tree of non-full runs. 
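The `kdzla` flag encoding shown above can be decoded with plain masks. An editorial sketch, not part of the patch; the helper names and the assumption that only the low five bits carry flags are mine:

```c
#include <assert.h>
#include <stddef.h>

#define CHUNK_MAP_KEY       ((size_t)0x10U)
#define CHUNK_MAP_DIRTY     ((size_t)0x08U)
#define CHUNK_MAP_ZEROED    ((size_t)0x04U)
#define CHUNK_MAP_LARGE     ((size_t)0x02U)
#define CHUNK_MAP_ALLOCATED ((size_t)0x01U)

/* Assumed: flags occupy only the low 5 bits; the page-aligned high bits
 * hold the run address (small/large) or the run size (unallocated, and
 * the first page of a large run). */
#define CHUNK_MAP_FLAGS     ((size_t)0x1fU)

static size_t map_bits_payload(size_t bits) {
	return bits & ~CHUNK_MAP_FLAGS;   /* run address or run size */
}

/* A small-run page is marked allocated but not large. */
static int map_bits_is_small(size_t bits) {
	return (bits & CHUNK_MAP_ALLOCATED) != 0 &&
	    (bits & CHUNK_MAP_LARGE) == 0;
}
```

Packing the flags into bits that page alignment forces to zero is what lets one `size_t` per page serve as both tree key and metadata.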
This tree is used when looking for an - * existing run when runcur is no longer usable. We choose the - * non-full run that is lowest in memory; this policy tends to keep - * objects packed well, and it can also help reduce the number of - * almost-empty chunks. - */ - arena_run_tree_t runs; - - /* Size of regions in a run for this bin's size class. */ - size_t reg_size; - - /* Total size of a run for this bin's size class. */ - size_t run_size; - - /* Total number of regions in a run for this bin's size class. */ - uint32_t nregs; - - /* Number of elements in a run's regs_mask for this bin's size class. */ - uint32_t regs_mask_nelms; - - /* Offset of first region in a run for this bin's size class. */ - uint32_t reg0_offset; - -#ifdef MALLOC_STATS - /* Bin statistics. */ - malloc_bin_stats_t stats; -#endif -}; - -struct arena_s { -#ifdef MALLOC_DEBUG - uint32_t magic; -# define ARENA_MAGIC 0x947d3d24 -#endif - - /* All operations on this arena require that lock be locked. */ - pthread_mutex_t lock; - -#ifdef MALLOC_STATS - arena_stats_t stats; -#endif - - /* Tree of dirty-page-containing chunks this arena manages. */ - arena_chunk_tree_t chunks_dirty; - - /* - * In order to avoid rapid chunk allocation/deallocation when an arena - * oscillates right on the cusp of needing a new chunk, cache the most - * recently freed chunk. The spare is left in the arena's chunk trees - * until it is deleted. - * - * There is one spare chunk per arena, rather than one spare total, in - * order to avoid interactions between multiple threads that could make - * a single spare inadequate. - */ - arena_chunk_t *spare; - - /* - * Current count of pages within unused runs that are potentially - * dirty, and for which madvise(... MADV_FREE) has not been called. By - * tracking this, we can institute a limit on how much dirty unused - * memory is mapped for each arena. - */ - size_t ndirty; - - /* - * Size/address-ordered tree of this arena's available runs. 
This tree - * is used for first-best-fit run allocation. - */ - arena_avail_tree_t runs_avail; - -#ifdef MALLOC_BALANCE - /* - * The arena load balancing machinery needs to keep track of how much - * lock contention there is. This value is exponentially averaged. - */ - uint32_t contention; -#endif - - /* - * bins is used to store rings of free regions of the following sizes, - * assuming a 16-byte quantum, 4kB pagesize, and default MALLOC_OPTIONS. - * - * bins[i] | size | - * --------+------+ - * 0 | 2 | - * 1 | 4 | - * 2 | 8 | - * --------+------+ - * 3 | 16 | - * 4 | 32 | - * 5 | 48 | - * 6 | 64 | - * : : - * : : - * 33 | 496 | - * 34 | 512 | - * --------+------+ - * 35 | 1024 | - * 36 | 2048 | - * --------+------+ - */ - arena_bin_t bins[1]; /* Dynamically sized. */ -}; - -/******************************************************************************/ -/* - * Magazine data structures. - */ - -#ifdef MALLOC_MAG -typedef struct mag_s mag_t; -struct mag_s { - size_t binind; /* Index of associated bin. */ - size_t nrounds; - void *rounds[1]; /* Dynamically sized. */ -}; - -/* - * Magazines are lazily allocated, but once created, they remain until the - * associated mag_rack is destroyed. - */ -typedef struct bin_mags_s bin_mags_t; -struct bin_mags_s { - mag_t *curmag; - mag_t *sparemag; -}; - -typedef struct mag_rack_s mag_rack_t; -struct mag_rack_s { - bin_mags_t bin_mags[1]; /* Dynamically sized. */ -}; -#endif - -/******************************************************************************/ -/* - * Data. - */ - -/* Number of CPUs. */ -static unsigned ncpus; - -/* VM page size. */ -static size_t pagesize; -static size_t pagesize_mask; -static size_t pagesize_2pow; - -/* Various bin-related settings. */ -#ifdef MALLOC_TINY /* Number of (2^n)-spaced tiny bins. */ -# define ntbins ((unsigned)(QUANTUM_2POW - TINY_MIN_2POW)) -#else -# define ntbins 0 -#endif -static unsigned nqbins; /* Number of quantum-spaced bins. 
*/ -static unsigned ncbins; /* Number of cacheline-spaced bins. */ -static unsigned nsbins; /* Number of subpage-spaced bins. */ -static unsigned nbins; -#ifdef MALLOC_TINY -# define tspace_max ((size_t)(QUANTUM >> 1)) -#endif -#define qspace_min QUANTUM -static size_t qspace_max; -static size_t cspace_min; -static size_t cspace_max; -static size_t sspace_min; -static size_t sspace_max; -#define bin_maxclass sspace_max - -static uint8_t const *size2bin; -/* - * const_size2bin is a static constant lookup table that in the common case can - * be used as-is for size2bin. For dynamically linked programs, this avoids - * a page of memory overhead per process. - */ -#define S2B_1(i) i, -#define S2B_2(i) S2B_1(i) S2B_1(i) -#define S2B_4(i) S2B_2(i) S2B_2(i) -#define S2B_8(i) S2B_4(i) S2B_4(i) -#define S2B_16(i) S2B_8(i) S2B_8(i) -#define S2B_32(i) S2B_16(i) S2B_16(i) -#define S2B_64(i) S2B_32(i) S2B_32(i) -#define S2B_128(i) S2B_64(i) S2B_64(i) -#define S2B_256(i) S2B_128(i) S2B_128(i) -static const uint8_t const_size2bin[(1U << PAGESIZE_2POW) - 255] = { - S2B_1(0xffU) /* 0 */ -#if (QUANTUM_2POW == 4) -/* 64-bit system ************************/ -# ifdef MALLOC_TINY - S2B_2(0) /* 2 */ - S2B_2(1) /* 4 */ - S2B_4(2) /* 8 */ - S2B_8(3) /* 16 */ -# define S2B_QMIN 3 -# else - S2B_16(0) /* 16 */ -# define S2B_QMIN 0 -# endif - S2B_16(S2B_QMIN + 1) /* 32 */ - S2B_16(S2B_QMIN + 2) /* 48 */ - S2B_16(S2B_QMIN + 3) /* 64 */ - S2B_16(S2B_QMIN + 4) /* 80 */ - S2B_16(S2B_QMIN + 5) /* 96 */ - S2B_16(S2B_QMIN + 6) /* 112 */ - S2B_16(S2B_QMIN + 7) /* 128 */ -# define S2B_CMIN (S2B_QMIN + 8) -#else -/* 32-bit system ************************/ -# ifdef MALLOC_TINY - S2B_2(0) /* 2 */ - S2B_2(1) /* 4 */ - S2B_4(2) /* 8 */ -# define S2B_QMIN 2 -# else - S2B_8(0) /* 8 */ -# define S2B_QMIN 0 -# endif - S2B_8(S2B_QMIN + 1) /* 16 */ - S2B_8(S2B_QMIN + 2) /* 24 */ - S2B_8(S2B_QMIN + 3) /* 32 */ - S2B_8(S2B_QMIN + 4) /* 40 */ - S2B_8(S2B_QMIN + 5) /* 48 */ - S2B_8(S2B_QMIN + 6) /* 56 */ - 
S2B_8(S2B_QMIN + 7) /* 64 */ - S2B_8(S2B_QMIN + 8) /* 72 */ - S2B_8(S2B_QMIN + 9) /* 80 */ - S2B_8(S2B_QMIN + 10) /* 88 */ - S2B_8(S2B_QMIN + 11) /* 96 */ - S2B_8(S2B_QMIN + 12) /* 104 */ - S2B_8(S2B_QMIN + 13) /* 112 */ - S2B_8(S2B_QMIN + 14) /* 120 */ - S2B_8(S2B_QMIN + 15) /* 128 */ -# define S2B_CMIN (S2B_QMIN + 16) -#endif -/****************************************/ - S2B_64(S2B_CMIN + 0) /* 192 */ - S2B_64(S2B_CMIN + 1) /* 256 */ - S2B_64(S2B_CMIN + 2) /* 320 */ - S2B_64(S2B_CMIN + 3) /* 384 */ - S2B_64(S2B_CMIN + 4) /* 448 */ - S2B_64(S2B_CMIN + 5) /* 512 */ -# define S2B_SMIN (S2B_CMIN + 6) - S2B_256(S2B_SMIN + 0) /* 768 */ - S2B_256(S2B_SMIN + 1) /* 1024 */ - S2B_256(S2B_SMIN + 2) /* 1280 */ - S2B_256(S2B_SMIN + 3) /* 1536 */ - S2B_256(S2B_SMIN + 4) /* 1792 */ - S2B_256(S2B_SMIN + 5) /* 2048 */ - S2B_256(S2B_SMIN + 6) /* 2304 */ - S2B_256(S2B_SMIN + 7) /* 2560 */ - S2B_256(S2B_SMIN + 8) /* 2816 */ - S2B_256(S2B_SMIN + 9) /* 3072 */ - S2B_256(S2B_SMIN + 10) /* 3328 */ - S2B_256(S2B_SMIN + 11) /* 3584 */ - S2B_256(S2B_SMIN + 12) /* 3840 */ -#if (PAGESIZE_2POW == 13) - S2B_256(S2B_SMIN + 13) /* 4096 */ - S2B_256(S2B_SMIN + 14) /* 4352 */ - S2B_256(S2B_SMIN + 15) /* 4608 */ - S2B_256(S2B_SMIN + 16) /* 4864 */ - S2B_256(S2B_SMIN + 17) /* 5120 */ - S2B_256(S2B_SMIN + 18) /* 5376 */ - S2B_256(S2B_SMIN + 19) /* 5632 */ - S2B_256(S2B_SMIN + 20) /* 5888 */ - S2B_256(S2B_SMIN + 21) /* 6144 */ - S2B_256(S2B_SMIN + 22) /* 6400 */ - S2B_256(S2B_SMIN + 23) /* 6656 */ - S2B_256(S2B_SMIN + 24) /* 6912 */ - S2B_256(S2B_SMIN + 25) /* 7168 */ - S2B_256(S2B_SMIN + 26) /* 7424 */ - S2B_256(S2B_SMIN + 27) /* 7680 */ - S2B_256(S2B_SMIN + 28) /* 7936 */ -#endif -}; -#undef S2B_1 -#undef S2B_2 -#undef S2B_4 -#undef S2B_8 -#undef S2B_16 -#undef S2B_32 -#undef S2B_64 -#undef S2B_128 -#undef S2B_256 -#undef S2B_QMIN -#undef S2B_CMIN -#undef S2B_SMIN - -#ifdef MALLOC_MAG -static size_t max_rounds; -#endif - -/* Various chunk-related settings. 
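The S2B_*() repetition macros above build a direct size-to-bin lookup table at compile time. A toy version with four 8-byte-spaced bins (an editorial sketch, not part of the patch; `toy_size2bin` is a hypothetical name) shows the expansion trick:

```c
#include <assert.h>
#include <stdint.h>

/* Each S2B_n(i) emits the bin index i as n consecutive initializers, so
 * that toy_size2bin[size] maps a request size directly to its bin. */
#define S2B_1(i) i,
#define S2B_2(i) S2B_1(i) S2B_1(i)
#define S2B_4(i) S2B_2(i) S2B_2(i)
#define S2B_8(i) S2B_4(i) S2B_4(i)

/* Toy layout: bins 0..3 with region sizes 8, 16, 24, 32 bytes. */
static const uint8_t toy_size2bin[33] = {
	0xffU,       /* size 0: invalid */
	S2B_8(0)     /* 1..8   -> 8-byte bin  */
	S2B_8(1)     /* 9..16  -> 16-byte bin */
	S2B_8(2)     /* 17..24 -> 24-byte bin */
	S2B_8(3)     /* 25..32 -> 32-byte bin */
};
```

The real const_size2bin applies the same idea across the tiny, quantum, cacheline, and subpage bands, which is why a single array index replaces the branchy rounding logic on the fast path.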
*/ -static size_t chunksize; -static size_t chunksize_mask; /* (chunksize - 1). */ -static size_t chunk_npages; -static size_t arena_chunk_header_npages; -static size_t arena_maxclass; /* Max size class for arenas. */ - -/********/ -/* - * Chunks. - */ - -/* Protects chunk-related data structures. */ -static malloc_mutex_t huge_mtx; - -/* Tree of chunks that are stand-alone huge allocations. */ -static extent_tree_t huge; - -#ifdef MALLOC_DSS -/* - * Protects sbrk() calls. This avoids malloc races among threads, though it - * does not protect against races with threads that call sbrk() directly. - */ -static malloc_mutex_t dss_mtx; -/* Base address of the DSS. */ -static void *dss_base; -/* Current end of the DSS, or ((void *)-1) if the DSS is exhausted. */ -static void *dss_prev; -/* Current upper limit on DSS addresses. */ -static void *dss_max; - -/* - * Trees of chunks that were previously allocated (trees differ only in node - * ordering). These are used when allocating chunks, in an attempt to re-use - * address space. Depending on function, different tree orderings are needed, - * which is why there are two trees with the same contents. - */ -static extent_tree_t dss_chunks_szad; -static extent_tree_t dss_chunks_ad; -#endif - -#ifdef MALLOC_STATS -/* Huge allocation statistics. */ -static uint64_t huge_nmalloc; -static uint64_t huge_ndalloc; -static size_t huge_allocated; -#endif - -/****************************/ -/* - * base (internal allocation). - */ - -/* - * Current pages that are being used for internal memory allocations. These - * pages are carved up in cacheline-size quanta, so that there is no chance of - * false cache line sharing. - */ -static void *base_pages; -static void *base_next_addr; -static void *base_past_addr; /* Addr immediately past base_pages. */ -static extent_node_t *base_nodes; -static malloc_mutex_t base_mtx; -#ifdef MALLOC_STATS -static size_t base_mapped; -#endif - -/********/ -/* - * Arenas. 
- */ - -/* - * Arenas that are used to service external requests. Not all elements of the - * arenas array are necessarily used; arenas are created lazily as needed. - */ -static arena_t **arenas; -static unsigned narenas; -#ifndef NO_TLS -# ifdef MALLOC_BALANCE -static unsigned narenas_2pow; -# else -static unsigned next_arena; -# endif -#endif -static pthread_mutex_t arenas_lock; /* Protects arenas initialization. */ - -#ifndef NO_TLS -/* - * Map of pthread_self() --> arenas[???], used for selecting an arena to use - * for allocations. - */ -static __thread arena_t *arenas_map; -#endif - -#ifdef MALLOC_MAG -/* - * Map of thread-specific magazine racks, used for thread-specific object - * caching. - */ -static __thread mag_rack_t *mag_rack; -#endif - -#ifdef MALLOC_STATS -/* Chunk statistics. */ -static chunk_stats_t stats_chunks; -#endif - -/*******************************/ -/* - * Runtime configuration options. - */ -const char *_malloc_options; - -#ifndef MALLOC_PRODUCTION -static bool opt_abort = true; -static bool opt_junk = true; -#else -static bool opt_abort = false; -static bool opt_junk = false; -#endif -#ifdef MALLOC_DSS -static bool opt_dss = true; -static bool opt_mmap = true; -#endif -#ifdef MALLOC_MAG -static bool opt_mag = true; -static size_t opt_mag_size_2pow = MAG_SIZE_2POW_DEFAULT; -#endif -static size_t opt_dirty_max = DIRTY_MAX_DEFAULT; -#ifdef MALLOC_BALANCE -static uint64_t opt_balance_threshold = BALANCE_THRESHOLD_DEFAULT; -#endif -static bool opt_print_stats = false; -static size_t opt_qspace_max_2pow = QSPACE_MAX_2POW_DEFAULT; -static size_t opt_cspace_max_2pow = CSPACE_MAX_2POW_DEFAULT; -static size_t opt_chunk_2pow = CHUNK_2POW_DEFAULT; -static bool opt_utrace = false; -static bool opt_sysv = false; -static bool opt_xmalloc = false; -static bool opt_zero = false; -static int opt_narenas_lshift = 0; - -typedef struct { - void *p; - size_t s; - void *r; -} malloc_utrace_t; - -#define UTRACE(a, b, c) \ - if (opt_utrace) { \ - 
malloc_utrace_t ut; \ - ut.p = (a); \ - ut.s = (b); \ - ut.r = (c); \ - utrace(&ut, sizeof(ut)); \ - } - -/******************************************************************************/ -/* - * Begin function prototypes for non-inline static functions. - */ - -static void malloc_mutex_init(malloc_mutex_t *mutex); -static bool malloc_spin_init(pthread_mutex_t *lock); -static void wrtmessage(const char *p1, const char *p2, const char *p3, - const char *p4); -#ifdef MALLOC_STATS -static void malloc_printf(const char *format, ...); -#endif -static char *umax2s(uintmax_t x, char *s); -#ifdef MALLOC_DSS -static bool base_pages_alloc_dss(size_t minsize); -#endif -static bool base_pages_alloc_mmap(size_t minsize); -static bool base_pages_alloc(size_t minsize); -static void *base_alloc(size_t size); -static void *base_calloc(size_t number, size_t size); -static extent_node_t *base_node_alloc(void); -static void base_node_dealloc(extent_node_t *node); -#ifdef MALLOC_STATS -static void stats_print(arena_t *arena); -#endif -static void *pages_map(void *addr, size_t size); -static void pages_unmap(void *addr, size_t size); -#ifdef MALLOC_DSS -static void *chunk_alloc_dss(size_t size); -static void *chunk_recycle_dss(size_t size, bool zero); -#endif -static void *chunk_alloc_mmap(size_t size); -static void *chunk_alloc(size_t size, bool zero); -#ifdef MALLOC_DSS -static extent_node_t *chunk_dealloc_dss_record(void *chunk, size_t size); -static bool chunk_dealloc_dss(void *chunk, size_t size); -#endif -static void chunk_dealloc_mmap(void *chunk, size_t size); -static void chunk_dealloc(void *chunk, size_t size); -#ifndef NO_TLS -static arena_t *choose_arena_hard(void); -#endif -static void arena_run_split(arena_t *arena, arena_run_t *run, size_t size, - bool large, bool zero); -static arena_chunk_t *arena_chunk_alloc(arena_t *arena); -static void arena_chunk_dealloc(arena_t *arena, arena_chunk_t *chunk); -static arena_run_t *arena_run_alloc(arena_t *arena, size_t size, bool 
large, - bool zero); -static void arena_purge(arena_t *arena); -static void arena_run_dalloc(arena_t *arena, arena_run_t *run, bool dirty); -static void arena_run_trim_head(arena_t *arena, arena_chunk_t *chunk, - arena_run_t *run, size_t oldsize, size_t newsize); -static void arena_run_trim_tail(arena_t *arena, arena_chunk_t *chunk, - arena_run_t *run, size_t oldsize, size_t newsize, bool dirty); -static arena_run_t *arena_bin_nonfull_run_get(arena_t *arena, arena_bin_t *bin); -static void *arena_bin_malloc_hard(arena_t *arena, arena_bin_t *bin); -static size_t arena_bin_run_size_calc(arena_bin_t *bin, size_t min_run_size); -#ifdef MALLOC_BALANCE -static void arena_lock_balance_hard(arena_t *arena); -#endif -#ifdef MALLOC_MAG -static void mag_load(mag_t *mag); -#endif -static void *arena_malloc_large(arena_t *arena, size_t size, bool zero); -static void *arena_palloc(arena_t *arena, size_t alignment, size_t size, - size_t alloc_size); -static size_t arena_salloc(const void *ptr); -#ifdef MALLOC_MAG -static void mag_unload(mag_t *mag); -#endif -static void arena_dalloc_large(arena_t *arena, arena_chunk_t *chunk, - void *ptr); -static void arena_ralloc_large_shrink(arena_t *arena, arena_chunk_t *chunk, - void *ptr, size_t size, size_t oldsize); -static bool arena_ralloc_large_grow(arena_t *arena, arena_chunk_t *chunk, - void *ptr, size_t size, size_t oldsize); -static bool arena_ralloc_large(void *ptr, size_t size, size_t oldsize); -static void *arena_ralloc(void *ptr, size_t size, size_t oldsize); -static bool arena_new(arena_t *arena); -static arena_t *arenas_extend(unsigned ind); -#ifdef MALLOC_MAG -static mag_t *mag_create(arena_t *arena, size_t binind); -static void mag_destroy(mag_t *mag); -static mag_rack_t *mag_rack_create(arena_t *arena); -static void mag_rack_destroy(mag_rack_t *rack); -#endif -static void *huge_malloc(size_t size, bool zero); -static void *huge_palloc(size_t alignment, size_t size); -static void *huge_ralloc(void *ptr, size_t size, size_t 
oldsize); -static void huge_dalloc(void *ptr); -static void malloc_print_stats(void); -#ifdef MALLOC_DEBUG -static void size2bin_validate(void); -#endif -static bool size2bin_init(void); -static bool size2bin_init_hard(void); -static bool malloc_init_hard(void); - -/* - * End function prototypes. - */ -/******************************************************************************/ -/* - * Begin mutex. We can't use normal pthread mutexes in all places, because - * they require malloc()ed memory, which causes bootstrapping issues in some - * cases. - */ - -static void -malloc_mutex_init(malloc_mutex_t *mutex) -{ - static const spinlock_t lock = _SPINLOCK_INITIALIZER; - - mutex->lock = lock; -} - -static inline void -malloc_mutex_lock(malloc_mutex_t *mutex) -{ - - if (__isthreaded) - _SPINLOCK(&mutex->lock); -} - -static inline void -malloc_mutex_unlock(malloc_mutex_t *mutex) -{ - - if (__isthreaded) - _SPINUNLOCK(&mutex->lock); -} - -/* - * End mutex. - */ -/******************************************************************************/ -/* - * Begin spin lock. Spin locks here are actually adaptive mutexes that block - * after a period of spinning, because unbounded spinning would allow for - * priority inversion. - */ - -/* - * We use an unpublished interface to initialize pthread mutexes with an - * allocation callback, in order to avoid infinite recursion. 
- */ -int _pthread_mutex_init_calloc_cb(pthread_mutex_t *mutex, - void *(calloc_cb)(size_t, size_t)); - -__weak_reference(_pthread_mutex_init_calloc_cb_stub, - _pthread_mutex_init_calloc_cb); - -int -_pthread_mutex_init_calloc_cb_stub(pthread_mutex_t *mutex, - void *(calloc_cb)(size_t, size_t)) -{ - - return (0); -} - -static bool -malloc_spin_init(pthread_mutex_t *lock) -{ - - if (_pthread_mutex_init_calloc_cb(lock, base_calloc) != 0) - return (true); - - return (false); -} - -static inline unsigned -malloc_spin_lock(pthread_mutex_t *lock) -{ - unsigned ret = 0; - - if (__isthreaded) { - if (_pthread_mutex_trylock(lock) != 0) { - unsigned i; - volatile unsigned j; - - /* Exponentially back off. */ - for (i = 1; i <= SPIN_LIMIT_2POW; i++) { - for (j = 0; j < (1U << i); j++) { - ret++; - CPU_SPINWAIT; - } - - if (_pthread_mutex_trylock(lock) == 0) - return (ret); - } - - /* - * Spinning failed. Block until the lock becomes - * available, in order to avoid indefinite priority - * inversion. - */ - _pthread_mutex_lock(lock); - assert((ret << BLOCK_COST_2POW) != 0); - return (ret << BLOCK_COST_2POW); - } - } - - return (ret); -} - -static inline void -malloc_spin_unlock(pthread_mutex_t *lock) -{ - - if (__isthreaded) - _pthread_mutex_unlock(lock); -} - -/* - * End spin lock. - */ -/******************************************************************************/ -/* - * Begin Utility functions/macros. - */ - -/* Return the chunk address for allocation address a. */ -#define CHUNK_ADDR2BASE(a) \ - ((void *)((uintptr_t)(a) & ~chunksize_mask)) - -/* Return the chunk offset of address a. */ -#define CHUNK_ADDR2OFFSET(a) \ - ((size_t)((uintptr_t)(a) & chunksize_mask)) - -/* Return the smallest chunk multiple that is >= s. */ -#define CHUNK_CEILING(s) \ - (((s) + chunksize_mask) & ~chunksize_mask) - -/* Return the smallest quantum multiple that is >= a. 
 */
-#define QUANTUM_CEILING(a) \
-    (((a) + QUANTUM_MASK) & ~QUANTUM_MASK)
-
-/* Return the smallest cacheline multiple that is >= s. */
-#define CACHELINE_CEILING(s) \
-    (((s) + CACHELINE_MASK) & ~CACHELINE_MASK)
-
-/* Return the smallest subpage multiple that is >= s. */
-#define SUBPAGE_CEILING(s) \
-    (((s) + SUBPAGE_MASK) & ~SUBPAGE_MASK)
-
-/* Return the smallest pagesize multiple that is >= s. */
-#define PAGE_CEILING(s) \
-    (((s) + pagesize_mask) & ~pagesize_mask)
-
-#ifdef MALLOC_TINY
-/* Compute the smallest power of 2 that is >= x. */
-static inline size_t
-pow2_ceil(size_t x)
-{
-
-    x--;
-    x |= x >> 1;
-    x |= x >> 2;
-    x |= x >> 4;
-    x |= x >> 8;
-    x |= x >> 16;
-#if (SIZEOF_PTR == 8)
-    x |= x >> 32;
-#endif
-    x++;
-    return (x);
-}
-#endif
-
-#ifdef MALLOC_BALANCE
-/*
- * Use a simple linear congruential pseudo-random number generator:
- *
- *   prn(y) = (a*x + c) % m
- *
- * where the following constants ensure maximal period:
- *
- *   a == Odd number (relatively prime to 2^n), and (a-1) is a multiple of 4.
- *   c == Odd number (relatively prime to 2^n).
- *   m == 2^32
- *
- * See Knuth's TAOCP 3rd Ed., Vol. 2, pg. 17 for details on these constraints.
- *
- * This choice of m has the disadvantage that the quality of the bits is
- * proportional to bit position. For example, the lowest bit has a cycle of 2,
- * the next has a cycle of 4, etc. For this reason, we prefer to use the upper
- * bits.
- */
-# define PRN_DEFINE(suffix, var, a, c) \
-static inline void \
-sprn_##suffix(uint32_t seed) \
-{ \
-    var = seed; \
-} \
- \
-static inline uint32_t \
-prn_##suffix(uint32_t lg_range) \
-{ \
-    uint32_t ret, x; \
- \
-    assert(lg_range > 0); \
-    assert(lg_range <= 32); \
- \
-    x = (var * (a)) + (c); \
-    var = x; \
-    ret = x >> (32 - lg_range); \
- \
-    return (ret); \
-}
-# define SPRN(suffix, seed) sprn_##suffix(seed)
-# define PRN(suffix, lg_range) prn_##suffix(lg_range)
-#endif
-
-#ifdef MALLOC_BALANCE
-/* Define the PRNG used for arena assignment.
 */
-static __thread uint32_t balance_x;
-PRN_DEFINE(balance, balance_x, 1297, 1301)
-#endif
-
-static void
-wrtmessage(const char *p1, const char *p2, const char *p3, const char *p4)
-{
-
-    _write(STDERR_FILENO, p1, strlen(p1));
-    _write(STDERR_FILENO, p2, strlen(p2));
-    _write(STDERR_FILENO, p3, strlen(p3));
-    _write(STDERR_FILENO, p4, strlen(p4));
-}
-
-void (*_malloc_message)(const char *p1, const char *p2, const char *p3,
-    const char *p4) = wrtmessage;
-
-#ifdef MALLOC_STATS
-/*
- * Print to stderr in such a way as to (hopefully) avoid memory allocation.
- */
-static void
-malloc_printf(const char *format, ...)
-{
-    char buf[4096];
-    va_list ap;
-
-    va_start(ap, format);
-    vsnprintf(buf, sizeof(buf), format, ap);
-    va_end(ap);
-    _malloc_message(buf, "", "", "");
-}
-#endif
-
-/*
- * We don't want to depend on vsnprintf() for production builds, since that can
- * cause unnecessary bloat for static binaries. umax2s() provides minimal
- * integer printing functionality, so that malloc_printf() use can be limited to
- * MALLOC_STATS code.
- */
-#define UMAX2S_BUFSIZE 21
-static char *
-umax2s(uintmax_t x, char *s)
-{
-    unsigned i;
-
-    /* Make sure UMAX2S_BUFSIZE is large enough. */
-    assert(sizeof(uintmax_t) <= 8);
-
-    i = UMAX2S_BUFSIZE - 1;
-    s[i] = '\0';
-    do {
-        i--;
-        s[i] = "0123456789"[x % 10];
-        x /= 10;
-    } while (x > 0);
-
-    return (&s[i]);
-}
-
-/******************************************************************************/
-
-#ifdef MALLOC_DSS
-static bool
-base_pages_alloc_dss(size_t minsize)
-{
-
-    /*
-     * Do special DSS allocation here, since base allocations don't need to
-     * be chunk-aligned.
-     */
-    malloc_mutex_lock(&dss_mtx);
-    if (dss_prev != (void *)-1) {
-        intptr_t incr;
-        size_t csize = CHUNK_CEILING(minsize);
-
-        do {
-            /* Get the current end of the DSS. */
-            dss_max = sbrk(0);
-
-            /*
-             * Calculate how much padding is necessary to
-             * chunk-align the end of the DSS. Don't worry about
-             * dss_max not being chunk-aligned though.
-             */
-            incr = (intptr_t)chunksize
-                - (intptr_t)CHUNK_ADDR2OFFSET(dss_max);
-            assert(incr >= 0);
-            if ((size_t)incr < minsize)
-                incr += csize;
-
-            dss_prev = sbrk(incr);
-            if (dss_prev == dss_max) {
-                /* Success. */
-                dss_max = (void *)((intptr_t)dss_prev + incr);
-                base_pages = dss_prev;
-                base_next_addr = base_pages;
-                base_past_addr = dss_max;
-#ifdef MALLOC_STATS
-                base_mapped += incr;
-#endif
-                malloc_mutex_unlock(&dss_mtx);
-                return (false);
-            }
-        } while (dss_prev != (void *)-1);
-    }
-    malloc_mutex_unlock(&dss_mtx);
-
-    return (true);
-}
-#endif
-
-static bool
-base_pages_alloc_mmap(size_t minsize)
-{
-    size_t csize;
-
-    assert(minsize != 0);
-    csize = PAGE_CEILING(minsize);
-    base_pages = pages_map(NULL, csize);
-    if (base_pages == NULL)
-        return (true);
-    base_next_addr = base_pages;
-    base_past_addr = (void *)((uintptr_t)base_pages + csize);
-#ifdef MALLOC_STATS
-    base_mapped += csize;
-#endif
-
-    return (false);
-}
-
-static bool
-base_pages_alloc(size_t minsize)
-{
-
-#ifdef MALLOC_DSS
-    if (opt_dss) {
-        if (base_pages_alloc_dss(minsize) == false)
-            return (false);
-    }
-
-    if (opt_mmap && minsize != 0)
-#endif
-    {
-        if (base_pages_alloc_mmap(minsize) == false)
-            return (false);
-    }
-
-    return (true);
-}
-
-static void *
-base_alloc(size_t size)
-{
-    void *ret;
-    size_t csize;
-
-    /* Round size up to nearest multiple of the cacheline size. */
-    csize = CACHELINE_CEILING(size);
-
-    malloc_mutex_lock(&base_mtx);
-    /* Make sure there's enough space for the allocation. */
-    if ((uintptr_t)base_next_addr + csize > (uintptr_t)base_past_addr) {
-        if (base_pages_alloc(csize)) {
-            malloc_mutex_unlock(&base_mtx);
-            return (NULL);
-        }
-    }
-    /* Allocate.
 */
-    ret = base_next_addr;
-    base_next_addr = (void *)((uintptr_t)base_next_addr + csize);
-    malloc_mutex_unlock(&base_mtx);
-
-    return (ret);
-}
-
-static void *
-base_calloc(size_t number, size_t size)
-{
-    void *ret;
-
-    ret = base_alloc(number * size);
-    memset(ret, 0, number * size);
-
-    return (ret);
-}
-
-static extent_node_t *
-base_node_alloc(void)
-{
-    extent_node_t *ret;
-
-    malloc_mutex_lock(&base_mtx);
-    if (base_nodes != NULL) {
-        ret = base_nodes;
-        base_nodes = *(extent_node_t **)ret;
-        malloc_mutex_unlock(&base_mtx);
-    } else {
-        malloc_mutex_unlock(&base_mtx);
-        ret = (extent_node_t *)base_alloc(sizeof(extent_node_t));
-    }
-
-    return (ret);
-}
-
-static void
-base_node_dealloc(extent_node_t *node)
-{
-
-    malloc_mutex_lock(&base_mtx);
-    *(extent_node_t **)node = base_nodes;
-    base_nodes = node;
-    malloc_mutex_unlock(&base_mtx);
-}
-
-/******************************************************************************/
-
-#ifdef MALLOC_STATS
-static void
-stats_print(arena_t *arena)
-{
-    unsigned i, gap_start;
-
-    malloc_printf("dirty: %zu page%s dirty, %llu sweep%s,"
-        " %llu madvise%s, %llu page%s purged\n",
-        arena->ndirty, arena->ndirty == 1 ? "" : "s",
-        arena->stats.npurge, arena->stats.npurge == 1 ? "" : "s",
-        arena->stats.nmadvise, arena->stats.nmadvise == 1 ? "" : "s",
-        arena->stats.purged, arena->stats.purged == 1 ?
"" : "s");
-
-    malloc_printf(" allocated nmalloc ndalloc\n");
-    malloc_printf("small: %12zu %12llu %12llu\n",
-        arena->stats.allocated_small, arena->stats.nmalloc_small,
-        arena->stats.ndalloc_small);
-    malloc_printf("large: %12zu %12llu %12llu\n",
-        arena->stats.allocated_large, arena->stats.nmalloc_large,
-        arena->stats.ndalloc_large);
-    malloc_printf("total: %12zu %12llu %12llu\n",
-        arena->stats.allocated_small + arena->stats.allocated_large,
-        arena->stats.nmalloc_small + arena->stats.nmalloc_large,
-        arena->stats.ndalloc_small + arena->stats.ndalloc_large);
-    malloc_printf("mapped: %12zu\n", arena->stats.mapped);
-
-#ifdef MALLOC_MAG
-    if (__isthreaded && opt_mag) {
-        malloc_printf("bins: bin size regs pgs mags "
-            "newruns reruns maxruns curruns\n");
-    } else {
-#endif
-        malloc_printf("bins: bin size regs pgs requests "
-            "newruns reruns maxruns curruns\n");
-#ifdef MALLOC_MAG
-    }
-#endif
-    for (i = 0, gap_start = UINT_MAX; i < nbins; i++) {
-        if (arena->bins[i].stats.nruns == 0) {
-            if (gap_start == UINT_MAX)
-                gap_start = i;
-        } else {
-            if (gap_start != UINT_MAX) {
-                if (i > gap_start + 1) {
-                    /* Gap of more than one size class. */
-                    malloc_printf("[%u..%u]\n",
-                        gap_start, i - 1);
-                } else {
-                    /* Gap of one size class. */
-                    malloc_printf("[%u]\n", gap_start);
-                }
-                gap_start = UINT_MAX;
-            }
-            malloc_printf(
-                "%13u %1s %4u %4u %3u %9llu %9llu"
-                " %9llu %7lu %7lu\n",
-                i,
-                i < ntbins ? "T" : i < ntbins + nqbins ? "Q" :
-                i < ntbins + nqbins + ncbins ? "C" : "S",
-                arena->bins[i].reg_size,
-                arena->bins[i].nregs,
-                arena->bins[i].run_size >> pagesize_2pow,
-#ifdef MALLOC_MAG
-                (__isthreaded && opt_mag) ?
-                arena->bins[i].stats.nmags :
-#endif
-                arena->bins[i].stats.nrequests,
-                arena->bins[i].stats.nruns,
-                arena->bins[i].stats.reruns,
-                arena->bins[i].stats.highruns,
-                arena->bins[i].stats.curruns);
-        }
-    }
-    if (gap_start != UINT_MAX) {
-        if (i > gap_start + 1) {
-            /* Gap of more than one size class.
 */
-            malloc_printf("[%u..%u]\n", gap_start, i - 1);
-        } else {
-            /* Gap of one size class. */
-            malloc_printf("[%u]\n", gap_start);
-        }
-    }
-}
-#endif
-
-/*
- * End Utility functions/macros.
- */
-/******************************************************************************/
-/*
- * Begin extent tree code.
- */
-
-#ifdef MALLOC_DSS
-static inline int
-extent_szad_comp(extent_node_t *a, extent_node_t *b)
-{
-    int ret;
-    size_t a_size = a->size;
-    size_t b_size = b->size;
-
-    ret = (a_size > b_size) - (a_size < b_size);
-    if (ret == 0) {
-        uintptr_t a_addr = (uintptr_t)a->addr;
-        uintptr_t b_addr = (uintptr_t)b->addr;
-
-        ret = (a_addr > b_addr) - (a_addr < b_addr);
-    }
-
-    return (ret);
-}
-
-/* Wrap red-black tree macros in functions. */
-rb_wrap(__unused static, extent_tree_szad_, extent_tree_t, extent_node_t,
-    link_szad, extent_szad_comp)
-#endif
-
-static inline int
-extent_ad_comp(extent_node_t *a, extent_node_t *b)
-{
-    uintptr_t a_addr = (uintptr_t)a->addr;
-    uintptr_t b_addr = (uintptr_t)b->addr;
-
-    return ((a_addr > b_addr) - (a_addr < b_addr));
-}
-
-/* Wrap red-black tree macros in functions. */
-rb_wrap(__unused static, extent_tree_ad_, extent_tree_t, extent_node_t, link_ad,
-    extent_ad_comp)
-
-/*
- * End extent tree code.
- */
-/******************************************************************************/
-/*
- * Begin chunk management functions.
- */
-
-static void *
-pages_map(void *addr, size_t size)
-{
-    void *ret;
-
-    /*
-     * We don't use MAP_FIXED here, because it can cause the *replacement*
-     * of existing mappings, and we only want to create new mappings.
-     */
-    ret = mmap(addr, size, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANON,
-        -1, 0);
-    assert(ret != NULL);
-
-    if (ret == MAP_FAILED)
-        ret = NULL;
-    else if (addr != NULL && ret != addr) {
-        /*
-         * We succeeded in mapping memory, but not in the right place.
-         */
-        if (munmap(ret, size) == -1) {
-            char buf[STRERROR_BUF];
-
-            strerror_r(errno, buf, sizeof(buf));
-            _malloc_message(_getprogname(),
-                ": (malloc) Error in munmap(): ", buf, "\n");
-            if (opt_abort)
-                abort();
-        }
-        ret = NULL;
-    }
-
-    assert(ret == NULL || (addr == NULL && ret != addr)
-        || (addr != NULL && ret == addr));
-    return (ret);
-}
-
-static void
-pages_unmap(void *addr, size_t size)
-{
-
-    if (munmap(addr, size) == -1) {
-        char buf[STRERROR_BUF];
-
-        strerror_r(errno, buf, sizeof(buf));
-        _malloc_message(_getprogname(),
-            ": (malloc) Error in munmap(): ", buf, "\n");
-        if (opt_abort)
-            abort();
-    }
-}
-
-#ifdef MALLOC_DSS
-static void *
-chunk_alloc_dss(size_t size)
-{
-
-    /*
-     * sbrk() uses a signed increment argument, so take care not to
-     * interpret a huge allocation request as a negative increment.
-     */
-    if ((intptr_t)size < 0)
-        return (NULL);
-
-    malloc_mutex_lock(&dss_mtx);
-    if (dss_prev != (void *)-1) {
-        intptr_t incr;
-
-        /*
-         * The loop is necessary to recover from races with other
-         * threads that are using the DSS for something other than
-         * malloc.
-         */
-        do {
-            void *ret;
-
-            /* Get the current end of the DSS. */
-            dss_max = sbrk(0);
-
-            /*
-             * Calculate how much padding is necessary to
-             * chunk-align the end of the DSS.
-             */
-            incr = (intptr_t)size
-                - (intptr_t)CHUNK_ADDR2OFFSET(dss_max);
-            if (incr == (intptr_t)size)
-                ret = dss_max;
-            else {
-                ret = (void *)((intptr_t)dss_max + incr);
-                incr += size;
-            }
-
-            dss_prev = sbrk(incr);
-            if (dss_prev == dss_max) {
-                /* Success.
 */
-                dss_max = (void *)((intptr_t)dss_prev + incr);
-                malloc_mutex_unlock(&dss_mtx);
-                return (ret);
-            }
-        } while (dss_prev != (void *)-1);
-    }
-    malloc_mutex_unlock(&dss_mtx);
-
-    return (NULL);
-}
-
-static void *
-chunk_recycle_dss(size_t size, bool zero)
-{
-    extent_node_t *node, key;
-
-    key.addr = NULL;
-    key.size = size;
-    malloc_mutex_lock(&dss_mtx);
-    node = extent_tree_szad_nsearch(&dss_chunks_szad, &key);
-    if (node != NULL) {
-        void *ret = node->addr;
-
-        /* Remove node from the tree. */
-        extent_tree_szad_remove(&dss_chunks_szad, node);
-        if (node->size == size) {
-            extent_tree_ad_remove(&dss_chunks_ad, node);
-            base_node_dealloc(node);
-        } else {
-            /*
-             * Insert the remainder of node's address range as a
-             * smaller chunk. Its position within dss_chunks_ad
-             * does not change.
-             */
-            assert(node->size > size);
-            node->addr = (void *)((uintptr_t)node->addr + size);
-            node->size -= size;
-            extent_tree_szad_insert(&dss_chunks_szad, node);
-        }
-        malloc_mutex_unlock(&dss_mtx);
-
-        if (zero)
-            memset(ret, 0, size);
-        return (ret);
-    }
-    malloc_mutex_unlock(&dss_mtx);
-
-    return (NULL);
-}
-#endif
-
-static void *
-chunk_alloc_mmap(size_t size)
-{
-    void *ret;
-    size_t offset;
-
-    /*
-     * Ideally, there would be a way to specify alignment to mmap() (like
-     * NetBSD has), but in the absence of such a feature, we have to work
-     * hard to efficiently create aligned mappings. The reliable, but
-     * expensive method is to create a mapping that is over-sized, then
-     * trim the excess. However, that always results in at least one call
-     * to pages_unmap().
-     *
-     * A more optimistic approach is to try mapping precisely the right
-     * amount, then try to append another mapping if alignment is off. In
-     * practice, this works out well as long as the application is not
-     * interleaving mappings via direct mmap() calls.
If we do run into a
-     * situation where there is an interleaved mapping and we are unable to
-     * extend an unaligned mapping, our best option is to momentarily
-     * revert to the reliable-but-expensive method. This will tend to
-     * leave a gap in the memory map that is too small to cause later
-     * problems for the optimistic method.
-     */
-
-    ret = pages_map(NULL, size);
-    if (ret == NULL)
-        return (NULL);
-
-    offset = CHUNK_ADDR2OFFSET(ret);
-    if (offset != 0) {
-        /* Try to extend chunk boundary. */
-        if (pages_map((void *)((uintptr_t)ret + size),
-            chunksize - offset) == NULL) {
-            /*
-             * Extension failed. Clean up, then revert to the
-             * reliable-but-expensive method.
-             */
-            pages_unmap(ret, size);
-
-            /* Beware size_t wrap-around. */
-            if (size + chunksize <= size)
-                return NULL;
-
-            ret = pages_map(NULL, size + chunksize);
-            if (ret == NULL)
-                return (NULL);
-
-            /* Clean up unneeded leading/trailing space. */
-            offset = CHUNK_ADDR2OFFSET(ret);
-            if (offset != 0) {
-                /* Leading space. */
-                pages_unmap(ret, chunksize - offset);
-
-                ret = (void *)((uintptr_t)ret +
-                    (chunksize - offset));
-
-                /* Trailing space. */
-                pages_unmap((void *)((uintptr_t)ret + size),
-                    offset);
-            } else {
-                /* Trailing space only. */
-                pages_unmap((void *)((uintptr_t)ret + size),
-                    chunksize);
-            }
-        } else {
-            /* Clean up unneeded leading space. */
-            pages_unmap(ret, chunksize - offset);
-            ret = (void *)((uintptr_t)ret + (chunksize - offset));
-        }
-    }
-
-    return (ret);
-}
-
-static void *
-chunk_alloc(size_t size, bool zero)
-{
-    void *ret;
-
-    assert(size != 0);
-    assert((size & chunksize_mask) == 0);
-
-#ifdef MALLOC_DSS
-    if (opt_dss) {
-        ret = chunk_recycle_dss(size, zero);
-        if (ret != NULL) {
-            goto RETURN;
-        }
-
-        ret = chunk_alloc_dss(size);
-        if (ret != NULL)
-            goto RETURN;
-    }
-
-    if (opt_mmap)
-#endif
-    {
-        ret = chunk_alloc_mmap(size);
-        if (ret != NULL)
-            goto RETURN;
-    }
-
-    /* All strategies for allocation failed.
 */
-    ret = NULL;
-RETURN:
-#ifdef MALLOC_STATS
-    if (ret != NULL) {
-        stats_chunks.nchunks += (size / chunksize);
-        stats_chunks.curchunks += (size / chunksize);
-    }
-    if (stats_chunks.curchunks > stats_chunks.highchunks)
-        stats_chunks.highchunks = stats_chunks.curchunks;
-#endif
-
-    assert(CHUNK_ADDR2BASE(ret) == ret);
-    return (ret);
-}
-
-#ifdef MALLOC_DSS
-static extent_node_t *
-chunk_dealloc_dss_record(void *chunk, size_t size)
-{
-    extent_node_t *node, *prev, key;
-
-    key.addr = (void *)((uintptr_t)chunk + size);
-    node = extent_tree_ad_nsearch(&dss_chunks_ad, &key);
-    /* Try to coalesce forward. */
-    if (node != NULL && node->addr == key.addr) {
-        /*
-         * Coalesce chunk with the following address range. This does
-         * not change the position within dss_chunks_ad, so only
-         * remove/insert from/into dss_chunks_szad.
-         */
-        extent_tree_szad_remove(&dss_chunks_szad, node);
-        node->addr = chunk;
-        node->size += size;
-        extent_tree_szad_insert(&dss_chunks_szad, node);
-    } else {
-        /*
-         * Coalescing forward failed, so insert a new node. Drop
-         * dss_mtx during node allocation, since it is possible that a
-         * new base chunk will be allocated.
-         */
-        malloc_mutex_unlock(&dss_mtx);
-        node = base_node_alloc();
-        malloc_mutex_lock(&dss_mtx);
-        if (node == NULL)
-            return (NULL);
-        node->addr = chunk;
-        node->size = size;
-        extent_tree_ad_insert(&dss_chunks_ad, node);
-        extent_tree_szad_insert(&dss_chunks_szad, node);
-    }
-
-    /* Try to coalesce backward. */
-    prev = extent_tree_ad_prev(&dss_chunks_ad, node);
-    if (prev != NULL && (void *)((uintptr_t)prev->addr + prev->size) ==
-        chunk) {
-        /*
-         * Coalesce chunk with the previous address range. This does
-         * not change the position within dss_chunks_ad, so only
-         * remove/insert node from/into dss_chunks_szad.
-         */
-        extent_tree_szad_remove(&dss_chunks_szad, prev);
-        extent_tree_ad_remove(&dss_chunks_ad, prev);
-
-        extent_tree_szad_remove(&dss_chunks_szad, node);
-        node->addr = prev->addr;
-        node->size += prev->size;
-        extent_tree_szad_insert(&dss_chunks_szad, node);
-
-        base_node_dealloc(prev);
-    }
-
-    return (node);
-}
-
-static bool
-chunk_dealloc_dss(void *chunk, size_t size)
-{
-
-    malloc_mutex_lock(&dss_mtx);
-    if ((uintptr_t)chunk >= (uintptr_t)dss_base
-        && (uintptr_t)chunk < (uintptr_t)dss_max) {
-        extent_node_t *node;
-
-        /* Try to coalesce with other unused chunks. */
-        node = chunk_dealloc_dss_record(chunk, size);
-        if (node != NULL) {
-            chunk = node->addr;
-            size = node->size;
-        }
-
-        /* Get the current end of the DSS. */
-        dss_max = sbrk(0);
-
-        /*
-         * Try to shrink the DSS if this chunk is at the end of the
-         * DSS. The sbrk() call here is subject to a race condition
-         * with threads that use brk(2) or sbrk(2) directly, but the
-         * alternative would be to leak memory for the sake of poorly
-         * designed multi-threaded programs.
-         */
-        if ((void *)((uintptr_t)chunk + size) == dss_max
-            && (dss_prev = sbrk(-(intptr_t)size)) == dss_max) {
-            /* Success.
 */
-            dss_max = (void *)((intptr_t)dss_prev - (intptr_t)size);
-
-            if (node != NULL) {
-                extent_tree_szad_remove(&dss_chunks_szad, node);
-                extent_tree_ad_remove(&dss_chunks_ad, node);
-                base_node_dealloc(node);
-            }
-            malloc_mutex_unlock(&dss_mtx);
-        } else {
-            malloc_mutex_unlock(&dss_mtx);
-            madvise(chunk, size, MADV_FREE);
-        }
-
-        return (false);
-    }
-    malloc_mutex_unlock(&dss_mtx);
-
-    return (true);
-}
-#endif
-
-static void
-chunk_dealloc_mmap(void *chunk, size_t size)
-{
-
-    pages_unmap(chunk, size);
-}
-
-static void
-chunk_dealloc(void *chunk, size_t size)
-{
-
-    assert(chunk != NULL);
-    assert(CHUNK_ADDR2BASE(chunk) == chunk);
-    assert(size != 0);
-    assert((size & chunksize_mask) == 0);
-
-#ifdef MALLOC_STATS
-    stats_chunks.curchunks -= (size / chunksize);
-#endif
-
-#ifdef MALLOC_DSS
-    if (opt_dss) {
-        if (chunk_dealloc_dss(chunk, size) == false)
-            return;
-    }
-
-    if (opt_mmap)
-#endif
-        chunk_dealloc_mmap(chunk, size);
-}
-
-/*
- * End chunk management functions.
- */
-/******************************************************************************/
-/*
- * Begin arena.
- */
-
-/*
- * Choose an arena based on a per-thread value (fast-path code, calls slow-path
- * code if necessary).
- */
-static inline arena_t *
-choose_arena(void)
-{
-    arena_t *ret;
-
-    /*
-     * We can only use TLS if this is a PIC library, since for the static
-     * library version, libc's malloc is used by TLS allocation, which
-     * introduces a bootstrapping issue.
-     */
-#ifndef NO_TLS
-    if (__isthreaded == false) {
-        /* Avoid the overhead of TLS for single-threaded operation. */
-        return (arenas[0]);
-    }
-
-    ret = arenas_map;
-    if (ret == NULL) {
-        ret = choose_arena_hard();
-        assert(ret != NULL);
-    }
-#else
-    if (__isthreaded && narenas > 1) {
-        unsigned long ind;
-
-        /*
-         * Hash _pthread_self() to one of the arenas. There is a prime
-         * number of arenas, so this has a reasonable chance of
-         * working.
Even so, the hashing can be easily thwarted by
-         * inconvenient _pthread_self() values. Without specific
-         * knowledge of how _pthread_self() calculates values, we can't
-         * easily do much better than this.
-         */
-        ind = (unsigned long) _pthread_self() % narenas;
-
-        /*
-         * Optimistically assume that arenas[ind] has been initialized.
-         * At worst, we find out that some other thread has already
-         * done so, after acquiring the lock in preparation. Note that
-         * this lazy locking also has the effect of lazily forcing
-         * cache coherency; without the lock acquisition, there's no
-         * guarantee that modification of arenas[ind] by another thread
-         * would be seen on this CPU for an arbitrary amount of time.
-         *
-         * In general, this approach to modifying a synchronized value
-         * isn't a good idea, but in this case we only ever modify the
-         * value once, so things work out well.
-         */
-        ret = arenas[ind];
-        if (ret == NULL) {
-            /*
-             * Avoid races with another thread that may have already
-             * initialized arenas[ind].
-             */
-            malloc_spin_lock(&arenas_lock);
-            if (arenas[ind] == NULL)
-                ret = arenas_extend((unsigned)ind);
-            else
-                ret = arenas[ind];
-            malloc_spin_unlock(&arenas_lock);
-        }
-    } else
-        ret = arenas[0];
-#endif
-
-    assert(ret != NULL);
-    return (ret);
-}
-
-#ifndef NO_TLS
-/*
- * Choose an arena based on a per-thread value (slow-path code only, called
- * only by choose_arena()).
- */
-static arena_t *
-choose_arena_hard(void)
-{
-    arena_t *ret;
-
-    assert(__isthreaded);
-
-#ifdef MALLOC_BALANCE
-    /* Seed the PRNG used for arena load balancing.
 */
-    SPRN(balance, (uint32_t)(uintptr_t)(_pthread_self()));
-#endif
-
-    if (narenas > 1) {
-#ifdef MALLOC_BALANCE
-        unsigned ind;
-
-        ind = PRN(balance, narenas_2pow);
-        if ((ret = arenas[ind]) == NULL) {
-            malloc_spin_lock(&arenas_lock);
-            if ((ret = arenas[ind]) == NULL)
-                ret = arenas_extend(ind);
-            malloc_spin_unlock(&arenas_lock);
-        }
-#else
-        malloc_spin_lock(&arenas_lock);
-        if ((ret = arenas[next_arena]) == NULL)
-            ret = arenas_extend(next_arena);
-        next_arena = (next_arena + 1) % narenas;
-        malloc_spin_unlock(&arenas_lock);
-#endif
-    } else
-        ret = arenas[0];
-
-    arenas_map = ret;
-
-    return (ret);
-}
-#endif
-
-static inline int
-arena_chunk_comp(arena_chunk_t *a, arena_chunk_t *b)
-{
-    uintptr_t a_chunk = (uintptr_t)a;
-    uintptr_t b_chunk = (uintptr_t)b;
-
-    assert(a != NULL);
-    assert(b != NULL);
-
-    return ((a_chunk > b_chunk) - (a_chunk < b_chunk));
-}
-
-/* Wrap red-black tree macros in functions. */
-rb_wrap(__unused static, arena_chunk_tree_dirty_, arena_chunk_tree_t,
-    arena_chunk_t, link_dirty, arena_chunk_comp)
-
-static inline int
-arena_run_comp(arena_chunk_map_t *a, arena_chunk_map_t *b)
-{
-    uintptr_t a_mapelm = (uintptr_t)a;
-    uintptr_t b_mapelm = (uintptr_t)b;
-
-    assert(a != NULL);
-    assert(b != NULL);
-
-    return ((a_mapelm > b_mapelm) - (a_mapelm < b_mapelm));
-}
-
-/* Wrap red-black tree macros in functions. */
-rb_wrap(__unused static, arena_run_tree_, arena_run_tree_t, arena_chunk_map_t,
-    link, arena_run_comp)
-
-static inline int
-arena_avail_comp(arena_chunk_map_t *a, arena_chunk_map_t *b)
-{
-    int ret;
-    size_t a_size = a->bits & ~pagesize_mask;
-    size_t b_size = b->bits & ~pagesize_mask;
-
-    ret = (a_size > b_size) - (a_size < b_size);
-    if (ret == 0) {
-        uintptr_t a_mapelm, b_mapelm;
-
-        if ((a->bits & CHUNK_MAP_KEY) == 0)
-            a_mapelm = (uintptr_t)a;
-        else {
-            /*
-             * Treat keys as though they are lower than anything
-             * else.
-             */
-            a_mapelm = 0;
-        }
-        b_mapelm = (uintptr_t)b;
-
-        ret = (a_mapelm > b_mapelm) - (a_mapelm < b_mapelm);
-    }
-
-    return (ret);
-}
-
-/* Wrap red-black tree macros in functions. */
-rb_wrap(__unused static, arena_avail_tree_, arena_avail_tree_t,
-    arena_chunk_map_t, link, arena_avail_comp)
-
-static inline void *
-arena_run_reg_alloc(arena_run_t *run, arena_bin_t *bin)
-{
-    void *ret;
-    unsigned i, mask, bit, regind;
-
-    assert(run->magic == ARENA_RUN_MAGIC);
-    assert(run->regs_minelm < bin->regs_mask_nelms);
-
-    /*
-     * Move the first check outside the loop, so that run->regs_minelm can
-     * be updated unconditionally, without the possibility of updating it
-     * multiple times.
-     */
-    i = run->regs_minelm;
-    mask = run->regs_mask[i];
-    if (mask != 0) {
-        /* Usable allocation found. */
-        bit = ffs((int)mask) - 1;
-
-        regind = ((i << (SIZEOF_INT_2POW + 3)) + bit);
-        assert(regind < bin->nregs);
-        ret = (void *)(((uintptr_t)run) + bin->reg0_offset
-            + (bin->reg_size * regind));
-
-        /* Clear bit. */
-        mask ^= (1U << bit);
-        run->regs_mask[i] = mask;
-
-        return (ret);
-    }
-
-    for (i++; i < bin->regs_mask_nelms; i++) {
-        mask = run->regs_mask[i];
-        if (mask != 0) {
-            /* Usable allocation found. */
-            bit = ffs((int)mask) - 1;
-
-            regind = ((i << (SIZEOF_INT_2POW + 3)) + bit);
-            assert(regind < bin->nregs);
-            ret = (void *)(((uintptr_t)run) + bin->reg0_offset
-                + (bin->reg_size * regind));
-
-            /* Clear bit. */
-            mask ^= (1U << bit);
-            run->regs_mask[i] = mask;
-
-            /*
-             * Make a note that nothing before this element
-             * contains a free region.
-             */
-            run->regs_minelm = i; /* Low payoff: + (mask == 0); */
-
-            return (ret);
-        }
-    }
-    /* Not reached. */
-    assert(0);
-    return (NULL);
-}
-
-static inline void
-arena_run_reg_dalloc(arena_run_t *run, arena_bin_t *bin, void *ptr, size_t size)
-{
-    unsigned diff, regind, elm, bit;
-
-    assert(run->magic == ARENA_RUN_MAGIC);
-
-    /*
-     * Avoid doing division with a variable divisor if possible.
Using
-     * actual division here can reduce allocator throughput by over 20%!
-     */
-    diff = (unsigned)((uintptr_t)ptr - (uintptr_t)run - bin->reg0_offset);
-    if ((size & (size - 1)) == 0) {
-        /*
-         * log2_table allows fast division of a power of two in the
-         * [1..128] range.
-         *
-         * (x / divisor) becomes (x >> log2_table[divisor - 1]).
-         */
-        static const unsigned char log2_table[] = {
-            0, 1, 0, 2, 0, 0, 0, 3, 0, 0, 0, 0, 0, 0, 0, 4,
-            0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 5,
-            0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
-            0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 6,
-            0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
-            0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
-            0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
-            0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 7
-        };
-
-        if (size <= 128)
-            regind = (diff >> log2_table[size - 1]);
-        else if (size <= 32768)
-            regind = diff >> (8 + log2_table[(size >> 8) - 1]);
-        else
-            regind = diff / size;
-    } else if (size < qspace_max) {
-        /*
-         * To divide by a number D that is not a power of two we
-         * multiply by (2^21 / D) and then right shift by 21 positions.
-         *
-         *   X / D
-         *
-         * becomes
-         *
-         *   (X * qsize_invs[(D >> QUANTUM_2POW) - 3])
-         *       >> SIZE_INV_SHIFT
-         *
-         * We can omit the first three elements, because we never
-         * divide by 0, and QUANTUM and 2*QUANTUM are both powers of
-         * two, which are handled above.
-         */
-#define SIZE_INV_SHIFT 21
-#define QSIZE_INV(s) (((1U << SIZE_INV_SHIFT) / (s << QUANTUM_2POW)) + 1)
-        static const unsigned qsize_invs[] = {
-            QSIZE_INV(3),
-            QSIZE_INV(4), QSIZE_INV(5), QSIZE_INV(6), QSIZE_INV(7)
-#if (QUANTUM_2POW < 4)
-            ,
-            QSIZE_INV(8), QSIZE_INV(9), QSIZE_INV(10), QSIZE_INV(11),
-            QSIZE_INV(12), QSIZE_INV(13), QSIZE_INV(14), QSIZE_INV(15)
-#endif
-        };
-        assert(QUANTUM * (((sizeof(qsize_invs)) / sizeof(unsigned)) + 3)
-            >= (1U << QSPACE_MAX_2POW_DEFAULT));
-
-        if (size <= (((sizeof(qsize_invs) / sizeof(unsigned)) + 2) <<
-            QUANTUM_2POW)) {
-            regind = qsize_invs[(size >> QUANTUM_2POW) - 3] * diff;
-            regind >>= SIZE_INV_SHIFT;
-        } else
-            regind = diff / size;
-#undef QSIZE_INV
-    } else if (size < cspace_max) {
-#define CSIZE_INV(s) (((1U << SIZE_INV_SHIFT) / (s << CACHELINE_2POW)) + 1)
-        static const unsigned csize_invs[] = {
-            CSIZE_INV(3),
-            CSIZE_INV(4), CSIZE_INV(5), CSIZE_INV(6), CSIZE_INV(7)
-        };
-        assert(CACHELINE * (((sizeof(csize_invs)) / sizeof(unsigned)) +
-            3) >= (1U << CSPACE_MAX_2POW_DEFAULT));
-
-        if (size <= (((sizeof(csize_invs) / sizeof(unsigned)) + 2) <<
-            CACHELINE_2POW)) {
-            regind = csize_invs[(size >> CACHELINE_2POW) - 3] *
-                diff;
-            regind >>= SIZE_INV_SHIFT;
-        } else
-            regind = diff / size;
-#undef CSIZE_INV
-    } else {
-#define SSIZE_INV(s) (((1U << SIZE_INV_SHIFT) / (s << SUBPAGE_2POW)) + 1)
-        static const unsigned ssize_invs[] = {
-            SSIZE_INV(3),
-            SSIZE_INV(4), SSIZE_INV(5), SSIZE_INV(6), SSIZE_INV(7),
-            SSIZE_INV(8), SSIZE_INV(9), SSIZE_INV(10), SSIZE_INV(11),
-            SSIZE_INV(12), SSIZE_INV(13), SSIZE_INV(14), SSIZE_INV(15)
-#if (PAGESIZE_2POW == 13)
-            ,
-            SSIZE_INV(16), SSIZE_INV(17), SSIZE_INV(18), SSIZE_INV(19),
-            SSIZE_INV(20), SSIZE_INV(21), SSIZE_INV(22), SSIZE_INV(23),
-            SSIZE_INV(24), SSIZE_INV(25), SSIZE_INV(26), SSIZE_INV(27),
-            SSIZE_INV(28), SSIZE_INV(29), SSIZE_INV(29), SSIZE_INV(30)
-#endif
-        };
-        assert(SUBPAGE * (((sizeof(ssize_invs)) / sizeof(unsigned)) + 3)
-            >= (1U << PAGESIZE_2POW));
-
-        if (size
< (((sizeof(ssize_invs) / sizeof(unsigned)) + 2) <<
-            SUBPAGE_2POW)) {
-            regind = ssize_invs[(size >> SUBPAGE_2POW) - 3] * diff;
-            regind >>= SIZE_INV_SHIFT;
-        } else
-            regind = diff / size;
-#undef SSIZE_INV
-    }
-#undef SIZE_INV_SHIFT
-    assert(diff == regind * size);
-    assert(regind < bin->nregs);
-
-    elm = regind >> (SIZEOF_INT_2POW + 3);
-    if (elm < run->regs_minelm)
-        run->regs_minelm = elm;
-    bit = regind - (elm << (SIZEOF_INT_2POW + 3));
-    assert((run->regs_mask[elm] & (1U << bit)) == 0);
-    run->regs_mask[elm] |= (1U << bit);
-}
-
-static void
-arena_run_split(arena_t *arena, arena_run_t *run, size_t size, bool large,
-    bool zero)
-{
-    arena_chunk_t *chunk;
-    size_t old_ndirty, run_ind, total_pages, need_pages, rem_pages, i;
-
-    chunk = (arena_chunk_t *)CHUNK_ADDR2BASE(run);
-    old_ndirty = chunk->ndirty;
-    run_ind = (unsigned)(((uintptr_t)run - (uintptr_t)chunk)
-        >> pagesize_2pow);
-    total_pages = (chunk->map[run_ind].bits & ~pagesize_mask) >>
-        pagesize_2pow;
-    need_pages = (size >> pagesize_2pow);
-    assert(need_pages > 0);
-    assert(need_pages <= total_pages);
-    rem_pages = total_pages - need_pages;
-
-    arena_avail_tree_remove(&arena->runs_avail, &chunk->map[run_ind]);
-
-    /* Keep track of trailing unused pages for later use. */
-    if (rem_pages > 0) {
-        chunk->map[run_ind+need_pages].bits = (rem_pages <<
-            pagesize_2pow) | (chunk->map[run_ind+need_pages].bits &
-            pagesize_mask);
-        chunk->map[run_ind+total_pages-1].bits = (rem_pages <<
-            pagesize_2pow) | (chunk->map[run_ind+total_pages-1].bits &
-            pagesize_mask);
-        arena_avail_tree_insert(&arena->runs_avail,
-            &chunk->map[run_ind+need_pages]);
-    }
-
-    for (i = 0; i < need_pages; i++) {
-        /* Zero if necessary. */
-        if (zero) {
-            if ((chunk->map[run_ind + i].bits & CHUNK_MAP_ZEROED)
-                == 0) {
-                memset((void *)((uintptr_t)chunk + ((run_ind
-                    + i) << pagesize_2pow)), 0, pagesize);
-                /* CHUNK_MAP_ZEROED is cleared below. */
-            }
-        }
-
-        /* Update dirty page accounting.
 */
-        if (chunk->map[run_ind + i].bits & CHUNK_MAP_DIRTY) {
-            chunk->ndirty--;
-            arena->ndirty--;
-            /* CHUNK_MAP_DIRTY is cleared below. */
-        }
-
-        /* Initialize the chunk map. */
-        if (large) {
-            chunk->map[run_ind + i].bits = CHUNK_MAP_LARGE
-                | CHUNK_MAP_ALLOCATED;
-        } else {
-            chunk->map[run_ind + i].bits = (size_t)run
-                | CHUNK_MAP_ALLOCATED;
-        }
-    }
-
-    /*
-     * Set the run size only in the first element for large runs. This is
-     * primarily a debugging aid, since the lack of size info for trailing
-     * pages only matters if the application tries to operate on an
-     * interior pointer.
-     */
-    if (large)
-        chunk->map[run_ind].bits |= size;
-
-    if (chunk->ndirty == 0 && old_ndirty > 0)
-        arena_chunk_tree_dirty_remove(&arena->chunks_dirty, chunk);
-}
-
-static arena_chunk_t *
-arena_chunk_alloc(arena_t *arena)
-{
-    arena_chunk_t *chunk;
-    size_t i;
-
-    if (arena->spare != NULL) {
-        chunk = arena->spare;
-        arena->spare = NULL;
-    } else {
-        chunk = (arena_chunk_t *)chunk_alloc(chunksize, true);
-        if (chunk == NULL)
-            return (NULL);
-#ifdef MALLOC_STATS
-        arena->stats.mapped += chunksize;
-#endif
-
-        chunk->arena = arena;
-
-        /*
-         * Claim that no pages are in use, since the header is merely
-         * overhead.
-         */
-        chunk->ndirty = 0;
-
-        /*
-         * Initialize the map to contain one maximal free untouched run.
-         */
-        for (i = 0; i < arena_chunk_header_npages; i++)
-            chunk->map[i].bits = 0;
-        chunk->map[i].bits = arena_maxclass | CHUNK_MAP_ZEROED;
-        for (i++; i < chunk_npages-1; i++) {
-            chunk->map[i].bits = CHUNK_MAP_ZEROED;
-        }
-        chunk->map[chunk_npages-1].bits = arena_maxclass |
-            CHUNK_MAP_ZEROED;
-    }
-
-    /* Insert the run into the runs_avail tree.
*/ - arena_avail_tree_insert(&arena->runs_avail, - &chunk->map[arena_chunk_header_npages]); - - return (chunk); -} - -static void -arena_chunk_dealloc(arena_t *arena, arena_chunk_t *chunk) -{ - - if (arena->spare != NULL) { - if (arena->spare->ndirty > 0) { - arena_chunk_tree_dirty_remove( - &chunk->arena->chunks_dirty, arena->spare); - arena->ndirty -= arena->spare->ndirty; - } - chunk_dealloc((void *)arena->spare, chunksize); -#ifdef MALLOC_STATS - arena->stats.mapped -= chunksize; -#endif - } - - /* - * Remove run from runs_avail, regardless of whether this chunk - * will be cached, so that the arena does not use it. Dirty page - * flushing only uses the chunks_dirty tree, so leaving this chunk in - * the chunks_* trees is sufficient for that purpose. - */ - arena_avail_tree_remove(&arena->runs_avail, - &chunk->map[arena_chunk_header_npages]); - - arena->spare = chunk; -} - -static arena_run_t * -arena_run_alloc(arena_t *arena, size_t size, bool large, bool zero) -{ - arena_chunk_t *chunk; - arena_run_t *run; - arena_chunk_map_t *mapelm, key; - - assert(size <= arena_maxclass); - assert((size & pagesize_mask) == 0); - - /* Search the arena's chunks for the lowest best fit. */ - key.bits = size | CHUNK_MAP_KEY; - mapelm = arena_avail_tree_nsearch(&arena->runs_avail, &key); - if (mapelm != NULL) { - arena_chunk_t *run_chunk = CHUNK_ADDR2BASE(mapelm); - size_t pageind = ((uintptr_t)mapelm - (uintptr_t)run_chunk->map) - / sizeof(arena_chunk_map_t); - - run = (arena_run_t *)((uintptr_t)run_chunk + (pageind - << pagesize_2pow)); - arena_run_split(arena, run, size, large, zero); - return (run); - } - - /* - * No usable runs. Create a new chunk from which to allocate the run. - */ - chunk = arena_chunk_alloc(arena); - if (chunk == NULL) - return (NULL); - run = (arena_run_t *)((uintptr_t)chunk + (arena_chunk_header_npages << - pagesize_2pow)); - /* Update page map. 
*/ - arena_run_split(arena, run, size, large, zero); - return (run); -} - -static void -arena_purge(arena_t *arena) -{ - arena_chunk_t *chunk; - size_t i, npages; -#ifdef MALLOC_DEBUG - size_t ndirty = 0; - - rb_foreach_begin(arena_chunk_t, link_dirty, &arena->chunks_dirty, - chunk) { - ndirty += chunk->ndirty; - } rb_foreach_end(arena_chunk_t, link_dirty, &arena->chunks_dirty, chunk) - assert(ndirty == arena->ndirty); -#endif - assert(arena->ndirty > opt_dirty_max); - -#ifdef MALLOC_STATS - arena->stats.npurge++; -#endif - - /* - * Iterate downward through chunks until enough dirty memory has been - * purged. Terminate as soon as possible in order to minimize the - * number of system calls, even if a chunk has only been partially - * purged. - */ - while (arena->ndirty > (opt_dirty_max >> 1)) { - chunk = arena_chunk_tree_dirty_last(&arena->chunks_dirty); - assert(chunk != NULL); - - for (i = chunk_npages - 1; chunk->ndirty > 0; i--) { - assert(i >= arena_chunk_header_npages); - - if (chunk->map[i].bits & CHUNK_MAP_DIRTY) { - chunk->map[i].bits ^= CHUNK_MAP_DIRTY; - /* Find adjacent dirty run(s). 
*/ - for (npages = 1; i > arena_chunk_header_npages - && (chunk->map[i - 1].bits & - CHUNK_MAP_DIRTY); npages++) { - i--; - chunk->map[i].bits ^= CHUNK_MAP_DIRTY; - } - chunk->ndirty -= npages; - arena->ndirty -= npages; - - madvise((void *)((uintptr_t)chunk + (i << - pagesize_2pow)), (npages << pagesize_2pow), - MADV_FREE); -#ifdef MALLOC_STATS - arena->stats.nmadvise++; - arena->stats.purged += npages; -#endif - if (arena->ndirty <= (opt_dirty_max >> 1)) - break; - } - } - - if (chunk->ndirty == 0) { - arena_chunk_tree_dirty_remove(&arena->chunks_dirty, - chunk); - } - } -} - -static void -arena_run_dalloc(arena_t *arena, arena_run_t *run, bool dirty) -{ - arena_chunk_t *chunk; - size_t size, run_ind, run_pages; - - chunk = (arena_chunk_t *)CHUNK_ADDR2BASE(run); - run_ind = (size_t)(((uintptr_t)run - (uintptr_t)chunk) - >> pagesize_2pow); - assert(run_ind >= arena_chunk_header_npages); - assert(run_ind < chunk_npages); - if ((chunk->map[run_ind].bits & CHUNK_MAP_LARGE) != 0) - size = chunk->map[run_ind].bits & ~pagesize_mask; - else - size = run->bin->run_size; - run_pages = (size >> pagesize_2pow); - - /* Mark pages as unallocated in the chunk map. */ - if (dirty) { - size_t i; - - for (i = 0; i < run_pages; i++) { - assert((chunk->map[run_ind + i].bits & CHUNK_MAP_DIRTY) - == 0); - chunk->map[run_ind + i].bits = CHUNK_MAP_DIRTY; - } - - if (chunk->ndirty == 0) { - arena_chunk_tree_dirty_insert(&arena->chunks_dirty, - chunk); - } - chunk->ndirty += run_pages; - arena->ndirty += run_pages; - } else { - size_t i; - - for (i = 0; i < run_pages; i++) { - chunk->map[run_ind + i].bits &= ~(CHUNK_MAP_LARGE | - CHUNK_MAP_ALLOCATED); - } - } - chunk->map[run_ind].bits = size | (chunk->map[run_ind].bits & - pagesize_mask); - chunk->map[run_ind+run_pages-1].bits = size | - (chunk->map[run_ind+run_pages-1].bits & pagesize_mask); - - /* Try to coalesce forward. 
*/ - if (run_ind + run_pages < chunk_npages && - (chunk->map[run_ind+run_pages].bits & CHUNK_MAP_ALLOCATED) == 0) { - size_t nrun_size = chunk->map[run_ind+run_pages].bits & - ~pagesize_mask; - - /* - * Remove successor from runs_avail; the coalesced run is - * inserted later. - */ - arena_avail_tree_remove(&arena->runs_avail, - &chunk->map[run_ind+run_pages]); - - size += nrun_size; - run_pages = size >> pagesize_2pow; - - assert((chunk->map[run_ind+run_pages-1].bits & ~pagesize_mask) - == nrun_size); - chunk->map[run_ind].bits = size | (chunk->map[run_ind].bits & - pagesize_mask); - chunk->map[run_ind+run_pages-1].bits = size | - (chunk->map[run_ind+run_pages-1].bits & pagesize_mask); - } - - /* Try to coalesce backward. */ - if (run_ind > arena_chunk_header_npages && (chunk->map[run_ind-1].bits & - CHUNK_MAP_ALLOCATED) == 0) { - size_t prun_size = chunk->map[run_ind-1].bits & ~pagesize_mask; - - run_ind -= prun_size >> pagesize_2pow; - - /* - * Remove predecessor from runs_avail; the coalesced run is - * inserted later. - */ - arena_avail_tree_remove(&arena->runs_avail, - &chunk->map[run_ind]); - - size += prun_size; - run_pages = size >> pagesize_2pow; - - assert((chunk->map[run_ind].bits & ~pagesize_mask) == - prun_size); - chunk->map[run_ind].bits = size | (chunk->map[run_ind].bits & - pagesize_mask); - chunk->map[run_ind+run_pages-1].bits = size | - (chunk->map[run_ind+run_pages-1].bits & pagesize_mask); - } - - /* Insert into runs_avail, now that coalescing is complete. */ - arena_avail_tree_insert(&arena->runs_avail, &chunk->map[run_ind]); - - /* Deallocate chunk if it is now completely unused. */ - if ((chunk->map[arena_chunk_header_npages].bits & (~pagesize_mask | - CHUNK_MAP_ALLOCATED)) == arena_maxclass) - arena_chunk_dealloc(arena, chunk); - - /* Enforce opt_dirty_max. 
*/ - if (arena->ndirty > opt_dirty_max) - arena_purge(arena); -} - -static void -arena_run_trim_head(arena_t *arena, arena_chunk_t *chunk, arena_run_t *run, - size_t oldsize, size_t newsize) -{ - size_t pageind = ((uintptr_t)run - (uintptr_t)chunk) >> pagesize_2pow; - size_t head_npages = (oldsize - newsize) >> pagesize_2pow; - - assert(oldsize > newsize); - - /* - * Update the chunk map so that arena_run_dalloc() can treat the - * leading run as separately allocated. - */ - chunk->map[pageind].bits = (oldsize - newsize) | CHUNK_MAP_LARGE | - CHUNK_MAP_ALLOCATED; - chunk->map[pageind+head_npages].bits = newsize | CHUNK_MAP_LARGE | - CHUNK_MAP_ALLOCATED; - - arena_run_dalloc(arena, run, false); -} - -static void -arena_run_trim_tail(arena_t *arena, arena_chunk_t *chunk, arena_run_t *run, - size_t oldsize, size_t newsize, bool dirty) -{ - size_t pageind = ((uintptr_t)run - (uintptr_t)chunk) >> pagesize_2pow; - size_t npages = newsize >> pagesize_2pow; - - assert(oldsize > newsize); - - /* - * Update the chunk map so that arena_run_dalloc() can treat the - * trailing run as separately allocated. - */ - chunk->map[pageind].bits = newsize | CHUNK_MAP_LARGE | - CHUNK_MAP_ALLOCATED; - chunk->map[pageind+npages].bits = (oldsize - newsize) | CHUNK_MAP_LARGE - | CHUNK_MAP_ALLOCATED; - - arena_run_dalloc(arena, (arena_run_t *)((uintptr_t)run + newsize), - dirty); -} - -static arena_run_t * -arena_bin_nonfull_run_get(arena_t *arena, arena_bin_t *bin) -{ - arena_chunk_map_t *mapelm; - arena_run_t *run; - unsigned i, remainder; - - /* Look for a usable run. */ - mapelm = arena_run_tree_first(&bin->runs); - if (mapelm != NULL) { - /* run is guaranteed to have available space. */ - arena_run_tree_remove(&bin->runs, mapelm); - run = (arena_run_t *)(mapelm->bits & ~pagesize_mask); -#ifdef MALLOC_STATS - bin->stats.reruns++; -#endif - return (run); - } - /* No existing runs have any space available. */ - - /* Allocate a new run. 
*/ - run = arena_run_alloc(arena, bin->run_size, false, false); - if (run == NULL) - return (NULL); - - /* Initialize run internals. */ - run->bin = bin; - - for (i = 0; i < bin->regs_mask_nelms - 1; i++) - run->regs_mask[i] = UINT_MAX; - remainder = bin->nregs & ((1U << (SIZEOF_INT_2POW + 3)) - 1); - if (remainder == 0) - run->regs_mask[i] = UINT_MAX; - else { - /* The last element has spare bits that need to be unset. */ - run->regs_mask[i] = (UINT_MAX >> ((1U << (SIZEOF_INT_2POW + 3)) - - remainder)); - } - - run->regs_minelm = 0; - - run->nfree = bin->nregs; -#ifdef MALLOC_DEBUG - run->magic = ARENA_RUN_MAGIC; -#endif - -#ifdef MALLOC_STATS - bin->stats.nruns++; - bin->stats.curruns++; - if (bin->stats.curruns > bin->stats.highruns) - bin->stats.highruns = bin->stats.curruns; -#endif - return (run); -} - -/* bin->runcur must have space available before this function is called. */ -static inline void * -arena_bin_malloc_easy(arena_t *arena, arena_bin_t *bin, arena_run_t *run) -{ - void *ret; - - assert(run->magic == ARENA_RUN_MAGIC); - assert(run->nfree > 0); - - ret = arena_run_reg_alloc(run, bin); - assert(ret != NULL); - run->nfree--; - - return (ret); -} - -/* Re-fill bin->runcur, then call arena_bin_malloc_easy(). */ -static void * -arena_bin_malloc_hard(arena_t *arena, arena_bin_t *bin) -{ - - bin->runcur = arena_bin_nonfull_run_get(arena, bin); - if (bin->runcur == NULL) - return (NULL); - assert(bin->runcur->magic == ARENA_RUN_MAGIC); - assert(bin->runcur->nfree > 0); - - return (arena_bin_malloc_easy(arena, bin, bin->runcur)); -} - -/* - * Calculate bin->run_size such that it meets the following constraints: - * - * *) bin->run_size >= min_run_size - * *) bin->run_size <= arena_maxclass - * *) bin->run_size <= RUN_MAX_SMALL - * *) run header overhead <= RUN_MAX_OVRHD (or header overhead relaxed). - * - * bin->nregs, bin->regs_mask_nelms, and bin->reg0_offset are - * also calculated here, since these settings are all interdependent. 
- */ -static size_t -arena_bin_run_size_calc(arena_bin_t *bin, size_t min_run_size) -{ - size_t try_run_size, good_run_size; - unsigned good_nregs, good_mask_nelms, good_reg0_offset; - unsigned try_nregs, try_mask_nelms, try_reg0_offset; - - assert(min_run_size >= pagesize); - assert(min_run_size <= arena_maxclass); - assert(min_run_size <= RUN_MAX_SMALL); - - /* - * Calculate known-valid settings before entering the run_size - * expansion loop, so that the first part of the loop always copies - * valid settings. - * - * The do..while loop iteratively reduces the number of regions until - * the run header and the regions no longer overlap. A closed formula - * would be quite messy, since there is an interdependency between the - * header's mask length and the number of regions. - */ - try_run_size = min_run_size; - try_nregs = ((try_run_size - sizeof(arena_run_t)) / bin->reg_size) - + 1; /* Counter-act try_nregs-- in loop. */ - do { - try_nregs--; - try_mask_nelms = (try_nregs >> (SIZEOF_INT_2POW + 3)) + - ((try_nregs & ((1U << (SIZEOF_INT_2POW + 3)) - 1)) ? 1 : 0); - try_reg0_offset = try_run_size - (try_nregs * bin->reg_size); - } while (sizeof(arena_run_t) + (sizeof(unsigned) * (try_mask_nelms - 1)) - > try_reg0_offset); - - /* run_size expansion loop. */ - do { - /* - * Copy valid settings before trying more aggressive settings. - */ - good_run_size = try_run_size; - good_nregs = try_nregs; - good_mask_nelms = try_mask_nelms; - good_reg0_offset = try_reg0_offset; - - /* Try more aggressive settings. */ - try_run_size += pagesize; - try_nregs = ((try_run_size - sizeof(arena_run_t)) / - bin->reg_size) + 1; /* Counter-act try_nregs-- in loop. */ - do { - try_nregs--; - try_mask_nelms = (try_nregs >> (SIZEOF_INT_2POW + 3)) + - ((try_nregs & ((1U << (SIZEOF_INT_2POW + 3)) - 1)) ? 
- 1 : 0); - try_reg0_offset = try_run_size - (try_nregs * - bin->reg_size); - } while (sizeof(arena_run_t) + (sizeof(unsigned) * - (try_mask_nelms - 1)) > try_reg0_offset); - } while (try_run_size <= arena_maxclass && try_run_size <= RUN_MAX_SMALL - && RUN_MAX_OVRHD * (bin->reg_size << 3) > RUN_MAX_OVRHD_RELAX - && (try_reg0_offset << RUN_BFP) > RUN_MAX_OVRHD * try_run_size); - - assert(sizeof(arena_run_t) + (sizeof(unsigned) * (good_mask_nelms - 1)) - <= good_reg0_offset); - assert((good_mask_nelms << (SIZEOF_INT_2POW + 3)) >= good_nregs); - - /* Copy final settings. */ - bin->run_size = good_run_size; - bin->nregs = good_nregs; - bin->regs_mask_nelms = good_mask_nelms; - bin->reg0_offset = good_reg0_offset; - - return (good_run_size); -} - -#ifdef MALLOC_BALANCE -static inline void -arena_lock_balance(arena_t *arena) -{ - unsigned contention; - - contention = malloc_spin_lock(&arena->lock); - if (narenas > 1) { - /* - * Calculate the exponentially averaged contention for this - * arena. Due to integer math always rounding down, this value - * decays somewhat faster than normal. 
- */ - arena->contention = (((uint64_t)arena->contention - * (uint64_t)((1U << BALANCE_ALPHA_INV_2POW)-1)) - + (uint64_t)contention) >> BALANCE_ALPHA_INV_2POW; - if (arena->contention >= opt_balance_threshold) - arena_lock_balance_hard(arena); - } -} - -static void -arena_lock_balance_hard(arena_t *arena) -{ - uint32_t ind; - - arena->contention = 0; -#ifdef MALLOC_STATS - arena->stats.nbalance++; -#endif - ind = PRN(balance, narenas_2pow); - if (arenas[ind] != NULL) - arenas_map = arenas[ind]; - else { - malloc_spin_lock(&arenas_lock); - if (arenas[ind] != NULL) - arenas_map = arenas[ind]; - else - arenas_map = arenas_extend(ind); - malloc_spin_unlock(&arenas_lock); - } -} -#endif - -#ifdef MALLOC_MAG -static inline void * -mag_alloc(mag_t *mag) -{ - - if (mag->nrounds == 0) - return (NULL); - mag->nrounds--; - - return (mag->rounds[mag->nrounds]); -} - -static void -mag_load(mag_t *mag) -{ - arena_t *arena; - arena_bin_t *bin; - arena_run_t *run; - void *round; - size_t i; - - arena = choose_arena(); - bin = &arena->bins[mag->binind]; -#ifdef MALLOC_BALANCE - arena_lock_balance(arena); -#else - malloc_spin_lock(&arena->lock); -#endif - for (i = mag->nrounds; i < max_rounds; i++) { - if ((run = bin->runcur) != NULL && run->nfree > 0) - round = arena_bin_malloc_easy(arena, bin, run); - else - round = arena_bin_malloc_hard(arena, bin); - if (round == NULL) - break; - mag->rounds[i] = round; - } -#ifdef MALLOC_STATS - bin->stats.nmags++; - arena->stats.nmalloc_small += (i - mag->nrounds); - arena->stats.allocated_small += (i - mag->nrounds) * bin->reg_size; -#endif - malloc_spin_unlock(&arena->lock); - mag->nrounds = i; -} - -static inline void * -mag_rack_alloc(mag_rack_t *rack, size_t size, bool zero) -{ - void *ret; - bin_mags_t *bin_mags; - mag_t *mag; - size_t binind; - - binind = size2bin[size]; - assert(binind < nbins); - bin_mags = &rack->bin_mags[binind]; - - mag = bin_mags->curmag; - if (mag == NULL) { - /* Create an initial magazine for this size class. 
*/ - assert(bin_mags->sparemag == NULL); - mag = mag_create(choose_arena(), binind); - if (mag == NULL) - return (NULL); - bin_mags->curmag = mag; - mag_load(mag); - } - - ret = mag_alloc(mag); - if (ret == NULL) { - if (bin_mags->sparemag != NULL) { - if (bin_mags->sparemag->nrounds > 0) { - /* Swap magazines. */ - bin_mags->curmag = bin_mags->sparemag; - bin_mags->sparemag = mag; - mag = bin_mags->curmag; - } else { - /* Reload the current magazine. */ - mag_load(mag); - } - } else { - /* Create a second magazine. */ - mag = mag_create(choose_arena(), binind); - if (mag == NULL) - return (NULL); - mag_load(mag); - bin_mags->sparemag = bin_mags->curmag; - bin_mags->curmag = mag; - } - ret = mag_alloc(mag); - if (ret == NULL) - return (NULL); - } - - if (zero == false) { - if (opt_junk) - memset(ret, 0xa5, size); - else if (opt_zero) - memset(ret, 0, size); - } else - memset(ret, 0, size); - - return (ret); -} -#endif - -static inline void * -arena_malloc_small(arena_t *arena, size_t size, bool zero) -{ - void *ret; - arena_bin_t *bin; - arena_run_t *run; - size_t binind; - - binind = size2bin[size]; - assert(binind < nbins); - bin = &arena->bins[binind]; - size = bin->reg_size; - -#ifdef MALLOC_BALANCE - arena_lock_balance(arena); -#else - malloc_spin_lock(&arena->lock); -#endif - if ((run = bin->runcur) != NULL && run->nfree > 0) - ret = arena_bin_malloc_easy(arena, bin, run); - else - ret = arena_bin_malloc_hard(arena, bin); - - if (ret == NULL) { - malloc_spin_unlock(&arena->lock); - return (NULL); - } - -#ifdef MALLOC_STATS - bin->stats.nrequests++; - arena->stats.nmalloc_small++; - arena->stats.allocated_small += size; -#endif - malloc_spin_unlock(&arena->lock); - - if (zero == false) { - if (opt_junk) - memset(ret, 0xa5, size); - else if (opt_zero) - memset(ret, 0, size); - } else - memset(ret, 0, size); - - return (ret); -} - -static void * -arena_malloc_large(arena_t *arena, size_t size, bool zero) -{ - void *ret; - - /* Large allocation. 
*/ - size = PAGE_CEILING(size); -#ifdef MALLOC_BALANCE - arena_lock_balance(arena); -#else - malloc_spin_lock(&arena->lock); -#endif - ret = (void *)arena_run_alloc(arena, size, true, zero); - if (ret == NULL) { - malloc_spin_unlock(&arena->lock); - return (NULL); - } -#ifdef MALLOC_STATS - arena->stats.nmalloc_large++; - arena->stats.allocated_large += size; -#endif - malloc_spin_unlock(&arena->lock); - - if (zero == false) { - if (opt_junk) - memset(ret, 0xa5, size); - else if (opt_zero) - memset(ret, 0, size); - } - - return (ret); -} - -static inline void * -arena_malloc(arena_t *arena, size_t size, bool zero) -{ - - assert(arena != NULL); - assert(arena->magic == ARENA_MAGIC); - assert(size != 0); - assert(QUANTUM_CEILING(size) <= arena_maxclass); - - if (size <= bin_maxclass) { -#ifdef MALLOC_MAG - if (__isthreaded && opt_mag) { - mag_rack_t *rack = mag_rack; - if (rack == NULL) { - rack = mag_rack_create(arena); - if (rack == NULL) - return (NULL); - mag_rack = rack; - } - return (mag_rack_alloc(rack, size, zero)); - } else -#endif - return (arena_malloc_small(arena, size, zero)); - } else - return (arena_malloc_large(arena, size, zero)); -} - -static inline void * -imalloc(size_t size) -{ - - assert(size != 0); - - if (size <= arena_maxclass) - return (arena_malloc(choose_arena(), size, false)); - else - return (huge_malloc(size, false)); -} - -static inline void * -icalloc(size_t size) -{ - - if (size <= arena_maxclass) - return (arena_malloc(choose_arena(), size, true)); - else - return (huge_malloc(size, true)); -} - -/* Only handles large allocations that require more than page alignment. 
*/ -static void * -arena_palloc(arena_t *arena, size_t alignment, size_t size, size_t alloc_size) -{ - void *ret; - size_t offset; - arena_chunk_t *chunk; - - assert((size & pagesize_mask) == 0); - assert((alignment & pagesize_mask) == 0); - -#ifdef MALLOC_BALANCE - arena_lock_balance(arena); -#else - malloc_spin_lock(&arena->lock); -#endif - ret = (void *)arena_run_alloc(arena, alloc_size, true, false); - if (ret == NULL) { - malloc_spin_unlock(&arena->lock); - return (NULL); - } - - chunk = (arena_chunk_t *)CHUNK_ADDR2BASE(ret); - - offset = (uintptr_t)ret & (alignment - 1); - assert((offset & pagesize_mask) == 0); - assert(offset < alloc_size); - if (offset == 0) - arena_run_trim_tail(arena, chunk, ret, alloc_size, size, false); - else { - size_t leadsize, trailsize; - - leadsize = alignment - offset; - if (leadsize > 0) { - arena_run_trim_head(arena, chunk, ret, alloc_size, - alloc_size - leadsize); - ret = (void *)((uintptr_t)ret + leadsize); - } - - trailsize = alloc_size - leadsize - size; - if (trailsize != 0) { - /* Trim trailing space. */ - assert(trailsize < alloc_size); - arena_run_trim_tail(arena, chunk, ret, size + trailsize, - size, false); - } - } - -#ifdef MALLOC_STATS - arena->stats.nmalloc_large++; - arena->stats.allocated_large += size; -#endif - malloc_spin_unlock(&arena->lock); - - if (opt_junk) - memset(ret, 0xa5, size); - else if (opt_zero) - memset(ret, 0, size); - return (ret); -} - -static inline void * -ipalloc(size_t alignment, size_t size) -{ - void *ret; - size_t ceil_size; - - /* - * Round size up to the nearest multiple of alignment. - * - * This done, we can take advantage of the fact that for each small - * size class, every object is aligned at the smallest power of two - * that is non-zero in the base two representation of the size. 
For - * example: - * - * Size | Base 2 | Minimum alignment - * -----+----------+------------------ - * 96 | 1100000 | 32 - * 144 | 10100000 | 32 - * 192 | 11000000 | 64 - * - * Depending on runtime settings, it is possible that arena_malloc() - * will further round up to a power of two, but that never causes - * correctness issues. - */ - ceil_size = (size + (alignment - 1)) & (-alignment); - /* - * (ceil_size < size) protects against the combination of maximal - * alignment and size greater than maximal alignment. - */ - if (ceil_size < size) { - /* size_t overflow. */ - return (NULL); - } - - if (ceil_size <= pagesize || (alignment <= pagesize - && ceil_size <= arena_maxclass)) - ret = arena_malloc(choose_arena(), ceil_size, false); - else { - size_t run_size; - - /* - * We can't achieve subpage alignment, so round up alignment - * permanently; it makes later calculations simpler. - */ - alignment = PAGE_CEILING(alignment); - ceil_size = PAGE_CEILING(size); - /* - * (ceil_size < size) protects against very large sizes within - * pagesize of SIZE_T_MAX. - * - * (ceil_size + alignment < ceil_size) protects against the - * combination of maximal alignment and ceil_size large enough - * to cause overflow. This is similar to the first overflow - * check above, but it needs to be repeated due to the new - * ceil_size value, which may now be *equal* to maximal - * alignment, whereas before we only detected overflow if the - * original size was *greater* than maximal alignment. - */ - if (ceil_size < size || ceil_size + alignment < ceil_size) { - /* size_t overflow. */ - return (NULL); - } - - /* - * Calculate the size of the over-size run that arena_palloc() - * would need to allocate in order to guarantee the alignment. 
- */ - if (ceil_size >= alignment) - run_size = ceil_size + alignment - pagesize; - else { - /* - * It is possible that (alignment << 1) will cause - * overflow, but it doesn't matter because we also - * subtract pagesize, which in the case of overflow - * leaves us with a very large run_size. That causes - * the first conditional below to fail, which means - * that the bogus run_size value never gets used for - * anything important. - */ - run_size = (alignment << 1) - pagesize; - } - - if (run_size <= arena_maxclass) { - ret = arena_palloc(choose_arena(), alignment, ceil_size, - run_size); - } else if (alignment <= chunksize) - ret = huge_malloc(ceil_size, false); - else - ret = huge_palloc(alignment, ceil_size); - } - - assert(((uintptr_t)ret & (alignment - 1)) == 0); - return (ret); -} - -/* Return the size of the allocation pointed to by ptr. */ -static size_t -arena_salloc(const void *ptr) -{ - size_t ret; - arena_chunk_t *chunk; - size_t pageind, mapbits; - - assert(ptr != NULL); - assert(CHUNK_ADDR2BASE(ptr) != ptr); - - chunk = (arena_chunk_t *)CHUNK_ADDR2BASE(ptr); - pageind = (((uintptr_t)ptr - (uintptr_t)chunk) >> pagesize_2pow); - mapbits = chunk->map[pageind].bits; - assert((mapbits & CHUNK_MAP_ALLOCATED) != 0); - if ((mapbits & CHUNK_MAP_LARGE) == 0) { - arena_run_t *run = (arena_run_t *)(mapbits & ~pagesize_mask); - assert(run->magic == ARENA_RUN_MAGIC); - ret = run->bin->reg_size; - } else { - ret = mapbits & ~pagesize_mask; - assert(ret != 0); - } - - return (ret); -} - -static inline size_t -isalloc(const void *ptr) -{ - size_t ret; - arena_chunk_t *chunk; - - assert(ptr != NULL); - - chunk = (arena_chunk_t *)CHUNK_ADDR2BASE(ptr); - if (chunk != ptr) { - /* Region. */ - assert(chunk->arena->magic == ARENA_MAGIC); - - ret = arena_salloc(ptr); - } else { - extent_node_t *node, key; - - /* Chunk (huge allocation). */ - - malloc_mutex_lock(&huge_mtx); - - /* Extract from tree of huge allocations. 
*/ - key.addr = __DECONST(void *, ptr); - node = extent_tree_ad_search(&huge, &key); - assert(node != NULL); - - ret = node->size; - - malloc_mutex_unlock(&huge_mtx); - } - - return (ret); -} - -static inline void -arena_dalloc_small(arena_t *arena, arena_chunk_t *chunk, void *ptr, - arena_chunk_map_t *mapelm) -{ - arena_run_t *run; - arena_bin_t *bin; - size_t size; - - run = (arena_run_t *)(mapelm->bits & ~pagesize_mask); - assert(run->magic == ARENA_RUN_MAGIC); - bin = run->bin; - size = bin->reg_size; - - if (opt_junk) - memset(ptr, 0x5a, size); - - arena_run_reg_dalloc(run, bin, ptr, size); - run->nfree++; - - if (run->nfree == bin->nregs) { - /* Deallocate run. */ - if (run == bin->runcur) - bin->runcur = NULL; - else if (bin->nregs != 1) { - size_t run_pageind = (((uintptr_t)run - - (uintptr_t)chunk)) >> pagesize_2pow; - arena_chunk_map_t *run_mapelm = - &chunk->map[run_pageind]; - /* - * This block's conditional is necessary because if the - * run only contains one region, then it never gets - * inserted into the non-full runs tree. - */ - arena_run_tree_remove(&bin->runs, run_mapelm); - } -#ifdef MALLOC_DEBUG - run->magic = 0; -#endif - arena_run_dalloc(arena, run, true); -#ifdef MALLOC_STATS - bin->stats.curruns--; -#endif - } else if (run->nfree == 1 && run != bin->runcur) { - /* - * Make sure that bin->runcur always refers to the lowest - * non-full run, if one exists. - */ - if (bin->runcur == NULL) - bin->runcur = run; - else if ((uintptr_t)run < (uintptr_t)bin->runcur) { - /* Switch runcur. */ - if (bin->runcur->nfree > 0) { - arena_chunk_t *runcur_chunk = - CHUNK_ADDR2BASE(bin->runcur); - size_t runcur_pageind = - (((uintptr_t)bin->runcur - - (uintptr_t)runcur_chunk)) >> pagesize_2pow; - arena_chunk_map_t *runcur_mapelm = - &runcur_chunk->map[runcur_pageind]; - - /* Insert runcur. 
*/ - arena_run_tree_insert(&bin->runs, - runcur_mapelm); - } - bin->runcur = run; - } else { - size_t run_pageind = (((uintptr_t)run - - (uintptr_t)chunk)) >> pagesize_2pow; - arena_chunk_map_t *run_mapelm = - &chunk->map[run_pageind]; - - assert(arena_run_tree_search(&bin->runs, run_mapelm) == - NULL); - arena_run_tree_insert(&bin->runs, run_mapelm); - } - } -#ifdef MALLOC_STATS - arena->stats.allocated_small -= size; - arena->stats.ndalloc_small++; -#endif -} - -#ifdef MALLOC_MAG -static void -mag_unload(mag_t *mag) -{ - arena_chunk_t *chunk; - arena_t *arena; - void *round; - size_t i, ndeferred, nrounds; - - for (ndeferred = mag->nrounds; ndeferred > 0;) { - nrounds = ndeferred; - /* Lock the arena associated with the first round. */ - chunk = (arena_chunk_t *)CHUNK_ADDR2BASE(mag->rounds[0]); - arena = chunk->arena; -#ifdef MALLOC_BALANCE - arena_lock_balance(arena); -#else - malloc_spin_lock(&arena->lock); -#endif - /* Deallocate every round that belongs to the locked arena. */ - for (i = ndeferred = 0; i < nrounds; i++) { - round = mag->rounds[i]; - chunk = (arena_chunk_t *)CHUNK_ADDR2BASE(round); - if (chunk->arena == arena) { - size_t pageind = (((uintptr_t)round - - (uintptr_t)chunk) >> pagesize_2pow); - arena_chunk_map_t *mapelm = - &chunk->map[pageind]; - arena_dalloc_small(arena, chunk, round, mapelm); - } else { - /* - * This round was allocated via a different - * arena than the one that is currently locked. - * Stash the round, so that it can be handled - * in a future pass. 
- */ - mag->rounds[ndeferred] = round; - ndeferred++; - } - } - malloc_spin_unlock(&arena->lock); - } - - mag->nrounds = 0; -} - -static inline void -mag_rack_dalloc(mag_rack_t *rack, void *ptr) -{ - arena_t *arena; - arena_chunk_t *chunk; - arena_run_t *run; - arena_bin_t *bin; - bin_mags_t *bin_mags; - mag_t *mag; - size_t pageind, binind; - arena_chunk_map_t *mapelm; - - chunk = (arena_chunk_t *)CHUNK_ADDR2BASE(ptr); - arena = chunk->arena; - pageind = (((uintptr_t)ptr - (uintptr_t)chunk) >> pagesize_2pow); - mapelm = &chunk->map[pageind]; - run = (arena_run_t *)(mapelm->bits & ~pagesize_mask); - assert(run->magic == ARENA_RUN_MAGIC); - bin = run->bin; - binind = ((uintptr_t)bin - (uintptr_t)&arena->bins) / - sizeof(arena_bin_t); - assert(binind < nbins); - - if (opt_junk) - memset(ptr, 0x5a, arena->bins[binind].reg_size); - - bin_mags = &rack->bin_mags[binind]; - mag = bin_mags->curmag; - if (mag == NULL) { - /* Create an initial magazine for this size class. */ - assert(bin_mags->sparemag == NULL); - mag = mag_create(choose_arena(), binind); - if (mag == NULL) { - malloc_spin_lock(&arena->lock); - arena_dalloc_small(arena, chunk, ptr, mapelm); - malloc_spin_unlock(&arena->lock); - return; - } - bin_mags->curmag = mag; - } - - if (mag->nrounds == max_rounds) { - if (bin_mags->sparemag != NULL) { - if (bin_mags->sparemag->nrounds < max_rounds) { - /* Swap magazines. */ - bin_mags->curmag = bin_mags->sparemag; - bin_mags->sparemag = mag; - mag = bin_mags->curmag; - } else { - /* Unload the current magazine. */ - mag_unload(mag); - } - } else { - /* Create a second magazine. 
*/ - mag = mag_create(choose_arena(), binind); - if (mag == NULL) { - mag = rack->bin_mags[binind].curmag; - mag_unload(mag); - } else { - bin_mags->sparemag = bin_mags->curmag; - bin_mags->curmag = mag; - } - } - assert(mag->nrounds < max_rounds); - } - mag->rounds[mag->nrounds] = ptr; - mag->nrounds++; -} -#endif - -static void -arena_dalloc_large(arena_t *arena, arena_chunk_t *chunk, void *ptr) -{ - /* Large allocation. */ - malloc_spin_lock(&arena->lock); - -#ifndef MALLOC_STATS - if (opt_junk) -#endif - { - size_t pageind = ((uintptr_t)ptr - (uintptr_t)chunk) >> - pagesize_2pow; - size_t size = chunk->map[pageind].bits & ~pagesize_mask; - -#ifdef MALLOC_STATS - if (opt_junk) -#endif - memset(ptr, 0x5a, size); -#ifdef MALLOC_STATS - arena->stats.allocated_large -= size; -#endif - } -#ifdef MALLOC_STATS - arena->stats.ndalloc_large++; -#endif - - arena_run_dalloc(arena, (arena_run_t *)ptr, true); - malloc_spin_unlock(&arena->lock); -} - -static inline void -arena_dalloc(arena_t *arena, arena_chunk_t *chunk, void *ptr) -{ - size_t pageind; - arena_chunk_map_t *mapelm; - - assert(arena != NULL); - assert(arena->magic == ARENA_MAGIC); - assert(chunk->arena == arena); - assert(ptr != NULL); - assert(CHUNK_ADDR2BASE(ptr) != ptr); - - pageind = (((uintptr_t)ptr - (uintptr_t)chunk) >> pagesize_2pow); - mapelm = &chunk->map[pageind]; - assert((mapelm->bits & CHUNK_MAP_ALLOCATED) != 0); - if ((mapelm->bits & CHUNK_MAP_LARGE) == 0) { - /* Small allocation. 
*/
-#ifdef MALLOC_MAG
-	if (__isthreaded && opt_mag) {
-		mag_rack_t *rack = mag_rack;
-		if (rack == NULL) {
-			rack = mag_rack_create(arena);
-			if (rack == NULL) {
-				/* Fall back to direct deallocation. */
-				malloc_spin_lock(&arena->lock);
-				arena_dalloc_small(arena, chunk, ptr,
-				    mapelm);
-				malloc_spin_unlock(&arena->lock);
-				return;
-			}
-			mag_rack = rack;
-		}
-		mag_rack_dalloc(rack, ptr);
-	} else {
-#endif
-		malloc_spin_lock(&arena->lock);
-		arena_dalloc_small(arena, chunk, ptr, mapelm);
-		malloc_spin_unlock(&arena->lock);
-#ifdef MALLOC_MAG
-	}
-#endif
-	} else
-		arena_dalloc_large(arena, chunk, ptr);
-}
-
-static inline void
-idalloc(void *ptr)
-{
-	arena_chunk_t *chunk;
-
-	assert(ptr != NULL);
-
-	chunk = (arena_chunk_t *)CHUNK_ADDR2BASE(ptr);
-	if (chunk != ptr)
-		arena_dalloc(chunk->arena, chunk, ptr);
-	else
-		huge_dalloc(ptr);
-}
-
-static void
-arena_ralloc_large_shrink(arena_t *arena, arena_chunk_t *chunk, void *ptr,
-    size_t size, size_t oldsize)
-{
-
-	assert(size < oldsize);
-
-	/*
-	 * Shrink the run, and make trailing pages available for other
-	 * allocations.
-	 */
-#ifdef MALLOC_BALANCE
-	arena_lock_balance(arena);
-#else
-	malloc_spin_lock(&arena->lock);
-#endif
-	arena_run_trim_tail(arena, chunk, (arena_run_t *)ptr, oldsize, size,
-	    true);
-#ifdef MALLOC_STATS
-	arena->stats.allocated_large -= oldsize - size;
-#endif
-	malloc_spin_unlock(&arena->lock);
-}
-
-static bool
-arena_ralloc_large_grow(arena_t *arena, arena_chunk_t *chunk, void *ptr,
-    size_t size, size_t oldsize)
-{
-	size_t pageind = ((uintptr_t)ptr - (uintptr_t)chunk) >> pagesize_2pow;
-	size_t npages = oldsize >> pagesize_2pow;
-
-	assert(oldsize == (chunk->map[pageind].bits & ~pagesize_mask));
-
-	/* Try to extend the run. 
*/ - assert(size > oldsize); -#ifdef MALLOC_BALANCE - arena_lock_balance(arena); -#else - malloc_spin_lock(&arena->lock); -#endif - if (pageind + npages < chunk_npages && (chunk->map[pageind+npages].bits - & CHUNK_MAP_ALLOCATED) == 0 && (chunk->map[pageind+npages].bits & - ~pagesize_mask) >= size - oldsize) { - /* - * The next run is available and sufficiently large. Split the - * following run, then merge the first part with the existing - * allocation. - */ - arena_run_split(arena, (arena_run_t *)((uintptr_t)chunk + - ((pageind+npages) << pagesize_2pow)), size - oldsize, true, - false); - - chunk->map[pageind].bits = size | CHUNK_MAP_LARGE | - CHUNK_MAP_ALLOCATED; - chunk->map[pageind+npages].bits = CHUNK_MAP_LARGE | - CHUNK_MAP_ALLOCATED; - -#ifdef MALLOC_STATS - arena->stats.allocated_large += size - oldsize; -#endif - malloc_spin_unlock(&arena->lock); - return (false); - } - malloc_spin_unlock(&arena->lock); - - return (true); -} - -/* - * Try to resize a large allocation, in order to avoid copying. This will - * always fail if growing an object, and the following run is already in use. - */ -static bool -arena_ralloc_large(void *ptr, size_t size, size_t oldsize) -{ - size_t psize; - - psize = PAGE_CEILING(size); - if (psize == oldsize) { - /* Same size class. */ - if (opt_junk && size < oldsize) { - memset((void *)((uintptr_t)ptr + size), 0x5a, oldsize - - size); - } - return (false); - } else { - arena_chunk_t *chunk; - arena_t *arena; - - chunk = (arena_chunk_t *)CHUNK_ADDR2BASE(ptr); - arena = chunk->arena; - assert(arena->magic == ARENA_MAGIC); - - if (psize < oldsize) { - /* Fill before shrinking in order avoid a race. 
*/ - if (opt_junk) { - memset((void *)((uintptr_t)ptr + size), 0x5a, - oldsize - size); - } - arena_ralloc_large_shrink(arena, chunk, ptr, psize, - oldsize); - return (false); - } else { - bool ret = arena_ralloc_large_grow(arena, chunk, ptr, - psize, oldsize); - if (ret == false && opt_zero) { - memset((void *)((uintptr_t)ptr + oldsize), 0, - size - oldsize); - } - return (ret); - } - } -} - -static void * -arena_ralloc(void *ptr, size_t size, size_t oldsize) -{ - void *ret; - size_t copysize; - - /* Try to avoid moving the allocation. */ - if (size <= bin_maxclass) { - if (oldsize <= bin_maxclass && size2bin[size] == - size2bin[oldsize]) - goto IN_PLACE; - } else { - if (oldsize > bin_maxclass && oldsize <= arena_maxclass) { - assert(size > bin_maxclass); - if (arena_ralloc_large(ptr, size, oldsize) == false) - return (ptr); - } - } - - /* - * If we get here, then size and oldsize are different enough that we - * need to move the object. In that case, fall back to allocating new - * space and copying. - */ - ret = arena_malloc(choose_arena(), size, false); - if (ret == NULL) - return (NULL); - - /* Junk/zero-filling were already done by arena_malloc(). */ - copysize = (size < oldsize) ? 
size : oldsize; - memcpy(ret, ptr, copysize); - idalloc(ptr); - return (ret); -IN_PLACE: - if (opt_junk && size < oldsize) - memset((void *)((uintptr_t)ptr + size), 0x5a, oldsize - size); - else if (opt_zero && size > oldsize) - memset((void *)((uintptr_t)ptr + oldsize), 0, size - oldsize); - return (ptr); -} - -static inline void * -iralloc(void *ptr, size_t size) -{ - size_t oldsize; - - assert(ptr != NULL); - assert(size != 0); - - oldsize = isalloc(ptr); - - if (size <= arena_maxclass) - return (arena_ralloc(ptr, size, oldsize)); - else - return (huge_ralloc(ptr, size, oldsize)); -} - -static bool -arena_new(arena_t *arena) -{ - unsigned i; - arena_bin_t *bin; - size_t prev_run_size; - - if (malloc_spin_init(&arena->lock)) - return (true); - -#ifdef MALLOC_STATS - memset(&arena->stats, 0, sizeof(arena_stats_t)); -#endif - - /* Initialize chunks. */ - arena_chunk_tree_dirty_new(&arena->chunks_dirty); - arena->spare = NULL; - - arena->ndirty = 0; - - arena_avail_tree_new(&arena->runs_avail); - -#ifdef MALLOC_BALANCE - arena->contention = 0; -#endif - - /* Initialize bins. */ - prev_run_size = pagesize; - - i = 0; -#ifdef MALLOC_TINY - /* (2^n)-spaced tiny bins. */ - for (; i < ntbins; i++) { - bin = &arena->bins[i]; - bin->runcur = NULL; - arena_run_tree_new(&bin->runs); - - bin->reg_size = (1U << (TINY_MIN_2POW + i)); - - prev_run_size = arena_bin_run_size_calc(bin, prev_run_size); - -#ifdef MALLOC_STATS - memset(&bin->stats, 0, sizeof(malloc_bin_stats_t)); -#endif - } -#endif - - /* Quantum-spaced bins. */ - for (; i < ntbins + nqbins; i++) { - bin = &arena->bins[i]; - bin->runcur = NULL; - arena_run_tree_new(&bin->runs); - - bin->reg_size = (i - ntbins + 1) << QUANTUM_2POW; - - prev_run_size = arena_bin_run_size_calc(bin, prev_run_size); - -#ifdef MALLOC_STATS - memset(&bin->stats, 0, sizeof(malloc_bin_stats_t)); -#endif - } - - /* Cacheline-spaced bins. 
*/ - for (; i < ntbins + nqbins + ncbins; i++) { - bin = &arena->bins[i]; - bin->runcur = NULL; - arena_run_tree_new(&bin->runs); - - bin->reg_size = cspace_min + ((i - (ntbins + nqbins)) << - CACHELINE_2POW); - - prev_run_size = arena_bin_run_size_calc(bin, prev_run_size); - -#ifdef MALLOC_STATS - memset(&bin->stats, 0, sizeof(malloc_bin_stats_t)); -#endif - } - - /* Subpage-spaced bins. */ - for (; i < nbins; i++) { - bin = &arena->bins[i]; - bin->runcur = NULL; - arena_run_tree_new(&bin->runs); - - bin->reg_size = sspace_min + ((i - (ntbins + nqbins + ncbins)) - << SUBPAGE_2POW); - - prev_run_size = arena_bin_run_size_calc(bin, prev_run_size); - -#ifdef MALLOC_STATS - memset(&bin->stats, 0, sizeof(malloc_bin_stats_t)); -#endif - } - -#ifdef MALLOC_DEBUG - arena->magic = ARENA_MAGIC; -#endif - - return (false); -} - -/* Create a new arena and insert it into the arenas array at index ind. */ -static arena_t * -arenas_extend(unsigned ind) -{ - arena_t *ret; - - /* Allocate enough space for trailing bins. */ - ret = (arena_t *)base_alloc(sizeof(arena_t) - + (sizeof(arena_bin_t) * (nbins - 1))); - if (ret != NULL && arena_new(ret) == false) { - arenas[ind] = ret; - return (ret); - } - /* Only reached if there is an OOM error. */ - - /* - * OOM here is quite inconvenient to propagate, since dealing with it - * would require a check for failure in the fast path. Instead, punt - * by using arenas[0]. In practice, this is an extremely unlikely - * failure. 
- */ - _malloc_message(_getprogname(), - ": (malloc) Error initializing arena\n", "", ""); - if (opt_abort) - abort(); - - return (arenas[0]); -} - -#ifdef MALLOC_MAG -static mag_t * -mag_create(arena_t *arena, size_t binind) -{ - mag_t *ret; - - if (sizeof(mag_t) + (sizeof(void *) * (max_rounds - 1)) <= - bin_maxclass) { - ret = arena_malloc_small(arena, sizeof(mag_t) + (sizeof(void *) - * (max_rounds - 1)), false); - } else { - ret = imalloc(sizeof(mag_t) + (sizeof(void *) * (max_rounds - - 1))); - } - if (ret == NULL) - return (NULL); - ret->binind = binind; - ret->nrounds = 0; - - return (ret); -} - -static void -mag_destroy(mag_t *mag) -{ - arena_t *arena; - arena_chunk_t *chunk; - size_t pageind; - arena_chunk_map_t *mapelm; - - chunk = CHUNK_ADDR2BASE(mag); - arena = chunk->arena; - pageind = (((uintptr_t)mag - (uintptr_t)chunk) >> pagesize_2pow); - mapelm = &chunk->map[pageind]; - - assert(mag->nrounds == 0); - if (sizeof(mag_t) + (sizeof(void *) * (max_rounds - 1)) <= - bin_maxclass) { - malloc_spin_lock(&arena->lock); - arena_dalloc_small(arena, chunk, mag, mapelm); - malloc_spin_unlock(&arena->lock); - } else - idalloc(mag); -} - -static mag_rack_t * -mag_rack_create(arena_t *arena) -{ - - assert(sizeof(mag_rack_t) + (sizeof(bin_mags_t *) * (nbins - 1)) <= - bin_maxclass); - return (arena_malloc_small(arena, sizeof(mag_rack_t) + - (sizeof(bin_mags_t) * (nbins - 1)), true)); -} - -static void -mag_rack_destroy(mag_rack_t *rack) -{ - arena_t *arena; - arena_chunk_t *chunk; - bin_mags_t *bin_mags; - size_t i, pageind; - arena_chunk_map_t *mapelm; - - for (i = 0; i < nbins; i++) { - bin_mags = &rack->bin_mags[i]; - if (bin_mags->curmag != NULL) { - assert(bin_mags->curmag->binind == i); - mag_unload(bin_mags->curmag); - mag_destroy(bin_mags->curmag); - } - if (bin_mags->sparemag != NULL) { - assert(bin_mags->sparemag->binind == i); - mag_unload(bin_mags->sparemag); - mag_destroy(bin_mags->sparemag); - } - } - - chunk = CHUNK_ADDR2BASE(rack); - arena = 
chunk->arena; - pageind = (((uintptr_t)rack - (uintptr_t)chunk) >> pagesize_2pow); - mapelm = &chunk->map[pageind]; - - malloc_spin_lock(&arena->lock); - arena_dalloc_small(arena, chunk, rack, mapelm); - malloc_spin_unlock(&arena->lock); -} -#endif - -/* - * End arena. - */ -/******************************************************************************/ -/* - * Begin general internal functions. - */ - -static void * -huge_malloc(size_t size, bool zero) -{ - void *ret; - size_t csize; - extent_node_t *node; - - /* Allocate one or more contiguous chunks for this request. */ - - csize = CHUNK_CEILING(size); - if (csize == 0) { - /* size is large enough to cause size_t wrap-around. */ - return (NULL); - } - - /* Allocate an extent node with which to track the chunk. */ - node = base_node_alloc(); - if (node == NULL) - return (NULL); - - ret = chunk_alloc(csize, zero); - if (ret == NULL) { - base_node_dealloc(node); - return (NULL); - } - - /* Insert node into huge. */ - node->addr = ret; - node->size = csize; - - malloc_mutex_lock(&huge_mtx); - extent_tree_ad_insert(&huge, node); -#ifdef MALLOC_STATS - huge_nmalloc++; - huge_allocated += csize; -#endif - malloc_mutex_unlock(&huge_mtx); - - if (zero == false) { - if (opt_junk) - memset(ret, 0xa5, csize); - else if (opt_zero) - memset(ret, 0, csize); - } - - return (ret); -} - -/* Only handles large allocations that require more than chunk alignment. */ -static void * -huge_palloc(size_t alignment, size_t size) -{ - void *ret; - size_t alloc_size, chunk_size, offset; - extent_node_t *node; - - /* - * This allocation requires alignment that is even larger than chunk - * alignment. This means that huge_malloc() isn't good enough. - * - * Allocate almost twice as many chunks as are demanded by the size or - * alignment, in order to assure the alignment can be achieved, then - * unmap leading and trailing chunks. 
- */ - assert(alignment >= chunksize); - - chunk_size = CHUNK_CEILING(size); - - if (size >= alignment) - alloc_size = chunk_size + alignment - chunksize; - else - alloc_size = (alignment << 1) - chunksize; - - /* Allocate an extent node with which to track the chunk. */ - node = base_node_alloc(); - if (node == NULL) - return (NULL); - - ret = chunk_alloc(alloc_size, false); - if (ret == NULL) { - base_node_dealloc(node); - return (NULL); - } - - offset = (uintptr_t)ret & (alignment - 1); - assert((offset & chunksize_mask) == 0); - assert(offset < alloc_size); - if (offset == 0) { - /* Trim trailing space. */ - chunk_dealloc((void *)((uintptr_t)ret + chunk_size), alloc_size - - chunk_size); - } else { - size_t trailsize; - - /* Trim leading space. */ - chunk_dealloc(ret, alignment - offset); - - ret = (void *)((uintptr_t)ret + (alignment - offset)); - - trailsize = alloc_size - (alignment - offset) - chunk_size; - if (trailsize != 0) { - /* Trim trailing space. */ - assert(trailsize < alloc_size); - chunk_dealloc((void *)((uintptr_t)ret + chunk_size), - trailsize); - } - } - - /* Insert node into huge. */ - node->addr = ret; - node->size = chunk_size; - - malloc_mutex_lock(&huge_mtx); - extent_tree_ad_insert(&huge, node); -#ifdef MALLOC_STATS - huge_nmalloc++; - huge_allocated += chunk_size; -#endif - malloc_mutex_unlock(&huge_mtx); - - if (opt_junk) - memset(ret, 0xa5, chunk_size); - else if (opt_zero) - memset(ret, 0, chunk_size); - - return (ret); -} - -static void * -huge_ralloc(void *ptr, size_t size, size_t oldsize) -{ - void *ret; - size_t copysize; - - /* Avoid moving the allocation if the size class would not change. 
*/ - if (oldsize > arena_maxclass && - CHUNK_CEILING(size) == CHUNK_CEILING(oldsize)) { - if (opt_junk && size < oldsize) { - memset((void *)((uintptr_t)ptr + size), 0x5a, oldsize - - size); - } else if (opt_zero && size > oldsize) { - memset((void *)((uintptr_t)ptr + oldsize), 0, size - - oldsize); - } - return (ptr); - } - - /* - * If we get here, then size and oldsize are different enough that we - * need to use a different size class. In that case, fall back to - * allocating new space and copying. - */ - ret = huge_malloc(size, false); - if (ret == NULL) - return (NULL); - - copysize = (size < oldsize) ? size : oldsize; - memcpy(ret, ptr, copysize); - idalloc(ptr); - return (ret); -} - -static void -huge_dalloc(void *ptr) -{ - extent_node_t *node, key; - - malloc_mutex_lock(&huge_mtx); - - /* Extract from tree of huge allocations. */ - key.addr = ptr; - node = extent_tree_ad_search(&huge, &key); - assert(node != NULL); - assert(node->addr == ptr); - extent_tree_ad_remove(&huge, node); - -#ifdef MALLOC_STATS - huge_ndalloc++; - huge_allocated -= node->size; -#endif - - malloc_mutex_unlock(&huge_mtx); - - /* Unmap chunk. */ -#ifdef MALLOC_DSS - if (opt_dss && opt_junk) - memset(node->addr, 0x5a, node->size); -#endif - chunk_dealloc(node->addr, node->size); - - base_node_dealloc(node); -} - -static void -malloc_print_stats(void) -{ - - if (opt_print_stats) { - char s[UMAX2S_BUFSIZE]; - _malloc_message("___ Begin malloc statistics ___\n", "", "", - ""); - _malloc_message("Assertions ", -#ifdef NDEBUG - "disabled", -#else - "enabled", -#endif - "\n", ""); - _malloc_message("Boolean MALLOC_OPTIONS: ", - opt_abort ? "A" : "a", "", ""); -#ifdef MALLOC_DSS - _malloc_message(opt_dss ? "D" : "d", "", "", ""); -#endif -#ifdef MALLOC_MAG - _malloc_message(opt_mag ? "G" : "g", "", "", ""); -#endif - _malloc_message(opt_junk ? "J" : "j", "", "", ""); -#ifdef MALLOC_DSS - _malloc_message(opt_mmap ? "M" : "m", "", "", ""); -#endif - _malloc_message(opt_utrace ? 
"PU" : "Pu", - opt_sysv ? "V" : "v", - opt_xmalloc ? "X" : "x", - opt_zero ? "Z\n" : "z\n"); - - _malloc_message("CPUs: ", umax2s(ncpus, s), "\n", ""); - _malloc_message("Max arenas: ", umax2s(narenas, s), "\n", ""); -#ifdef MALLOC_BALANCE - _malloc_message("Arena balance threshold: ", - umax2s(opt_balance_threshold, s), "\n", ""); -#endif - _malloc_message("Pointer size: ", umax2s(sizeof(void *), s), - "\n", ""); - _malloc_message("Quantum size: ", umax2s(QUANTUM, s), "\n", ""); - _malloc_message("Cacheline size (assumed): ", umax2s(CACHELINE, - s), "\n", ""); -#ifdef MALLOC_TINY - _malloc_message("Tiny 2^n-spaced sizes: [", umax2s((1U << - TINY_MIN_2POW), s), "..", ""); - _malloc_message(umax2s((qspace_min >> 1), s), "]\n", "", ""); -#endif - _malloc_message("Quantum-spaced sizes: [", umax2s(qspace_min, - s), "..", ""); - _malloc_message(umax2s(qspace_max, s), "]\n", "", ""); - _malloc_message("Cacheline-spaced sizes: [", umax2s(cspace_min, - s), "..", ""); - _malloc_message(umax2s(cspace_max, s), "]\n", "", ""); - _malloc_message("Subpage-spaced sizes: [", umax2s(sspace_min, - s), "..", ""); - _malloc_message(umax2s(sspace_max, s), "]\n", "", ""); -#ifdef MALLOC_MAG - _malloc_message("Rounds per magazine: ", umax2s(max_rounds, s), - "\n", ""); -#endif - _malloc_message("Max dirty pages per arena: ", - umax2s(opt_dirty_max, s), "\n", ""); - - _malloc_message("Chunk size: ", umax2s(chunksize, s), "", ""); - _malloc_message(" (2^", umax2s(opt_chunk_2pow, s), ")\n", ""); - -#ifdef MALLOC_STATS - { - size_t allocated, mapped; -#ifdef MALLOC_BALANCE - uint64_t nbalance = 0; -#endif - unsigned i; - arena_t *arena; - - /* Calculate and print allocated/mapped stats. */ - - /* arenas. 
*/ - for (i = 0, allocated = 0; i < narenas; i++) { - if (arenas[i] != NULL) { - malloc_spin_lock(&arenas[i]->lock); - allocated += - arenas[i]->stats.allocated_small; - allocated += - arenas[i]->stats.allocated_large; -#ifdef MALLOC_BALANCE - nbalance += arenas[i]->stats.nbalance; -#endif - malloc_spin_unlock(&arenas[i]->lock); - } - } - - /* huge/base. */ - malloc_mutex_lock(&huge_mtx); - allocated += huge_allocated; - mapped = stats_chunks.curchunks * chunksize; - malloc_mutex_unlock(&huge_mtx); - - malloc_mutex_lock(&base_mtx); - mapped += base_mapped; - malloc_mutex_unlock(&base_mtx); - - malloc_printf("Allocated: %zu, mapped: %zu\n", - allocated, mapped); - -#ifdef MALLOC_BALANCE - malloc_printf("Arena balance reassignments: %llu\n", - nbalance); -#endif - - /* Print chunk stats. */ - { - chunk_stats_t chunks_stats; - - malloc_mutex_lock(&huge_mtx); - chunks_stats = stats_chunks; - malloc_mutex_unlock(&huge_mtx); - - malloc_printf("chunks: nchunks " - "highchunks curchunks\n"); - malloc_printf(" %13llu%13lu%13lu\n", - chunks_stats.nchunks, - chunks_stats.highchunks, - chunks_stats.curchunks); - } - - /* Print chunk stats. */ - malloc_printf( - "huge: nmalloc ndalloc allocated\n"); - malloc_printf(" %12llu %12llu %12zu\n", - huge_nmalloc, huge_ndalloc, huge_allocated); - - /* Print stats for each arena. */ - for (i = 0; i < narenas; i++) { - arena = arenas[i]; - if (arena != NULL) { - malloc_printf( - "\narenas[%u]:\n", i); - malloc_spin_lock(&arena->lock); - stats_print(arena); - malloc_spin_unlock(&arena->lock); - } - } - } -#endif /* #ifdef MALLOC_STATS */ - _malloc_message("--- End malloc statistics ---\n", "", "", ""); - } -} - -#ifdef MALLOC_DEBUG -static void -size2bin_validate(void) -{ - size_t i, size, binind; - - assert(size2bin[0] == 0xffU); - i = 1; -# ifdef MALLOC_TINY - /* Tiny. 
*/ - for (; i < (1U << TINY_MIN_2POW); i++) { - size = pow2_ceil(1U << TINY_MIN_2POW); - binind = ffs((int)(size >> (TINY_MIN_2POW + 1))); - assert(size2bin[i] == binind); - } - for (; i < qspace_min; i++) { - size = pow2_ceil(i); - binind = ffs((int)(size >> (TINY_MIN_2POW + 1))); - assert(size2bin[i] == binind); - } -# endif - /* Quantum-spaced. */ - for (; i <= qspace_max; i++) { - size = QUANTUM_CEILING(i); - binind = ntbins + (size >> QUANTUM_2POW) - 1; - assert(size2bin[i] == binind); - } - /* Cacheline-spaced. */ - for (; i <= cspace_max; i++) { - size = CACHELINE_CEILING(i); - binind = ntbins + nqbins + ((size - cspace_min) >> - CACHELINE_2POW); - assert(size2bin[i] == binind); - } - /* Sub-page. */ - for (; i <= sspace_max; i++) { - size = SUBPAGE_CEILING(i); - binind = ntbins + nqbins + ncbins + ((size - sspace_min) - >> SUBPAGE_2POW); - assert(size2bin[i] == binind); - } -} -#endif - -static bool -size2bin_init(void) -{ - - if (opt_qspace_max_2pow != QSPACE_MAX_2POW_DEFAULT - || opt_cspace_max_2pow != CSPACE_MAX_2POW_DEFAULT) - return (size2bin_init_hard()); - - size2bin = const_size2bin; -#ifdef MALLOC_DEBUG - assert(sizeof(const_size2bin) == bin_maxclass + 1); - size2bin_validate(); -#endif - return (false); -} - -static bool -size2bin_init_hard(void) -{ - size_t i, size, binind; - uint8_t *custom_size2bin; - - assert(opt_qspace_max_2pow != QSPACE_MAX_2POW_DEFAULT - || opt_cspace_max_2pow != CSPACE_MAX_2POW_DEFAULT); - - custom_size2bin = (uint8_t *)base_alloc(bin_maxclass + 1); - if (custom_size2bin == NULL) - return (true); - - custom_size2bin[0] = 0xffU; - i = 1; -#ifdef MALLOC_TINY - /* Tiny. */ - for (; i < (1U << TINY_MIN_2POW); i++) { - size = pow2_ceil(1U << TINY_MIN_2POW); - binind = ffs((int)(size >> (TINY_MIN_2POW + 1))); - custom_size2bin[i] = binind; - } - for (; i < qspace_min; i++) { - size = pow2_ceil(i); - binind = ffs((int)(size >> (TINY_MIN_2POW + 1))); - custom_size2bin[i] = binind; - } -#endif - /* Quantum-spaced. 
*/ - for (; i <= qspace_max; i++) { - size = QUANTUM_CEILING(i); - binind = ntbins + (size >> QUANTUM_2POW) - 1; - custom_size2bin[i] = binind; - } - /* Cacheline-spaced. */ - for (; i <= cspace_max; i++) { - size = CACHELINE_CEILING(i); - binind = ntbins + nqbins + ((size - cspace_min) >> - CACHELINE_2POW); - custom_size2bin[i] = binind; - } - /* Sub-page. */ - for (; i <= sspace_max; i++) { - size = SUBPAGE_CEILING(i); - binind = ntbins + nqbins + ncbins + ((size - sspace_min) >> - SUBPAGE_2POW); - custom_size2bin[i] = binind; - } - - size2bin = custom_size2bin; -#ifdef MALLOC_DEBUG - size2bin_validate(); -#endif - return (false); -} - -/* - * FreeBSD's pthreads implementation calls malloc(3), so the malloc - * implementation has to take pains to avoid infinite recursion during - * initialization. - */ -static inline bool -malloc_init(void) -{ - - if (malloc_initialized == false) - return (malloc_init_hard()); - - return (false); -} - -static bool -malloc_init_hard(void) -{ - unsigned i; - int linklen; - char buf[PATH_MAX + 1]; - const char *opts; - - malloc_mutex_lock(&init_lock); - if (malloc_initialized) { - /* - * Another thread initialized the allocator before this one - * acquired init_lock. - */ - malloc_mutex_unlock(&init_lock); - return (false); - } - - /* Get number of CPUs. */ - { - int mib[2]; - size_t len; - - mib[0] = CTL_HW; - mib[1] = HW_NCPU; - len = sizeof(ncpus); - if (sysctl(mib, 2, &ncpus, &len, (void *) 0, 0) == -1) { - /* Error. */ - ncpus = 1; - } - } - - /* Get page size. */ - { - long result; - - result = sysconf(_SC_PAGESIZE); - assert(result != -1); - pagesize = (unsigned)result; - - /* - * We assume that pagesize is a power of 2 when calculating - * pagesize_mask and pagesize_2pow. - */ - assert(((result - 1) & result) == 0); - pagesize_mask = result - 1; - pagesize_2pow = ffs((int)result) - 1; - } - - for (i = 0; i < 3; i++) { - unsigned j; - - /* Get runtime configuration. 
*/ - switch (i) { - case 0: - if ((linklen = readlink("/etc/malloc.conf", buf, - sizeof(buf) - 1)) != -1) { - /* - * Use the contents of the "/etc/malloc.conf" - * symbolic link's name. - */ - buf[linklen] = '\0'; - opts = buf; - } else { - /* No configuration specified. */ - buf[0] = '\0'; - opts = buf; - } - break; - case 1: - if (issetugid() == 0 && (opts = - getenv("MALLOC_OPTIONS")) != NULL) { - /* - * Do nothing; opts is already initialized to - * the value of the MALLOC_OPTIONS environment - * variable. - */ - } else { - /* No configuration specified. */ - buf[0] = '\0'; - opts = buf; - } - break; - case 2: - if (_malloc_options != NULL) { - /* - * Use options that were compiled into the - * program. - */ - opts = _malloc_options; - } else { - /* No configuration specified. */ - buf[0] = '\0'; - opts = buf; - } - break; - default: - /* NOTREACHED */ - assert(false); - } - - for (j = 0; opts[j] != '\0'; j++) { - unsigned k, nreps; - bool nseen; - - /* Parse repetition count, if any. 
*/ - for (nreps = 0, nseen = false;; j++, nseen = true) { - switch (opts[j]) { - case '0': case '1': case '2': case '3': - case '4': case '5': case '6': case '7': - case '8': case '9': - nreps *= 10; - nreps += opts[j] - '0'; - break; - default: - goto MALLOC_OUT; - } - } -MALLOC_OUT: - if (nseen == false) - nreps = 1; - - for (k = 0; k < nreps; k++) { - switch (opts[j]) { - case 'a': - opt_abort = false; - break; - case 'A': - opt_abort = true; - break; - case 'b': -#ifdef MALLOC_BALANCE - opt_balance_threshold >>= 1; -#endif - break; - case 'B': -#ifdef MALLOC_BALANCE - if (opt_balance_threshold == 0) - opt_balance_threshold = 1; - else if ((opt_balance_threshold << 1) - > opt_balance_threshold) - opt_balance_threshold <<= 1; -#endif - break; - case 'c': - if (opt_cspace_max_2pow - 1 > - opt_qspace_max_2pow && - opt_cspace_max_2pow > - CACHELINE_2POW) - opt_cspace_max_2pow--; - break; - case 'C': - if (opt_cspace_max_2pow < pagesize_2pow - - 1) - opt_cspace_max_2pow++; - break; - case 'd': -#ifdef MALLOC_DSS - opt_dss = false; -#endif - break; - case 'D': -#ifdef MALLOC_DSS - opt_dss = true; -#endif - break; - case 'f': - opt_dirty_max >>= 1; - break; - case 'F': - if (opt_dirty_max == 0) - opt_dirty_max = 1; - else if ((opt_dirty_max << 1) != 0) - opt_dirty_max <<= 1; - break; -#ifdef MALLOC_MAG - case 'g': - opt_mag = false; - break; - case 'G': - opt_mag = true; - break; -#endif - case 'j': - opt_junk = false; - break; - case 'J': - opt_junk = true; - break; - case 'k': - /* - * Chunks always require at least one - * header page, so chunks can never be - * smaller than two pages. 
- */ - if (opt_chunk_2pow > pagesize_2pow + 1) - opt_chunk_2pow--; - break; - case 'K': - if (opt_chunk_2pow + 1 < - (sizeof(size_t) << 3)) - opt_chunk_2pow++; - break; - case 'm': -#ifdef MALLOC_DSS - opt_mmap = false; -#endif - break; - case 'M': -#ifdef MALLOC_DSS - opt_mmap = true; -#endif - break; - case 'n': - opt_narenas_lshift--; - break; - case 'N': - opt_narenas_lshift++; - break; - case 'p': - opt_print_stats = false; - break; - case 'P': - opt_print_stats = true; - break; - case 'q': - if (opt_qspace_max_2pow > QUANTUM_2POW) - opt_qspace_max_2pow--; - break; - case 'Q': - if (opt_qspace_max_2pow + 1 < - opt_cspace_max_2pow) - opt_qspace_max_2pow++; - break; -#ifdef MALLOC_MAG - case 'R': - if (opt_mag_size_2pow + 1 < (8U << - SIZEOF_PTR_2POW)) - opt_mag_size_2pow++; - break; - case 'r': - /* - * Make sure there's always at least - * one round per magazine. - */ - if ((1U << (opt_mag_size_2pow-1)) >= - sizeof(mag_t)) - opt_mag_size_2pow--; - break; -#endif - case 'u': - opt_utrace = false; - break; - case 'U': - opt_utrace = true; - break; - case 'v': - opt_sysv = false; - break; - case 'V': - opt_sysv = true; - break; - case 'x': - opt_xmalloc = false; - break; - case 'X': - opt_xmalloc = true; - break; - case 'z': - opt_zero = false; - break; - case 'Z': - opt_zero = true; - break; - default: { - char cbuf[2]; - - cbuf[0] = opts[j]; - cbuf[1] = '\0'; - _malloc_message(_getprogname(), - ": (malloc) Unsupported character " - "in malloc options: '", cbuf, - "'\n"); - } - } - } - } - } - -#ifdef MALLOC_DSS - /* Make sure that there is some method for acquiring memory. */ - if (opt_dss == false && opt_mmap == false) - opt_mmap = true; -#endif - - /* Take care to call atexit() only once. */ - if (opt_print_stats) { - /* Print statistics at exit. */ - atexit(malloc_print_stats); - } - -#ifdef MALLOC_MAG - /* - * Calculate the actual number of rounds per magazine, taking into - * account header overhead. 
- */ - max_rounds = (1LLU << (opt_mag_size_2pow - SIZEOF_PTR_2POW)) - - (sizeof(mag_t) >> SIZEOF_PTR_2POW) + 1; -#endif - - /* Set variables according to the value of opt_[qc]space_max_2pow. */ - qspace_max = (1U << opt_qspace_max_2pow); - cspace_min = CACHELINE_CEILING(qspace_max); - if (cspace_min == qspace_max) - cspace_min += CACHELINE; - cspace_max = (1U << opt_cspace_max_2pow); - sspace_min = SUBPAGE_CEILING(cspace_max); - if (sspace_min == cspace_max) - sspace_min += SUBPAGE; - assert(sspace_min < pagesize); - sspace_max = pagesize - SUBPAGE; - -#ifdef MALLOC_TINY - assert(QUANTUM_2POW >= TINY_MIN_2POW); -#endif - assert(ntbins <= QUANTUM_2POW); - nqbins = qspace_max >> QUANTUM_2POW; - ncbins = ((cspace_max - cspace_min) >> CACHELINE_2POW) + 1; - nsbins = ((sspace_max - sspace_min) >> SUBPAGE_2POW) + 1; - nbins = ntbins + nqbins + ncbins + nsbins; - - if (size2bin_init()) { - malloc_mutex_unlock(&init_lock); - return (true); - } - - /* Set variables according to the value of opt_chunk_2pow. */ - chunksize = (1LU << opt_chunk_2pow); - chunksize_mask = chunksize - 1; - chunk_npages = (chunksize >> pagesize_2pow); - { - size_t header_size; - - /* - * Compute the header size such that it is large enough to - * contain the page map. - */ - header_size = sizeof(arena_chunk_t) + - (sizeof(arena_chunk_map_t) * (chunk_npages - 1)); - arena_chunk_header_npages = (header_size >> pagesize_2pow) + - ((header_size & pagesize_mask) != 0); - } - arena_maxclass = chunksize - (arena_chunk_header_npages << - pagesize_2pow); - - UTRACE(0, 0, 0); - -#ifdef MALLOC_STATS - memset(&stats_chunks, 0, sizeof(chunk_stats_t)); -#endif - - /* Various sanity checks that regard configuration. */ - assert(chunksize >= pagesize); - - /* Initialize chunks data. 
*/ - malloc_mutex_init(&huge_mtx); - extent_tree_ad_new(&huge); -#ifdef MALLOC_DSS - malloc_mutex_init(&dss_mtx); - dss_base = sbrk(0); - dss_prev = dss_base; - dss_max = dss_base; - extent_tree_szad_new(&dss_chunks_szad); - extent_tree_ad_new(&dss_chunks_ad); -#endif -#ifdef MALLOC_STATS - huge_nmalloc = 0; - huge_ndalloc = 0; - huge_allocated = 0; -#endif - - /* Initialize base allocation data structures. */ -#ifdef MALLOC_STATS - base_mapped = 0; -#endif -#ifdef MALLOC_DSS - /* - * Allocate a base chunk here, since it doesn't actually have to be - * chunk-aligned. Doing this before allocating any other chunks allows - * the use of space that would otherwise be wasted. - */ - if (opt_dss) - base_pages_alloc(0); -#endif - base_nodes = NULL; - malloc_mutex_init(&base_mtx); - - if (ncpus > 1) { - /* - * For SMP systems, create twice as many arenas as there are - * CPUs by default. - */ - opt_narenas_lshift++; - } - - /* Determine how many arenas to use. */ - narenas = ncpus; - if (opt_narenas_lshift > 0) { - if ((narenas << opt_narenas_lshift) > narenas) - narenas <<= opt_narenas_lshift; - /* - * Make sure not to exceed the limits of what base_alloc() can - * handle. - */ - if (narenas * sizeof(arena_t *) > chunksize) - narenas = chunksize / sizeof(arena_t *); - } else if (opt_narenas_lshift < 0) { - if ((narenas >> -opt_narenas_lshift) < narenas) - narenas >>= -opt_narenas_lshift; - /* Make sure there is at least one arena. 
*/ - if (narenas == 0) - narenas = 1; - } -#ifdef MALLOC_BALANCE - assert(narenas != 0); - for (narenas_2pow = 0; - (narenas >> (narenas_2pow + 1)) != 0; - narenas_2pow++); -#endif - -#ifdef NO_TLS - if (narenas > 1) { - static const unsigned primes[] = {1, 3, 5, 7, 11, 13, 17, 19, - 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, - 89, 97, 101, 103, 107, 109, 113, 127, 131, 137, 139, 149, - 151, 157, 163, 167, 173, 179, 181, 191, 193, 197, 199, 211, - 223, 227, 229, 233, 239, 241, 251, 257, 263}; - unsigned nprimes, parenas; - - /* - * Pick a prime number of hash arenas that is more than narenas - * so that direct hashing of pthread_self() pointers tends to - * spread allocations evenly among the arenas. - */ - assert((narenas & 1) == 0); /* narenas must be even. */ - nprimes = (sizeof(primes) >> SIZEOF_INT_2POW); - parenas = primes[nprimes - 1]; /* In case not enough primes. */ - for (i = 1; i < nprimes; i++) { - if (primes[i] > narenas) { - parenas = primes[i]; - break; - } - } - narenas = parenas; - } -#endif - -#ifndef NO_TLS -# ifndef MALLOC_BALANCE - next_arena = 0; -# endif -#endif - - /* Allocate and initialize arenas. */ - arenas = (arena_t **)base_alloc(sizeof(arena_t *) * narenas); - if (arenas == NULL) { - malloc_mutex_unlock(&init_lock); - return (true); - } - /* - * Zero the array. In practice, this should always be pre-zeroed, - * since it was just mmap()ed, but let's be sure. - */ - memset(arenas, 0, sizeof(arena_t *) * narenas); - - /* - * Initialize one arena here. The rest are lazily created in - * choose_arena_hard(). - */ - arenas_extend(0); - if (arenas[0] == NULL) { - malloc_mutex_unlock(&init_lock); - return (true); - } -#ifndef NO_TLS - /* - * Assign the initial arena to the initial thread, in order to avoid - * spurious creation of an extra arena if the application switches to - * threaded mode. 
- */ - arenas_map = arenas[0]; -#endif - /* - * Seed here for the initial thread, since choose_arena_hard() is only - * called for other threads. The seed value doesn't really matter. - */ -#ifdef MALLOC_BALANCE - SPRN(balance, 42); -#endif - - malloc_spin_init(&arenas_lock); - - malloc_initialized = true; - malloc_mutex_unlock(&init_lock); - return (false); -} - -/* - * End general internal functions. - */ -/******************************************************************************/ -/* - * Begin malloc(3)-compatible functions. - */ - -void * -malloc(size_t size) -{ - void *ret; - - if (malloc_init()) { - ret = NULL; - goto RETURN; - } - - if (size == 0) { - if (opt_sysv == false) - size = 1; - else { - ret = NULL; - goto RETURN; - } - } - - ret = imalloc(size); - -RETURN: - if (ret == NULL) { - if (opt_xmalloc) { - _malloc_message(_getprogname(), - ": (malloc) Error in malloc(): out of memory\n", "", - ""); - abort(); - } - errno = ENOMEM; - } - - UTRACE(0, size, ret); - return (ret); -} - -int -posix_memalign(void **memptr, size_t alignment, size_t size) -{ - int ret; - void *result; - - if (malloc_init()) - result = NULL; - else { - /* Make sure that alignment is a large enough power of 2. 
*/ - if (((alignment - 1) & alignment) != 0 - || alignment < sizeof(void *)) { - if (opt_xmalloc) { - _malloc_message(_getprogname(), - ": (malloc) Error in posix_memalign(): " - "invalid alignment\n", "", ""); - abort(); - } - result = NULL; - ret = EINVAL; - goto RETURN; - } - - result = ipalloc(alignment, size); - } - - if (result == NULL) { - if (opt_xmalloc) { - _malloc_message(_getprogname(), - ": (malloc) Error in posix_memalign(): out of memory\n", - "", ""); - abort(); - } - ret = ENOMEM; - goto RETURN; - } - - *memptr = result; - ret = 0; - -RETURN: - UTRACE(0, size, result); - return (ret); -} - -void * -calloc(size_t num, size_t size) -{ - void *ret; - size_t num_size; - - if (malloc_init()) { - num_size = 0; - ret = NULL; - goto RETURN; - } - - num_size = num * size; - if (num_size == 0) { - if ((opt_sysv == false) && ((num == 0) || (size == 0))) - num_size = 1; - else { - ret = NULL; - goto RETURN; - } - /* - * Try to avoid division here. We know that it isn't possible to - * overflow during multiplication if neither operand uses any of the - * most significant half of the bits in a size_t. - */ - } else if (((num | size) & (SIZE_T_MAX << (sizeof(size_t) << 2))) - && (num_size / size != num)) { - /* size_t overflow. 
*/ - ret = NULL; - goto RETURN; - } - - ret = icalloc(num_size); - -RETURN: - if (ret == NULL) { - if (opt_xmalloc) { - _malloc_message(_getprogname(), - ": (malloc) Error in calloc(): out of memory\n", "", - ""); - abort(); - } - errno = ENOMEM; - } - - UTRACE(0, num_size, ret); - return (ret); -} - -void * -realloc(void *ptr, size_t size) -{ - void *ret; - - if (size == 0) { - if (opt_sysv == false) - size = 1; - else { - if (ptr != NULL) - idalloc(ptr); - ret = NULL; - goto RETURN; - } - } - - if (ptr != NULL) { - assert(malloc_initialized); - - ret = iralloc(ptr, size); - - if (ret == NULL) { - if (opt_xmalloc) { - _malloc_message(_getprogname(), - ": (malloc) Error in realloc(): out of " - "memory\n", "", ""); - abort(); - } - errno = ENOMEM; - } - } else { - if (malloc_init()) - ret = NULL; - else - ret = imalloc(size); - - if (ret == NULL) { - if (opt_xmalloc) { - _malloc_message(_getprogname(), - ": (malloc) Error in realloc(): out of " - "memory\n", "", ""); - abort(); - } - errno = ENOMEM; - } - } - -RETURN: - UTRACE(ptr, size, ret); - return (ret); -} - -void -free(void *ptr) -{ - - UTRACE(ptr, 0, 0); - if (ptr != NULL) { - assert(malloc_initialized); - - idalloc(ptr); - } -} - -/* - * End malloc(3)-compatible functions. - */ -/******************************************************************************/ -/* - * Begin non-standard functions. - */ - -size_t -malloc_usable_size(const void *ptr) -{ - - assert(ptr != NULL); - - return (isalloc(ptr)); -} - -/* - * End non-standard functions. - */ -/******************************************************************************/ -/* - * Begin library-private functions. - */ - -/******************************************************************************/ -/* - * Begin thread cache. - */ - -/* - * We provide an unpublished interface in order to receive notifications from - * the pthreads library whenever a thread exits. This allows us to clean up - * thread caches. 
- */ -void -_malloc_thread_cleanup(void) -{ - -#ifdef MALLOC_MAG - if (mag_rack != NULL) { - assert(mag_rack != (void *)-1); - mag_rack_destroy(mag_rack); -#ifdef MALLOC_DEBUG - mag_rack = (void *)-1; -#endif - } -#endif -} - -/* - * The following functions are used by threading libraries for protection of - * malloc during fork(). These functions are only called if the program is - * running in threaded mode, so there is no need to check whether the program - * is threaded here. - */ - -void -_malloc_prefork(void) -{ - unsigned i; - - /* Acquire all mutexes in a safe order. */ - - malloc_spin_lock(&arenas_lock); - for (i = 0; i < narenas; i++) { - if (arenas[i] != NULL) - malloc_spin_lock(&arenas[i]->lock); - } - malloc_spin_unlock(&arenas_lock); - - malloc_mutex_lock(&base_mtx); - - malloc_mutex_lock(&huge_mtx); - -#ifdef MALLOC_DSS - malloc_mutex_lock(&dss_mtx); -#endif -} - -void -_malloc_postfork(void) -{ - unsigned i; - - /* Release all mutexes, now that fork() has completed. */ - -#ifdef MALLOC_DSS - malloc_mutex_unlock(&dss_mtx); -#endif - - malloc_mutex_unlock(&huge_mtx); - - malloc_mutex_unlock(&base_mtx); - - malloc_spin_lock(&arenas_lock); - for (i = 0; i < narenas; i++) { - if (arenas[i] != NULL) - malloc_spin_unlock(&arenas[i]->lock); - } - malloc_spin_unlock(&arenas_lock); -} - -/* - * End library-private functions. - */ -/******************************************************************************/ diff --git a/lib/libjemalloc/rb.h b/lib/libjemalloc/rb.h deleted file mode 100644 index acfe203..0000000 --- a/lib/libjemalloc/rb.h +++ /dev/null @@ -1,946 +0,0 @@ -/****************************************************************************** - * - * Copyright (C) 2008 Jason Evans . - * All rights reserved. - * - * Redistribution and use in source and binary forms, with or without - * modification, are permitted provided that the following conditions - * are met: - * 1. 
Redistributions of source code must retain the above copyright - * notice(s), this list of conditions and the following disclaimer - * unmodified other than the allowable addition of one or more - * copyright notices. - * 2. Redistributions in binary form must reproduce the above copyright - * notice(s), this list of conditions and the following disclaimer in - * the documentation and/or other materials provided with the - * distribution. - * - * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDER(S) ``AS IS'' AND ANY - * EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE - * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR - * PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER(S) BE - * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR - * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF - * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR - * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, - * WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE - * OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, - * EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. - * - ****************************************************************************** - * - * cpp macro implementation of left-leaning red-black trees. - * - * Usage: - * - * (Optional, see assert(3).) - * #define NDEBUG - * - * (Required.) - * #include - * #include - * ... - * - * All operations are done non-recursively. Parent pointers are not used, and - * color bits are stored in the least significant bit of right-child pointers, - * thus making node linkage as compact as is possible for red-black trees. 
- * - * Some macros use a comparison function pointer, which is expected to have the - * following prototype: - * - * int (a_cmp *)(a_type *a_node, a_type *a_other); - * ^^^^^^ - * or a_key - * - * Interpretation of comparision function return values: - * - * -1 : a_node < a_other - * 0 : a_node == a_other - * 1 : a_node > a_other - * - * In all cases, the a_node or a_key macro argument is the first argument to the - * comparison function, which makes it possible to write comparison functions - * that treat the first argument specially. - * - ******************************************************************************/ - -#ifndef RB_H_ -#define RB_H_ - -//__FBSDID("$FreeBSD: head/lib/libc/stdlib/rb.h 178995 2008-05-14 18:33:13Z jasone $"); - -/* Node structure. */ -#define rb_node(a_type) \ -struct { \ - a_type *rbn_left; \ - a_type *rbn_right_red; \ -} - -/* Root structure. */ -#define rb_tree(a_type) \ -struct { \ - a_type *rbt_root; \ - a_type rbt_nil; \ -} - -/* Left accessors. */ -#define rbp_left_get(a_type, a_field, a_node) \ - ((a_node)->a_field.rbn_left) -#define rbp_left_set(a_type, a_field, a_node, a_left) do { \ - (a_node)->a_field.rbn_left = a_left; \ -} while (0) - -/* Right accessors. */ -#define rbp_right_get(a_type, a_field, a_node) \ - ((a_type *) (((intptr_t) (a_node)->a_field.rbn_right_red) \ - & ((ssize_t)-2))) -#define rbp_right_set(a_type, a_field, a_node, a_right) do { \ - (a_node)->a_field.rbn_right_red = (a_type *) (((uintptr_t) a_right) \ - | (((uintptr_t) (a_node)->a_field.rbn_right_red) & ((size_t)1))); \ -} while (0) - -/* Color accessors. 
*/ -#define rbp_red_get(a_type, a_field, a_node) \ - ((bool) (((uintptr_t) (a_node)->a_field.rbn_right_red) \ - & ((size_t)1))) -#define rbp_color_set(a_type, a_field, a_node, a_red) do { \ - (a_node)->a_field.rbn_right_red = (a_type *) ((((intptr_t) \ - (a_node)->a_field.rbn_right_red) & ((ssize_t)-2)) \ - | ((ssize_t)a_red)); \ -} while (0) -#define rbp_red_set(a_type, a_field, a_node) do { \ - (a_node)->a_field.rbn_right_red = (a_type *) (((uintptr_t) \ - (a_node)->a_field.rbn_right_red) | ((size_t)1)); \ -} while (0) -#define rbp_black_set(a_type, a_field, a_node) do { \ - (a_node)->a_field.rbn_right_red = (a_type *) (((intptr_t) \ - (a_node)->a_field.rbn_right_red) & ((ssize_t)-2)); \ -} while (0) - -/* Node initializer. */ -#define rbp_node_new(a_type, a_field, a_tree, a_node) do { \ - rbp_left_set(a_type, a_field, (a_node), &(a_tree)->rbt_nil); \ - rbp_right_set(a_type, a_field, (a_node), &(a_tree)->rbt_nil); \ - rbp_red_set(a_type, a_field, (a_node)); \ -} while (0) - -/* Tree initializer. */ -#define rb_new(a_type, a_field, a_tree) do { \ - (a_tree)->rbt_root = &(a_tree)->rbt_nil; \ - rbp_node_new(a_type, a_field, a_tree, &(a_tree)->rbt_nil); \ - rbp_black_set(a_type, a_field, &(a_tree)->rbt_nil); \ -} while (0) - -/* Tree operations. 
*/ -#define rbp_black_height(a_type, a_field, a_tree, r_height) do { \ - a_type *rbp_bh_t; \ - for (rbp_bh_t = (a_tree)->rbt_root, (r_height) = 0; \ - rbp_bh_t != &(a_tree)->rbt_nil; \ - rbp_bh_t = rbp_left_get(a_type, a_field, rbp_bh_t)) { \ - if (rbp_red_get(a_type, a_field, rbp_bh_t) == false) { \ - (r_height)++; \ - } \ - } \ -} while (0) - -#define rbp_first(a_type, a_field, a_tree, a_root, r_node) do { \ - for ((r_node) = (a_root); \ - rbp_left_get(a_type, a_field, (r_node)) != &(a_tree)->rbt_nil; \ - (r_node) = rbp_left_get(a_type, a_field, (r_node))) { \ - } \ -} while (0) - -#define rbp_last(a_type, a_field, a_tree, a_root, r_node) do { \ - for ((r_node) = (a_root); \ - rbp_right_get(a_type, a_field, (r_node)) != &(a_tree)->rbt_nil; \ - (r_node) = rbp_right_get(a_type, a_field, (r_node))) { \ - } \ -} while (0) - -#define rbp_next(a_type, a_field, a_cmp, a_tree, a_node, r_node) do { \ - if (rbp_right_get(a_type, a_field, (a_node)) \ - != &(a_tree)->rbt_nil) { \ - rbp_first(a_type, a_field, a_tree, rbp_right_get(a_type, \ - a_field, (a_node)), (r_node)); \ - } else { \ - a_type *rbp_n_t = (a_tree)->rbt_root; \ - assert(rbp_n_t != &(a_tree)->rbt_nil); \ - (r_node) = &(a_tree)->rbt_nil; \ - while (true) { \ - int rbp_n_cmp = (a_cmp)((a_node), rbp_n_t); \ - if (rbp_n_cmp < 0) { \ - (r_node) = rbp_n_t; \ - rbp_n_t = rbp_left_get(a_type, a_field, rbp_n_t); \ - } else if (rbp_n_cmp > 0) { \ - rbp_n_t = rbp_right_get(a_type, a_field, rbp_n_t); \ - } else { \ - break; \ - } \ - assert(rbp_n_t != &(a_tree)->rbt_nil); \ - } \ - } \ -} while (0) - -#define rbp_prev(a_type, a_field, a_cmp, a_tree, a_node, r_node) do { \ - if (rbp_left_get(a_type, a_field, (a_node)) != &(a_tree)->rbt_nil) {\ - rbp_last(a_type, a_field, a_tree, rbp_left_get(a_type, \ - a_field, (a_node)), (r_node)); \ - } else { \ - a_type *rbp_p_t = (a_tree)->rbt_root; \ - assert(rbp_p_t != &(a_tree)->rbt_nil); \ - (r_node) = &(a_tree)->rbt_nil; \ - while (true) { \ - int rbp_p_cmp = (a_cmp)((a_node), 
rbp_p_t); \ - if (rbp_p_cmp < 0) { \ - rbp_p_t = rbp_left_get(a_type, a_field, rbp_p_t); \ - } else if (rbp_p_cmp > 0) { \ - (r_node) = rbp_p_t; \ - rbp_p_t = rbp_right_get(a_type, a_field, rbp_p_t); \ - } else { \ - break; \ - } \ - assert(rbp_p_t != &(a_tree)->rbt_nil); \ - } \ - } \ -} while (0) - -#define rb_first(a_type, a_field, a_tree, r_node) do { \ - rbp_first(a_type, a_field, a_tree, (a_tree)->rbt_root, (r_node)); \ - if ((r_node) == &(a_tree)->rbt_nil) { \ - (r_node) = NULL; \ - } \ -} while (0) - -#define rb_last(a_type, a_field, a_tree, r_node) do { \ - rbp_last(a_type, a_field, a_tree, (a_tree)->rbt_root, r_node); \ - if ((r_node) == &(a_tree)->rbt_nil) { \ - (r_node) = NULL; \ - } \ -} while (0) - -#define rb_next(a_type, a_field, a_cmp, a_tree, a_node, r_node) do { \ - rbp_next(a_type, a_field, a_cmp, a_tree, (a_node), (r_node)); \ - if ((r_node) == &(a_tree)->rbt_nil) { \ - (r_node) = NULL; \ - } \ -} while (0) - -#define rb_prev(a_type, a_field, a_cmp, a_tree, a_node, r_node) do { \ - rbp_prev(a_type, a_field, a_cmp, a_tree, (a_node), (r_node)); \ - if ((r_node) == &(a_tree)->rbt_nil) { \ - (r_node) = NULL; \ - } \ -} while (0) - -#define rb_search(a_type, a_field, a_cmp, a_tree, a_key, r_node) do { \ - int rbp_se_cmp; \ - (r_node) = (a_tree)->rbt_root; \ - while ((r_node) != &(a_tree)->rbt_nil \ - && (rbp_se_cmp = (a_cmp)((a_key), (r_node))) != 0) { \ - if (rbp_se_cmp < 0) { \ - (r_node) = rbp_left_get(a_type, a_field, (r_node)); \ - } else { \ - (r_node) = rbp_right_get(a_type, a_field, (r_node)); \ - } \ - } \ - if ((r_node) == &(a_tree)->rbt_nil) { \ - (r_node) = NULL; \ - } \ -} while (0) - -/* - * Find a match if it exists. Otherwise, find the next greater node, if one - * exists. 
- */ -#define rb_nsearch(a_type, a_field, a_cmp, a_tree, a_key, r_node) do { \ - a_type *rbp_ns_t = (a_tree)->rbt_root; \ - (r_node) = NULL; \ - while (rbp_ns_t != &(a_tree)->rbt_nil) { \ - int rbp_ns_cmp = (a_cmp)((a_key), rbp_ns_t); \ - if (rbp_ns_cmp < 0) { \ - (r_node) = rbp_ns_t; \ - rbp_ns_t = rbp_left_get(a_type, a_field, rbp_ns_t); \ - } else if (rbp_ns_cmp > 0) { \ - rbp_ns_t = rbp_right_get(a_type, a_field, rbp_ns_t); \ - } else { \ - (r_node) = rbp_ns_t; \ - break; \ - } \ - } \ -} while (0) - -/* - * Find a match if it exists. Otherwise, find the previous lesser node, if one - * exists. - */ -#define rb_psearch(a_type, a_field, a_cmp, a_tree, a_key, r_node) do { \ - a_type *rbp_ps_t = (a_tree)->rbt_root; \ - (r_node) = NULL; \ - while (rbp_ps_t != &(a_tree)->rbt_nil) { \ - int rbp_ps_cmp = (a_cmp)((a_key), rbp_ps_t); \ - if (rbp_ps_cmp < 0) { \ - rbp_ps_t = rbp_left_get(a_type, a_field, rbp_ps_t); \ - } else if (rbp_ps_cmp > 0) { \ - (r_node) = rbp_ps_t; \ - rbp_ps_t = rbp_right_get(a_type, a_field, rbp_ps_t); \ - } else { \ - (r_node) = rbp_ps_t; \ - break; \ - } \ - } \ -} while (0) - -#define rbp_rotate_left(a_type, a_field, a_node, r_node) do { \ - (r_node) = rbp_right_get(a_type, a_field, (a_node)); \ - rbp_right_set(a_type, a_field, (a_node), \ - rbp_left_get(a_type, a_field, (r_node))); \ - rbp_left_set(a_type, a_field, (r_node), (a_node)); \ -} while (0) - -#define rbp_rotate_right(a_type, a_field, a_node, r_node) do { \ - (r_node) = rbp_left_get(a_type, a_field, (a_node)); \ - rbp_left_set(a_type, a_field, (a_node), \ - rbp_right_get(a_type, a_field, (r_node))); \ - rbp_right_set(a_type, a_field, (r_node), (a_node)); \ -} while (0) - -#define rbp_lean_left(a_type, a_field, a_node, r_node) do { \ - bool rbp_ll_red; \ - rbp_rotate_left(a_type, a_field, (a_node), (r_node)); \ - rbp_ll_red = rbp_red_get(a_type, a_field, (a_node)); \ - rbp_color_set(a_type, a_field, (r_node), rbp_ll_red); \ - rbp_red_set(a_type, a_field, (a_node)); \ -} while (0) - 
-#define rbp_lean_right(a_type, a_field, a_node, r_node) do { \ - bool rbp_lr_red; \ - rbp_rotate_right(a_type, a_field, (a_node), (r_node)); \ - rbp_lr_red = rbp_red_get(a_type, a_field, (a_node)); \ - rbp_color_set(a_type, a_field, (r_node), rbp_lr_red); \ - rbp_red_set(a_type, a_field, (a_node)); \ -} while (0) - -#define rbp_move_red_left(a_type, a_field, a_node, r_node) do { \ - a_type *rbp_mrl_t, *rbp_mrl_u; \ - rbp_mrl_t = rbp_left_get(a_type, a_field, (a_node)); \ - rbp_red_set(a_type, a_field, rbp_mrl_t); \ - rbp_mrl_t = rbp_right_get(a_type, a_field, (a_node)); \ - rbp_mrl_u = rbp_left_get(a_type, a_field, rbp_mrl_t); \ - if (rbp_red_get(a_type, a_field, rbp_mrl_u)) { \ - rbp_rotate_right(a_type, a_field, rbp_mrl_t, rbp_mrl_u); \ - rbp_right_set(a_type, a_field, (a_node), rbp_mrl_u); \ - rbp_rotate_left(a_type, a_field, (a_node), (r_node)); \ - rbp_mrl_t = rbp_right_get(a_type, a_field, (a_node)); \ - if (rbp_red_get(a_type, a_field, rbp_mrl_t)) { \ - rbp_black_set(a_type, a_field, rbp_mrl_t); \ - rbp_red_set(a_type, a_field, (a_node)); \ - rbp_rotate_left(a_type, a_field, (a_node), rbp_mrl_t); \ - rbp_left_set(a_type, a_field, (r_node), rbp_mrl_t); \ - } else { \ - rbp_black_set(a_type, a_field, (a_node)); \ - } \ - } else { \ - rbp_red_set(a_type, a_field, (a_node)); \ - rbp_rotate_left(a_type, a_field, (a_node), (r_node)); \ - } \ -} while (0) - -#define rbp_move_red_right(a_type, a_field, a_node, r_node) do { \ - a_type *rbp_mrr_t; \ - rbp_mrr_t = rbp_left_get(a_type, a_field, (a_node)); \ - if (rbp_red_get(a_type, a_field, rbp_mrr_t)) { \ - a_type *rbp_mrr_u, *rbp_mrr_v; \ - rbp_mrr_u = rbp_right_get(a_type, a_field, rbp_mrr_t); \ - rbp_mrr_v = rbp_left_get(a_type, a_field, rbp_mrr_u); \ - if (rbp_red_get(a_type, a_field, rbp_mrr_v)) { \ - rbp_color_set(a_type, a_field, rbp_mrr_u, \ - rbp_red_get(a_type, a_field, (a_node))); \ - rbp_black_set(a_type, a_field, rbp_mrr_v); \ - rbp_rotate_left(a_type, a_field, rbp_mrr_t, rbp_mrr_u); \ - 
rbp_left_set(a_type, a_field, (a_node), rbp_mrr_u); \ - rbp_rotate_right(a_type, a_field, (a_node), (r_node)); \ - rbp_rotate_left(a_type, a_field, (a_node), rbp_mrr_t); \ - rbp_right_set(a_type, a_field, (r_node), rbp_mrr_t); \ - } else { \ - rbp_color_set(a_type, a_field, rbp_mrr_t, \ - rbp_red_get(a_type, a_field, (a_node))); \ - rbp_red_set(a_type, a_field, rbp_mrr_u); \ - rbp_rotate_right(a_type, a_field, (a_node), (r_node)); \ - rbp_rotate_left(a_type, a_field, (a_node), rbp_mrr_t); \ - rbp_right_set(a_type, a_field, (r_node), rbp_mrr_t); \ - } \ - rbp_red_set(a_type, a_field, (a_node)); \ - } else { \ - rbp_red_set(a_type, a_field, rbp_mrr_t); \ - rbp_mrr_t = rbp_left_get(a_type, a_field, rbp_mrr_t); \ - if (rbp_red_get(a_type, a_field, rbp_mrr_t)) { \ - rbp_black_set(a_type, a_field, rbp_mrr_t); \ - rbp_rotate_right(a_type, a_field, (a_node), (r_node)); \ - rbp_rotate_left(a_type, a_field, (a_node), rbp_mrr_t); \ - rbp_right_set(a_type, a_field, (r_node), rbp_mrr_t); \ - } else { \ - rbp_rotate_left(a_type, a_field, (a_node), (r_node)); \ - } \ - } \ -} while (0) - -#define rb_insert(a_type, a_field, a_cmp, a_tree, a_node) do { \ - a_type rbp_i_s; \ - a_type *rbp_i_g, *rbp_i_p, *rbp_i_c, *rbp_i_t, *rbp_i_u; \ - int rbp_i_cmp = 0; \ - rbp_i_g = &(a_tree)->rbt_nil; \ - rbp_left_set(a_type, a_field, &rbp_i_s, (a_tree)->rbt_root); \ - rbp_right_set(a_type, a_field, &rbp_i_s, &(a_tree)->rbt_nil); \ - rbp_black_set(a_type, a_field, &rbp_i_s); \ - rbp_i_p = &rbp_i_s; \ - rbp_i_c = (a_tree)->rbt_root; \ - /* Iteratively search down the tree for the insertion point, */\ - /* splitting 4-nodes as they are encountered. At the end of each */\ - /* iteration, rbp_i_g->rbp_i_p->rbp_i_c is a 3-level path down */\ - /* the tree, assuming a sufficiently deep tree. 
*/\ - while (rbp_i_c != &(a_tree)->rbt_nil) { \ - rbp_i_t = rbp_left_get(a_type, a_field, rbp_i_c); \ - rbp_i_u = rbp_left_get(a_type, a_field, rbp_i_t); \ - if (rbp_red_get(a_type, a_field, rbp_i_t) \ - && rbp_red_get(a_type, a_field, rbp_i_u)) { \ - /* rbp_i_c is the top of a logical 4-node, so split it. */\ - /* This iteration does not move down the tree, due to the */\ - /* disruptiveness of node splitting. */\ - /* */\ - /* Rotate right. */\ - rbp_rotate_right(a_type, a_field, rbp_i_c, rbp_i_t); \ - /* Pass red links up one level. */\ - rbp_i_u = rbp_left_get(a_type, a_field, rbp_i_t); \ - rbp_black_set(a_type, a_field, rbp_i_u); \ - if (rbp_left_get(a_type, a_field, rbp_i_p) == rbp_i_c) { \ - rbp_left_set(a_type, a_field, rbp_i_p, rbp_i_t); \ - rbp_i_c = rbp_i_t; \ - } else { \ - /* rbp_i_c was the right child of rbp_i_p, so rotate */\ - /* left in order to maintain the left-leaning */\ - /* invariant. */\ - assert(rbp_right_get(a_type, a_field, rbp_i_p) \ - == rbp_i_c); \ - rbp_right_set(a_type, a_field, rbp_i_p, rbp_i_t); \ - rbp_lean_left(a_type, a_field, rbp_i_p, rbp_i_u); \ - if (rbp_left_get(a_type, a_field, rbp_i_g) == rbp_i_p) {\ - rbp_left_set(a_type, a_field, rbp_i_g, rbp_i_u); \ - } else { \ - assert(rbp_right_get(a_type, a_field, rbp_i_g) \ - == rbp_i_p); \ - rbp_right_set(a_type, a_field, rbp_i_g, rbp_i_u); \ - } \ - rbp_i_p = rbp_i_u; \ - rbp_i_cmp = (a_cmp)((a_node), rbp_i_p); \ - if (rbp_i_cmp < 0) { \ - rbp_i_c = rbp_left_get(a_type, a_field, rbp_i_p); \ - } else { \ - assert(rbp_i_cmp > 0); \ - rbp_i_c = rbp_right_get(a_type, a_field, rbp_i_p); \ - } \ - continue; \ - } \ - } \ - rbp_i_g = rbp_i_p; \ - rbp_i_p = rbp_i_c; \ - rbp_i_cmp = (a_cmp)((a_node), rbp_i_c); \ - if (rbp_i_cmp < 0) { \ - rbp_i_c = rbp_left_get(a_type, a_field, rbp_i_c); \ - } else { \ - assert(rbp_i_cmp > 0); \ - rbp_i_c = rbp_right_get(a_type, a_field, rbp_i_c); \ - } \ - } \ - /* rbp_i_p now refers to the node under which to insert. 
*/\ - rbp_node_new(a_type, a_field, a_tree, (a_node)); \ - if (rbp_i_cmp > 0) { \ - rbp_right_set(a_type, a_field, rbp_i_p, (a_node)); \ - rbp_lean_left(a_type, a_field, rbp_i_p, rbp_i_t); \ - if (rbp_left_get(a_type, a_field, rbp_i_g) == rbp_i_p) { \ - rbp_left_set(a_type, a_field, rbp_i_g, rbp_i_t); \ - } else if (rbp_right_get(a_type, a_field, rbp_i_g) == rbp_i_p) {\ - rbp_right_set(a_type, a_field, rbp_i_g, rbp_i_t); \ - } \ - } else { \ - rbp_left_set(a_type, a_field, rbp_i_p, (a_node)); \ - } \ - /* Update the root and make sure that it is black. */\ - (a_tree)->rbt_root = rbp_left_get(a_type, a_field, &rbp_i_s); \ - rbp_black_set(a_type, a_field, (a_tree)->rbt_root); \ -} while (0) - -#define rb_remove(a_type, a_field, a_cmp, a_tree, a_node) do { \ - a_type rbp_r_s; \ - a_type *rbp_r_p, *rbp_r_c, *rbp_r_xp, *rbp_r_t, *rbp_r_u; \ - int rbp_r_cmp; \ - rbp_left_set(a_type, a_field, &rbp_r_s, (a_tree)->rbt_root); \ - rbp_right_set(a_type, a_field, &rbp_r_s, &(a_tree)->rbt_nil); \ - rbp_black_set(a_type, a_field, &rbp_r_s); \ - rbp_r_p = &rbp_r_s; \ - rbp_r_c = (a_tree)->rbt_root; \ - rbp_r_xp = &(a_tree)->rbt_nil; \ - /* Iterate down the tree, but always transform 2-nodes to 3- or */\ - /* 4-nodes in order to maintain the invariant that the current */\ - /* node is not a 2-node. This allows simple deletion once a leaf */\ - /* is reached. Handle the root specially though, since there may */\ - /* be no way to convert it from a 2-node to a 3-node. */\ - rbp_r_cmp = (a_cmp)((a_node), rbp_r_c); \ - if (rbp_r_cmp < 0) { \ - rbp_r_t = rbp_left_get(a_type, a_field, rbp_r_c); \ - rbp_r_u = rbp_left_get(a_type, a_field, rbp_r_t); \ - if (rbp_red_get(a_type, a_field, rbp_r_t) == false \ - && rbp_red_get(a_type, a_field, rbp_r_u) == false) { \ - /* Apply standard transform to prepare for left move. 
*/\ - rbp_move_red_left(a_type, a_field, rbp_r_c, rbp_r_t); \ - rbp_black_set(a_type, a_field, rbp_r_t); \ - rbp_left_set(a_type, a_field, rbp_r_p, rbp_r_t); \ - rbp_r_c = rbp_r_t; \ - } else { \ - /* Move left. */\ - rbp_r_p = rbp_r_c; \ - rbp_r_c = rbp_left_get(a_type, a_field, rbp_r_c); \ - } \ - } else { \ - if (rbp_r_cmp == 0) { \ - assert((a_node) == rbp_r_c); \ - if (rbp_right_get(a_type, a_field, rbp_r_c) \ - == &(a_tree)->rbt_nil) { \ - /* Delete root node (which is also a leaf node). */\ - if (rbp_left_get(a_type, a_field, rbp_r_c) \ - != &(a_tree)->rbt_nil) { \ - rbp_lean_right(a_type, a_field, rbp_r_c, rbp_r_t); \ - rbp_right_set(a_type, a_field, rbp_r_t, \ - &(a_tree)->rbt_nil); \ - } else { \ - rbp_r_t = &(a_tree)->rbt_nil; \ - } \ - rbp_left_set(a_type, a_field, rbp_r_p, rbp_r_t); \ - } else { \ - /* This is the node we want to delete, but we will */\ - /* instead swap it with its successor and delete the */\ - /* successor. Record enough information to do the */\ - /* swap later. rbp_r_xp is the a_node's parent. */\ - rbp_r_xp = rbp_r_p; \ - rbp_r_cmp = 1; /* Note that deletion is incomplete. */\ - } \ - } \ - if (rbp_r_cmp == 1) { \ - if (rbp_red_get(a_type, a_field, rbp_left_get(a_type, \ - a_field, rbp_right_get(a_type, a_field, rbp_r_c))) \ - == false) { \ - rbp_r_t = rbp_left_get(a_type, a_field, rbp_r_c); \ - if (rbp_red_get(a_type, a_field, rbp_r_t)) { \ - /* Standard transform. */\ - rbp_move_red_right(a_type, a_field, rbp_r_c, \ - rbp_r_t); \ - } else { \ - /* Root-specific transform. 
*/\ - rbp_red_set(a_type, a_field, rbp_r_c); \ - rbp_r_u = rbp_left_get(a_type, a_field, rbp_r_t); \ - if (rbp_red_get(a_type, a_field, rbp_r_u)) { \ - rbp_black_set(a_type, a_field, rbp_r_u); \ - rbp_rotate_right(a_type, a_field, rbp_r_c, \ - rbp_r_t); \ - rbp_rotate_left(a_type, a_field, rbp_r_c, \ - rbp_r_u); \ - rbp_right_set(a_type, a_field, rbp_r_t, \ - rbp_r_u); \ - } else { \ - rbp_red_set(a_type, a_field, rbp_r_t); \ - rbp_rotate_left(a_type, a_field, rbp_r_c, \ - rbp_r_t); \ - } \ - } \ - rbp_left_set(a_type, a_field, rbp_r_p, rbp_r_t); \ - rbp_r_c = rbp_r_t; \ - } else { \ - /* Move right. */\ - rbp_r_p = rbp_r_c; \ - rbp_r_c = rbp_right_get(a_type, a_field, rbp_r_c); \ - } \ - } \ - } \ - if (rbp_r_cmp != 0) { \ - while (true) { \ - assert(rbp_r_p != &(a_tree)->rbt_nil); \ - rbp_r_cmp = (a_cmp)((a_node), rbp_r_c); \ - if (rbp_r_cmp < 0) { \ - rbp_r_t = rbp_left_get(a_type, a_field, rbp_r_c); \ - if (rbp_r_t == &(a_tree)->rbt_nil) { \ - /* rbp_r_c now refers to the successor node to */\ - /* relocate, and rbp_r_xp/a_node refer to the */\ - /* context for the relocation. 
*/\ - if (rbp_left_get(a_type, a_field, rbp_r_xp) \ - == (a_node)) { \ - rbp_left_set(a_type, a_field, rbp_r_xp, \ - rbp_r_c); \ - } else { \ - assert(rbp_right_get(a_type, a_field, \ - rbp_r_xp) == (a_node)); \ - rbp_right_set(a_type, a_field, rbp_r_xp, \ - rbp_r_c); \ - } \ - rbp_left_set(a_type, a_field, rbp_r_c, \ - rbp_left_get(a_type, a_field, (a_node))); \ - rbp_right_set(a_type, a_field, rbp_r_c, \ - rbp_right_get(a_type, a_field, (a_node))); \ - rbp_color_set(a_type, a_field, rbp_r_c, \ - rbp_red_get(a_type, a_field, (a_node))); \ - if (rbp_left_get(a_type, a_field, rbp_r_p) \ - == rbp_r_c) { \ - rbp_left_set(a_type, a_field, rbp_r_p, \ - &(a_tree)->rbt_nil); \ - } else { \ - assert(rbp_right_get(a_type, a_field, rbp_r_p) \ - == rbp_r_c); \ - rbp_right_set(a_type, a_field, rbp_r_p, \ - &(a_tree)->rbt_nil); \ - } \ - break; \ - } \ - rbp_r_u = rbp_left_get(a_type, a_field, rbp_r_t); \ - if (rbp_red_get(a_type, a_field, rbp_r_t) == false \ - && rbp_red_get(a_type, a_field, rbp_r_u) == false) { \ - rbp_move_red_left(a_type, a_field, rbp_r_c, \ - rbp_r_t); \ - if (rbp_left_get(a_type, a_field, rbp_r_p) \ - == rbp_r_c) { \ - rbp_left_set(a_type, a_field, rbp_r_p, rbp_r_t);\ - } else { \ - rbp_right_set(a_type, a_field, rbp_r_p, \ - rbp_r_t); \ - } \ - rbp_r_c = rbp_r_t; \ - } else { \ - rbp_r_p = rbp_r_c; \ - rbp_r_c = rbp_left_get(a_type, a_field, rbp_r_c); \ - } \ - } else { \ - /* Check whether to delete this node (it has to be */\ - /* the correct node and a leaf node). */\ - if (rbp_r_cmp == 0) { \ - assert((a_node) == rbp_r_c); \ - if (rbp_right_get(a_type, a_field, rbp_r_c) \ - == &(a_tree)->rbt_nil) { \ - /* Delete leaf node. 
*/\ - if (rbp_left_get(a_type, a_field, rbp_r_c) \ - != &(a_tree)->rbt_nil) { \ - rbp_lean_right(a_type, a_field, rbp_r_c, \ - rbp_r_t); \ - rbp_right_set(a_type, a_field, rbp_r_t, \ - &(a_tree)->rbt_nil); \ - } else { \ - rbp_r_t = &(a_tree)->rbt_nil; \ - } \ - if (rbp_left_get(a_type, a_field, rbp_r_p) \ - == rbp_r_c) { \ - rbp_left_set(a_type, a_field, rbp_r_p, \ - rbp_r_t); \ - } else { \ - rbp_right_set(a_type, a_field, rbp_r_p, \ - rbp_r_t); \ - } \ - break; \ - } else { \ - /* This is the node we want to delete, but we */\ - /* will instead swap it with its successor */\ - /* and delete the successor. Record enough */\ - /* information to do the swap later. */\ - /* rbp_r_xp is a_node's parent. */\ - rbp_r_xp = rbp_r_p; \ - } \ - } \ - rbp_r_t = rbp_right_get(a_type, a_field, rbp_r_c); \ - rbp_r_u = rbp_left_get(a_type, a_field, rbp_r_t); \ - if (rbp_red_get(a_type, a_field, rbp_r_u) == false) { \ - rbp_move_red_right(a_type, a_field, rbp_r_c, \ - rbp_r_t); \ - if (rbp_left_get(a_type, a_field, rbp_r_p) \ - == rbp_r_c) { \ - rbp_left_set(a_type, a_field, rbp_r_p, rbp_r_t);\ - } else { \ - rbp_right_set(a_type, a_field, rbp_r_p, \ - rbp_r_t); \ - } \ - rbp_r_c = rbp_r_t; \ - } else { \ - rbp_r_p = rbp_r_c; \ - rbp_r_c = rbp_right_get(a_type, a_field, rbp_r_c); \ - } \ - } \ - } \ - } \ - /* Update root. */\ - (a_tree)->rbt_root = rbp_left_get(a_type, a_field, &rbp_r_s); \ -} while (0) - -/* - * The rb_wrap() macro provides a convenient way to wrap functions around the - * cpp macros. The main benefits of wrapping are that 1) repeated macro - * expansion can cause code bloat, especially for rb_{insert,remove)(), and - * 2) type, linkage, comparison functions, etc. need not be specified at every - * call point. 
- */ - -#define rb_wrap(a_attr, a_prefix, a_tree_type, a_type, a_field, a_cmp) \ -a_attr void \ -a_prefix##new(a_tree_type *tree) { \ - rb_new(a_type, a_field, tree); \ -} \ -a_attr a_type * \ -a_prefix##first(a_tree_type *tree) { \ - a_type *ret; \ - rb_first(a_type, a_field, tree, ret); \ - return (ret); \ -} \ -a_attr a_type * \ -a_prefix##last(a_tree_type *tree) { \ - a_type *ret; \ - rb_last(a_type, a_field, tree, ret); \ - return (ret); \ -} \ -a_attr a_type * \ -a_prefix##next(a_tree_type *tree, a_type *node) { \ - a_type *ret; \ - rb_next(a_type, a_field, a_cmp, tree, node, ret); \ - return (ret); \ -} \ -a_attr a_type * \ -a_prefix##prev(a_tree_type *tree, a_type *node) { \ - a_type *ret; \ - rb_prev(a_type, a_field, a_cmp, tree, node, ret); \ - return (ret); \ -} \ -a_attr a_type * \ -a_prefix##search(a_tree_type *tree, a_type *key) { \ - a_type *ret; \ - rb_search(a_type, a_field, a_cmp, tree, key, ret); \ - return (ret); \ -} \ -a_attr a_type * \ -a_prefix##nsearch(a_tree_type *tree, a_type *key) { \ - a_type *ret; \ - rb_nsearch(a_type, a_field, a_cmp, tree, key, ret); \ - return (ret); \ -} \ -a_attr a_type * \ -a_prefix##psearch(a_tree_type *tree, a_type *key) { \ - a_type *ret; \ - rb_psearch(a_type, a_field, a_cmp, tree, key, ret); \ - return (ret); \ -} \ -a_attr void \ -a_prefix##insert(a_tree_type *tree, a_type *node) { \ - rb_insert(a_type, a_field, a_cmp, tree, node); \ -} \ -a_attr void \ -a_prefix##remove(a_tree_type *tree, a_type *node) { \ - rb_remove(a_type, a_field, a_cmp, tree, node); \ -} - -/* - * The iterators simulate recursion via an array of pointers that store the - * current path. This is critical to performance, since a series of calls to - * rb_{next,prev}() would require time proportional to (n lg n), whereas this - * implementation only requires time proportional to (n). - * - * Since the iterators cache a path down the tree, any tree modification may - * cause the cached path to become invalid. 
In order to continue iteration, - * use something like the following sequence: - * - * { - * a_type *node, *tnode; - * - * rb_foreach_begin(a_type, a_field, a_tree, node) { - * ... - * rb_next(a_type, a_field, a_cmp, a_tree, node, tnode); - * rb_remove(a_type, a_field, a_cmp, a_tree, node); - * rb_foreach_next(a_type, a_field, a_cmp, a_tree, tnode); - * ... - * } rb_foreach_end(a_type, a_field, a_tree, node) - * } - * - * Note that this idiom is not advised if every iteration modifies the tree, - * since in that case there is no algorithmic complexity improvement over a - * series of rb_{next,prev}() calls, thus making the setup overhead wasted - * effort. - */ - -#define rb_foreach_begin(a_type, a_field, a_tree, a_var) { \ - /* Compute the maximum possible tree depth (3X the black height). */\ - unsigned rbp_f_height; \ - rbp_black_height(a_type, a_field, a_tree, rbp_f_height); \ - rbp_f_height *= 3; \ - { \ - /* Initialize the path to contain the left spine. */\ - a_type *rbp_f_path[rbp_f_height]; \ - a_type *rbp_f_node; \ - bool rbp_f_synced = false; \ - unsigned rbp_f_depth = 0; \ - if ((a_tree)->rbt_root != &(a_tree)->rbt_nil) { \ - rbp_f_path[rbp_f_depth] = (a_tree)->rbt_root; \ - rbp_f_depth++; \ - while ((rbp_f_node = rbp_left_get(a_type, a_field, \ - rbp_f_path[rbp_f_depth-1])) != &(a_tree)->rbt_nil) { \ - rbp_f_path[rbp_f_depth] = rbp_f_node; \ - rbp_f_depth++; \ - } \ - } \ - /* While the path is non-empty, iterate. */\ - while (rbp_f_depth > 0) { \ - (a_var) = rbp_f_path[rbp_f_depth-1]; - -/* Only use if modifying the tree during iteration. */ -#define rb_foreach_next(a_type, a_field, a_cmp, a_tree, a_node) \ - /* Re-initialize the path to contain the path to a_node. 
*/\ - rbp_f_depth = 0; \ - if (a_node != NULL) { \ - if ((a_tree)->rbt_root != &(a_tree)->rbt_nil) { \ - rbp_f_path[rbp_f_depth] = (a_tree)->rbt_root; \ - rbp_f_depth++; \ - rbp_f_node = rbp_f_path[0]; \ - while (true) { \ - int rbp_f_cmp = (a_cmp)((a_node), \ - rbp_f_path[rbp_f_depth-1]); \ - if (rbp_f_cmp < 0) { \ - rbp_f_node = rbp_left_get(a_type, a_field, \ - rbp_f_path[rbp_f_depth-1]); \ - } else if (rbp_f_cmp > 0) { \ - rbp_f_node = rbp_right_get(a_type, a_field, \ - rbp_f_path[rbp_f_depth-1]); \ - } else { \ - break; \ - } \ - assert(rbp_f_node != &(a_tree)->rbt_nil); \ - rbp_f_path[rbp_f_depth] = rbp_f_node; \ - rbp_f_depth++; \ - } \ - } \ - } \ - rbp_f_synced = true; - -#define rb_foreach_end(a_type, a_field, a_tree, a_var) \ - if (rbp_f_synced) { \ - rbp_f_synced = false; \ - continue; \ - } \ - /* Find the successor. */\ - if ((rbp_f_node = rbp_right_get(a_type, a_field, \ - rbp_f_path[rbp_f_depth-1])) != &(a_tree)->rbt_nil) { \ - /* The successor is the left-most node in the right */\ - /* subtree. */\ - rbp_f_path[rbp_f_depth] = rbp_f_node; \ - rbp_f_depth++; \ - while ((rbp_f_node = rbp_left_get(a_type, a_field, \ - rbp_f_path[rbp_f_depth-1])) != &(a_tree)->rbt_nil) { \ - rbp_f_path[rbp_f_depth] = rbp_f_node; \ - rbp_f_depth++; \ - } \ - } else { \ - /* The successor is above the current node. Unwind */\ - /* until a left-leaning edge is removed from the */\ - /* path, or the path is empty. */\ - for (rbp_f_depth--; rbp_f_depth > 0; rbp_f_depth--) { \ - if (rbp_left_get(a_type, a_field, \ - rbp_f_path[rbp_f_depth-1]) \ - == rbp_f_path[rbp_f_depth]) { \ - break; \ - } \ - } \ - } \ - } \ - } \ -} - -#define rb_foreach_reverse_begin(a_type, a_field, a_tree, a_var) { \ - /* Compute the maximum possible tree depth (3X the black height). */\ - unsigned rbp_fr_height; \ - rbp_black_height(a_type, a_field, a_tree, rbp_fr_height); \ - rbp_fr_height *= 3; \ - { \ - /* Initialize the path to contain the right spine. 
*/\ - a_type *rbp_fr_path[rbp_fr_height]; \ - a_type *rbp_fr_node; \ - bool rbp_fr_synced = false; \ - unsigned rbp_fr_depth = 0; \ - if ((a_tree)->rbt_root != &(a_tree)->rbt_nil) { \ - rbp_fr_path[rbp_fr_depth] = (a_tree)->rbt_root; \ - rbp_fr_depth++; \ - while ((rbp_fr_node = rbp_right_get(a_type, a_field, \ - rbp_fr_path[rbp_fr_depth-1])) != &(a_tree)->rbt_nil) { \ - rbp_fr_path[rbp_fr_depth] = rbp_fr_node; \ - rbp_fr_depth++; \ - } \ - } \ - /* While the path is non-empty, iterate. */\ - while (rbp_fr_depth > 0) { \ - (a_var) = rbp_fr_path[rbp_fr_depth-1]; - -/* Only use if modifying the tree during iteration. */ -#define rb_foreach_reverse_prev(a_type, a_field, a_cmp, a_tree, a_node) \ - /* Re-initialize the path to contain the path to a_node. */\ - rbp_fr_depth = 0; \ - if (a_node != NULL) { \ - if ((a_tree)->rbt_root != &(a_tree)->rbt_nil) { \ - rbp_fr_path[rbp_fr_depth] = (a_tree)->rbt_root; \ - rbp_fr_depth++; \ - rbp_fr_node = rbp_fr_path[0]; \ - while (true) { \ - int rbp_fr_cmp = (a_cmp)((a_node), \ - rbp_fr_path[rbp_fr_depth-1]); \ - if (rbp_fr_cmp < 0) { \ - rbp_fr_node = rbp_left_get(a_type, a_field, \ - rbp_fr_path[rbp_fr_depth-1]); \ - } else if (rbp_fr_cmp > 0) { \ - rbp_fr_node = rbp_right_get(a_type, a_field,\ - rbp_fr_path[rbp_fr_depth-1]); \ - } else { \ - break; \ - } \ - assert(rbp_fr_node != &(a_tree)->rbt_nil); \ - rbp_fr_path[rbp_fr_depth] = rbp_fr_node; \ - rbp_fr_depth++; \ - } \ - } \ - } \ - rbp_fr_synced = true; - -#define rb_foreach_reverse_end(a_type, a_field, a_tree, a_var) \ - if (rbp_fr_synced) { \ - rbp_fr_synced = false; \ - continue; \ - } \ - if (rbp_fr_depth == 0) { \ - /* rb_foreach_reverse_sync() was called with a NULL */\ - /* a_node. */\ - break; \ - } \ - /* Find the predecessor. */\ - if ((rbp_fr_node = rbp_left_get(a_type, a_field, \ - rbp_fr_path[rbp_fr_depth-1])) != &(a_tree)->rbt_nil) { \ - /* The predecessor is the right-most node in the left */\ - /* subtree. 
*/\ - rbp_fr_path[rbp_fr_depth] = rbp_fr_node; \ - rbp_fr_depth++; \ - while ((rbp_fr_node = rbp_right_get(a_type, a_field, \ - rbp_fr_path[rbp_fr_depth-1])) != &(a_tree)->rbt_nil) {\ - rbp_fr_path[rbp_fr_depth] = rbp_fr_node; \ - rbp_fr_depth++; \ - } \ - } else { \ - /* The predecessor is above the current node. Unwind */\ - /* until a right-leaning edge is removed from the */\ - /* path, or the path is empty. */\ - for (rbp_fr_depth--; rbp_fr_depth > 0; rbp_fr_depth--) {\ - if (rbp_right_get(a_type, a_field, \ - rbp_fr_path[rbp_fr_depth-1]) \ - == rbp_fr_path[rbp_fr_depth]) { \ - break; \ - } \ - } \ - } \ - } \ - } \ -} - -#endif /* RB_H_ */ From tfheen at varnish-cache.org Wed Jun 12 11:15:35 2013 From: tfheen at varnish-cache.org (Tollef Fog Heen) Date: Wed, 12 Jun 2013 13:15:35 +0200 Subject: [master] 7754eb3 Clear up time to first byte description in varnishncsa man page Message-ID: commit 7754eb3baecceadb9051dc4f768e09c6ce0ef176 Author: Tollef Fog Heen Date: Wed Jun 12 13:12:02 2013 +0200 Clear up time to first byte description in varnishncsa man page Fixes #1305 diff --git a/doc/sphinx/reference/varnishncsa.rst b/doc/sphinx/reference/varnishncsa.rst index 8cd7b05..3acac45 100644 --- a/doc/sphinx/reference/varnishncsa.rst +++ b/doc/sphinx/reference/varnishncsa.rst @@ -112,7 +112,8 @@ The following options are available: Extended variables. Supported variables are: Varnish:time_firstbyte - Time to the first byte from the backend arrived + Time from when the request processing starts + until the first byte is sent to the client. Varnish:hitmiss Whether the request was a cache hit or miss. 
Pipe From tfheen at varnish-cache.org Wed Jun 12 11:15:35 2013 From: tfheen at varnish-cache.org (Tollef Fog Heen) Date: Wed, 12 Jun 2013 13:15:35 +0200 Subject: [master] 56a571a Handle input from stdin properly in varnishadm Message-ID: commit 56a571a70f9e00545b5d5f08437fc65425c1a573 Author: Tollef Fog Heen Date: Wed Jun 12 13:07:36 2013 +0200 Handle input from stdin properly in varnishadm readline doesn't really handle when input comes from a file or pipe, so only use readline if stdin is a tty. Thanks to johnnyrun for a patch which was used for inspiration here. Fixes #1314 diff --git a/bin/varnishadm/varnishadm.c b/bin/varnishadm/varnishadm.c index 4029178..0e5706b 100644 --- a/bin/varnishadm/varnishadm.c +++ b/bin/varnishadm/varnishadm.c @@ -227,7 +227,7 @@ varnishadm_completion (const char *text, int start, int end) * Send a "banner" to varnish, to provoke a welcome message. */ static void -pass(int sock) +interactive(int sock) { struct pollfd fds[2]; char buf[1024]; @@ -236,11 +236,7 @@ pass(int sock) unsigned u, status; _line_sock = sock; rl_already_prompted = 1; - if (isatty(0)) { - rl_callback_handler_install("varnish> ", send_line); - } else { - rl_callback_handler_install("", send_line); - } + rl_callback_handler_install("varnish> ", send_line); rl_attempted_completion_function = varnishadm_completion; fds[0].fd = sock; @@ -311,6 +307,65 @@ pass(int sock) } } +/* + * No arguments given, simply pass bytes on stdin/stdout and CLI socket + */ +static void +pass(int sock) +{ + struct pollfd fds[2]; + char buf[1024]; + int i; + char *answer = NULL; + unsigned u, status; + ssize_t n; + + fds[0].fd = sock; + fds[0].events = POLLIN; + fds[1].fd = 0; + fds[1].events = POLLIN; + while (1) { + i = poll(fds, 2, -1); + if (i == -1 && errno == EINTR) { + continue; + } + assert(i > 0); + if (fds[0].revents & POLLIN) { + u = VCLI_ReadResult(fds[0].fd, &status, &answer, + timeout); + if (u) { + if (status == CLIS_COMMS) + RL_EXIT(0); + if (answer) + fprintf(stderr, 
"%s\n", answer); + RL_EXIT(1); + } + + sprintf(buf, "%u\n", status); + u = write(1, buf, strlen(buf)); + if (answer) { + u = write(1, answer, strlen(answer)); + u = write(1, "\n", 1); + free(answer); + answer = NULL; + } + } + if (fds[1].revents & POLLIN || fds[1].revents & POLLHUP) { + n = read(fds[1].fd, buf, sizeof buf); + if (n == 0) { + AZ(shutdown(sock, SHUT_WR)); + fds[1].fd = -1; + } else if (n < 0) { + RL_EXIT(0); + } else { + buf[n] = '\0'; + cli_write(sock, buf); + } + } + } +} + + static void usage(void) { @@ -414,8 +469,12 @@ main(int argc, char * const *argv) if (argc > 0) do_args(sock, argc, argv); - else - pass(sock); - + else { + if (isatty(0)) { + interactive(sock); + } else { + pass(sock); + } + } exit(0); } From tfheen at varnish-cache.org Wed Jun 12 11:25:27 2013 From: tfheen at varnish-cache.org (Tollef Fog Heen) Date: Wed, 12 Jun 2013 13:25:27 +0200 Subject: [master] 41721fd Fix path in initscript Message-ID: commit 41721fdb41c4ec9a2ed9346fa3b1f27b1c2fe6bb Author: Tollef Fog Heen Date: Wed Jun 12 13:24:19 2013 +0200 Fix path in initscript Thanks to shin1x1 for the patch. 
diff --git a/redhat/varnish.initrc b/redhat/varnish.initrc index 4afa6c5..c2faa25 100755 --- a/redhat/varnish.initrc +++ b/redhat/varnish.initrc @@ -26,7 +26,7 @@ retval=0 pidfile=/var/run/varnish.pid exec="/usr/sbin/varnishd" -reload_exec="/usr/bin/varnish_reload_vcl" +reload_exec="/usr/sbin/varnish_reload_vcl" prog="varnishd" config="/etc/sysconfig/varnish" lockfile="/var/lock/subsys/varnish" From perbu at varnish-cache.org Wed Jun 12 13:11:58 2013 From: perbu at varnish-cache.org (Per Buer) Date: Wed, 12 Jun 2013 15:11:58 +0200 Subject: [master] f9784fe copy-edits https://github.com/varnish/Varnish-Cache/pull/13 Message-ID: commit f9784fea2851a32e189955018610ce07e238d6e1 Author: Per Buer Date: Wed Jun 12 15:00:21 2013 +0200 copy-edits https://github.com/varnish/Varnish-Cache/pull/13 by xiongchiamiov diff --git a/doc/sphinx/phk/spdy.rst b/doc/sphinx/phk/spdy.rst index 68bf000..7cc1f4f 100644 --- a/doc/sphinx/phk/spdy.rst +++ b/doc/sphinx/phk/spdy.rst @@ -8,7 +8,7 @@ It's dawning on me that I'm sort of the hipster of hipsters, in the sense that I tend to do things far before other people do, but totally fail to communicate what's going on out there in the future, and thus by the time the "real hipsters" catch up, I'm already somewhere different and -more insteresting. +more interesting. My one lucky break was the `bikeshed email `_ where I actually did sit down and compose some of my thoughts, thus firmly @@ -27,18 +27,18 @@ The evolution of Varnish When we started out, seven years ago, our only and entire goal was to build a server-side cache better than squid. That we did. -Since then we have added stuff to Varnish, ESI:includes, gzip support, -VMODS and I'm staring at streaming and conditional backend fetches right +Since then we have added stuff to Varnish (ESI:includes, gzip support, +VMODS) and I'm staring at streaming and conditional backend fetches right now. 
Varnish is a bit more than a web-cache now, but it is still, basically, -a layer of polish you put in front of your webserver to get it too +a layer of polish you put in front of your webserver to get it to look and work better. -Googles experiments with SPDY have forced a HTTP/2.0 effort into motion, +Google's experiments with SPDY have forced a HTTP/2.0 effort into motion, but if past performance is any indication, that is not something we have -to really worry about for a number of years, the IETF WG has still to -manage to "clarify" RFC2616 which defines HTTP/1.1 and to say there +to really worry about for a number of years. The IETF WG has still to +manage to "clarify" RFC2616 which defines HTTP/1.1, and to say there is anything even remotely resembling consensus behind SPDY would be a downright lie. @@ -46,20 +46,20 @@ RFC2616 is from June 1999, which, to me, means that we should look at 2035 when we design HTTP/2.0, and predicting things is well known to be hard, in particular with respect to the future. -So what's a Varnish architect to do ? +So what's a Varnish architect to do? What I did this summer vacation, was to think a lot about how Varnish can be architected to cope with the kind of changes SPDY and maybe HTTP/2.0 -drag in: Pipelining, multiplexing etc, without committing us to one +drag in: Pipelining, multiplexing, etc., without committing us to one particular path of science fiction about life in 2035. -Profound insights often sound incredibly simplistic bordering +Profound insights often sound incredibly simplistic, bordering trivial, until you consider the full ramifications. The implementation -of "Do Not Kill" is in current law is surprisingly voluminous. (If -you don't think so, you probably forgot to #include the Wienna +of "Do Not Kill" in current law is surprisingly voluminous. (If +you don't think so, you probably forgot to #include the Vienna Treaty and the convention about chemical and biological weapons.) 
-So my insight about Varnish, that it has to become a socket-wrench like +So my insight about Varnish, that it has to become a socket-wrench-like toolchest for doing things with HTTP traffic, will probably elicit a lot of "duh!" reactions, until people, including me, understand the ramifications more fully. @@ -77,7 +77,7 @@ of finite sized data elements. That is not how the future looks. -For instance one of the things SPDY have tried out is "server push", +For instance one of the things SPDY has tried out is "server push", where you fetch index.html and the webserver says "you'll also want main.css and cat.gif then" and pushes those objects on the client, to save the round-trip times wasted waiting for the client to ask @@ -87,7 +87,7 @@ Today, something like that is impossible in Varnish, since objects are independent and you can only look up one at a time. I already can hear some of you amazing VCL wizards say "Well, -if you inline-C grab a refcount, then restart and ..." but lets +if you inline-C grab a refcount, then restart and ..." but let's be honest, that's not how it should look. You should be able to do something like:: @@ -107,21 +107,21 @@ And doing that is not really *that* hard, I think. We just need to keep track of all the objects we instantiate and make sure they disappear and die when nobody is using them any more. -But a lot of the assumptions we made back in 2006 are no longer +A lot of the assumptions we made back in 2006 are no longer valid under such an architecture, but those same assumptions are what gives Varnish such astonishing performance, so just replacing them with standard CS-textbook solutions like "garbage collection" -would make Varnish loose a lot of its lustre. +would make Varnish lose a lot of its lustre. 
As some of you know, there is a lot of modularity hidden inside -Varnish but not quite released for public use in VCL, much of what -is going to happen, will be polishing up and documenting that +Varnish but not quite released for public use in VCL. Much of what +is going to happen will be polishing up and documenting that modularity and releasing it for you guys to have fun with, so it is not like we are starting from scratch or anything. But some of that modularity stands on foundations which are no longer -firm, for instance that the initiating request exists for the -full duration of a backend fetch. +firm; for instance, the initiating request exists for the full duration of +a backend fetch. Those will take some work to fix. diff --git a/doc/sphinx/phk/varnish_does_not_hash.rst b/doc/sphinx/phk/varnish_does_not_hash.rst index 8393c87..e03f078 100644 --- a/doc/sphinx/phk/varnish_does_not_hash.rst +++ b/doc/sphinx/phk/varnish_does_not_hash.rst @@ -14,7 +14,7 @@ Varnish does not hash, at least not by default, and even if it does, it's still as immune to the attacks as can be. To understand what is going on, I have to introduce a concept from -Shannons information theory: "entropy." +Shannon's information theory: "entropy." Entropy is hard to explain, and according to legend, that is exactly why Shannon recycled that term from thermodynamics. @@ -35,10 +35,10 @@ storing the objects in an array indexed by that key. Typically, but not always, the key is a string and the index is a (smallish) integer, and the job of the hash-function is to squeeze -the key into the integer, without loosing any of the entropy. +the key into the integer, without losing any of the entropy. Needless to say, the more entropy you have to begin with, the more -of it you can afford to loose, and loose some you almost invariably +of it you can afford to lose, and lose some you almost invariably will. 
There are two families of hash-functions, the fast ones, and the good @@ -64,12 +64,12 @@ What Varnish Does ----------------- The way to avoid having hash-collisions is to not use a hash: Use a -tree instead, there every object has its own place and there are no +tree instead. There every object has its own place and there are no collisions. Varnish does that, but with a twist. -The "keys" in varnish can be very long, by default they consist of:: +The "keys" in Varnish can be very long; by default they consist of:: sub vcl_hash { hash_data(req.url); @@ -98,7 +98,7 @@ each object in the far too common case seen above. But furthermore, we want the tree to be very fast to do lookups in, preferably it should be lockless for lookups, and that means that we cannot (realistically) use any of the "smart" trees which -automatically balance themselves etc. +automatically balance themselves, etc. You (generally) don't need a "smart" tree if your keys look like random data in the order they arrive, but we can pretty @@ -109,8 +109,8 @@ But we can make the keys look random, and make them small and fixed size at the same time, and the perfect functions designed for just that task are the "good" hash-functions, the cryptographic ones. -So what Varnish does is "key-compression": All the strings hash_data() -are fed, are pushed through a cryptographic hash algorithm called +So what Varnish does is "key-compression": All the strings fed to +hash_data() are pushed through a cryptographic hash algorithm called SHA256, which, as the name says, always spits out 256 bits (= 32 bytes), no matter how many bits you feed it. @@ -134,8 +134,8 @@ That should be random enough. But the key-compression does introduce a risk of collisions, since not even SHA256 can guarantee different outputs for all possible -inputs: Try pushing all the possible 33 bytes long files through -SHA256 and sooner or later you will get collisions. 
+inputs: Try pushing all the possible 33-byte files through SHA256 +and sooner or later you will get collisions. The risk of collision is very small however, and I can all but promise you, that you will be fully offset in fame and money for From perbu at varnish-cache.org Wed Jun 12 13:11:58 2013 From: perbu at varnish-cache.org (Per Buer) Date: Wed, 12 Jun 2013 15:11:58 +0200 Subject: [master] b3d1702 typo and whitespace https://github.com/varnish/Varnish-Cache/pull/16 Message-ID: commit b3d1702a19720fd04e645ba45644244c258b14fe Author: Per Buer Date: Wed Jun 12 15:03:26 2013 +0200 typo and whitespace https://github.com/varnish/Varnish-Cache/pull/16 by psa diff --git a/doc/sphinx/reference/varnish-cli.rst b/doc/sphinx/reference/varnish-cli.rst index c34ae7b..45bfc92 100644 --- a/doc/sphinx/reference/varnish-cli.rst +++ b/doc/sphinx/reference/varnish-cli.rst @@ -21,14 +21,14 @@ without interrupting the running service. The CLI can be used for the following tasks: configuration - You can upload, change and delete VCL files from the CLI. + You can upload, change and delete VCL files from the CLI. -parameters +parameters You can inspect and change the various parameters Varnish has available through the CLI. The individual parameters are documented in the varnishd(1) man page. -bans +bans Bans are filters that are applied to keep Varnish from serving stale content. When you issue a ban Varnish will not serve any *banned* object from cache, but rather re-fetch it from its @@ -233,13 +233,13 @@ An authenticated session looks like this:: Trying 127.0.0.1... Connected to localhost. Escape character is '^]'. - 107 59 + 107 59 ixslvvxrgkjptxmcgnnsdxsvdmvfympg - + Authentication required. - + auth 455ce847f0073c7ab3b1465f74507b75d3dc064c1e7de3b71e00de9092fdc89a - 200 193 + 200 193 ----------------------------- Varnish HTTP accelerator CLI. 
----------------------------- @@ -280,7 +280,7 @@ In the above example, the secret file contained foo\n and thus:: 00000030 70 74 78 6d 63 67 6e 6e 73 64 78 73 76 64 6d 76 |ptxmcgnnsdxsvdmv| 00000040 66 79 6d 70 67 0a |fympg.| 00000046 - critter phk> sha256 _ + critter phk> sha256 _ SHA256 (_) = 455ce847f0073c7ab3b1465f74507b75d3dc064c1e7de3b71e00de9092fdc89a critter phk> openssl dgst -sha256 < _ 455ce847f0073c7ab3b1465f74507b75d3dc064c1e7de3b71e00de9092fdc89a From perbu at varnish-cache.org Wed Jun 12 13:11:58 2013 From: perbu at varnish-cache.org (Per Buer) Date: Wed, 12 Jun 2013 15:11:58 +0200 Subject: [master] f7361a8 doc .initial https://github.com/varnish/Varnish-Cache/pull/14 Message-ID: commit f7361a851cd9ae650c0efed6e143fb7391d958fd Author: Per Buer Date: Wed Jun 12 15:07:42 2013 +0200 doc .initial https://github.com/varnish/Varnish-Cache/pull/14 diff --git a/doc/sphinx/reference/vcl.rst b/doc/sphinx/reference/vcl.rst index 01a52ad..ab63679 100644 --- a/doc/sphinx/reference/vcl.rst +++ b/doc/sphinx/reference/vcl.rst @@ -197,9 +197,12 @@ Probes take the following parameters: How many of the latest polls we examine to determine backend health. Defaults to 8. .threshold - How many of the polls in .window must have succeeded for us to consider - the backend healthy. - Defaults to 3. + How many of the polls in .window must have succeeded for us to + consider the backend healthy. If this is set to more than or equal + to the threshold, the backend starts as healthy. Defaults to the + value of threshold - 1. In this case, the backend starts as sick and + requires one poll to pass to become healthy. + Defaults to threshold - 1. .initial How many of the polls in .window are considered good when Varnish starts. Defaults to the value of threshold - 1. 
In this case, the From tfheen at varnish-cache.org Thu Jun 13 09:00:58 2013 From: tfheen at varnish-cache.org (Tollef Fog Heen) Date: Thu, 13 Jun 2013 11:00:58 +0200 Subject: [3.0] 496e943 Document changes Message-ID: commit 496e9432426dedf962e0fbc6aa9db50e0ecf5ad3 Author: Tollef Fog Heen Date: Thu Jun 13 10:51:18 2013 +0200 Document changes diff --git a/doc/changes.rst b/doc/changes.rst index 827a287..f8d2d39 100644 --- a/doc/changes.rst +++ b/doc/changes.rst @@ -1,4 +1,20 @@ ================================ +Changes from 3.0.4 rc 1 to 3.0.4 +================================ + +varnishd +-------- + +- Set the waiter pipe as non-blocking and record overflows. `Bug + #1285` +- Fix up a bug in the ACL compile code that could lead to false + negatives. CVE-2013-4090. `Bug #1312` +- Return an error if the client sends multiple Host headers. + +.. _bug #1285: http://varnish-cache.org/trac/ticket/1285 +.. _bug #1312: http://varnish-cache.org/trac/ticket/1312 + +================================ Changes from 3.0.3 to 3.0.4 rc 1 ================================ From tfheen at varnish-cache.org Thu Jun 13 09:22:18 2013 From: tfheen at varnish-cache.org (Tollef Fog Heen) Date: Thu, 13 Jun 2013 11:22:18 +0200 Subject: [3.0] 9901ee2 Compile fix for FreeBSD; unused variable Message-ID: commit 9901ee24a01076c06850f6207ab770fa48b75131 Author: Tollef Fog Heen Date: Thu Jun 13 11:22:12 2013 +0200 Compile fix for FreeBSD; unused variable diff --git a/bin/varnishd/cache_waiter_kqueue.c b/bin/varnishd/cache_waiter_kqueue.c index 48fad20..36bbf55 100644 --- a/bin/varnishd/cache_waiter_kqueue.c +++ b/bin/varnishd/cache_waiter_kqueue.c @@ -205,8 +205,6 @@ vca_kqueue_main(void *arg) static void vca_kqueue_init(void) { - int i; - AZ(vnonblocking(vca_pipes[0])); AZ(vnonblocking(vca_pipes[1])); diff --git a/configure.ac b/configure.ac index 750c7f8..72d144f 100644 --- a/configure.ac +++ b/configure.ac @@ -2,7 +2,7 @@ AC_PREREQ(2.59) AC_COPYRIGHT([Copyright (c) 2006 Verdens Gang AS Copyright 
(c) 2006-2011 Varnish Software AS]) AC_REVISION([$Id$]) -AC_INIT([Varnish], [3.0.4-rc1], [varnish-dev at varnish-cache.org]) +AC_INIT([Varnish], [3.0.4], [varnish-dev at varnish-cache.org]) AC_CONFIG_SRCDIR(include/varnishapi.h) AM_CONFIG_HEADER(config.h) diff --git a/redhat/varnish.spec b/redhat/varnish.spec index 2771688..7f53b98 100644 --- a/redhat/varnish.spec +++ b/redhat/varnish.spec @@ -1,8 +1,7 @@ -%define v_rc rc1 Summary: High-performance HTTP accelerator Name: varnish Version: 3.0.4 -Release: 0.rc1%{?dist} +Release: 1%{?dist} License: BSD Group: System Environment/Daemons URL: http://www.varnish-cache.org/ From martin at varnish-cache.org Thu Jun 13 10:41:23 2013 From: martin at varnish-cache.org (Martin Blix Grydeland) Date: Thu, 13 Jun 2013 12:41:23 +0200 Subject: [master] 81898e0 Fix memory leak on multiple -n / -N options Message-ID: commit 81898e084881f35de039a8c9fa08db56449e4249 Author: Martin Blix Grydeland Date: Thu May 23 14:40:30 2013 +0200 Fix memory leak on multiple -n / -N options diff --git a/lib/libvarnishapi/vsm.c b/lib/libvarnishapi/vsm.c index f0b2602..d7252fc 100644 --- a/lib/libvarnishapi/vsm.c +++ b/lib/libvarnishapi/vsm.c @@ -132,6 +132,11 @@ VSM_n_Arg(struct VSM_data *vd, const char *opt) CHECK_OBJ_NOTNULL(vd, VSM_MAGIC); AN(opt); + if (vd->fname) { + free(vd->fname); + vd->fname = NULL; + } + vd->N_opt = 0; REPLACE(vd->n_opt, opt); if (VIN_N_Arg(vd->n_opt, NULL, NULL, &vd->fname)) return (vsm_diag(vd, "Invalid instance name: %s\n", @@ -148,6 +153,10 @@ VSM_N_Arg(struct VSM_data *vd, const char *opt) CHECK_OBJ_NOTNULL(vd, VSM_MAGIC); AN(opt); + if (vd->n_opt) { + free(vd->n_opt); + vd->n_opt = NULL; + } REPLACE(vd->fname, opt); vd->N_opt = 1; return (1); From martin at varnish-cache.org Thu Jun 13 10:41:23 2013 From: martin at varnish-cache.org (Martin Blix Grydeland) Date: Thu, 13 Jun 2013 12:41:23 +0200 Subject: [master] 7af7eee Reset vsl_log.wid in VSL_Setup. 
Message-ID: commit 7af7eeebe0612167476869397aa817f159cd67a1 Author: Martin Blix Grydeland Date: Thu May 16 14:53:57 2013 +0200 Reset vsl_log.wid in VSL_Setup. This ensures that background threads like expire and ban lurker logs using ID 0. diff --git a/bin/varnishd/cache/cache_shmlog.c b/bin/varnishd/cache/cache_shmlog.c index 278b29d..5b33dad 100644 --- a/bin/varnishd/cache/cache_shmlog.c +++ b/bin/varnishd/cache/cache_shmlog.c @@ -356,6 +356,7 @@ VSL_Setup(struct vsl_log *vsl, void *ptr, size_t len) vsl->wle = ptr; vsl->wle += len / sizeof(*vsl->wle); vsl->wlr = 0; + vsl->wid = 0; } /*--------------------------------------------------------------------*/ From martin at varnish-cache.org Thu Jun 13 10:41:23 2013 From: martin at varnish-cache.org (Martin Blix Grydeland) Date: Thu, 13 Jun 2013 12:41:23 +0200 Subject: [master] c3b5133 Add a VSLQ_Name2Grouping function to parse -g arguments Message-ID: commit c3b513398f7a9d324dde667be986bf4a5198c8c7 Author: Martin Blix Grydeland Date: Thu May 16 13:03:13 2013 +0200 Add a VSLQ_Name2Grouping function to parse -g arguments diff --git a/bin/varnishlog/varnishlog.c b/bin/varnishlog/varnishlog.c index f08407c..ee4b2bc 100644 --- a/bin/varnishlog/varnishlog.c +++ b/bin/varnishlog/varnishlog.c @@ -80,13 +80,12 @@ main(int argc, char * const *argv) { char optchar; int d_opt = 0; - char *g_arg = NULL; struct VSL_data *vsl; struct VSM_data *vsm; struct VSL_cursor *c; struct VSLQ *q; - enum VSL_grouping_e grouping = VSL_g_vxid; + int grouping = VSL_g_vxid; int i; vsl = VSL_New(); @@ -101,7 +100,11 @@ main(int argc, char * const *argv) break; case 'g': /* Grouping mode */ - g_arg = optarg; + grouping = VSLQ_Name2Grouping(optarg, -1); + if (grouping == -2) + error(1, "Ambiguous grouping type: %s", optarg); + else if (grouping < 0) + error(1, "Unknown grouping type: %s", optarg); break; case 'n': /* Instance name */ @@ -112,19 +115,7 @@ main(int argc, char * const *argv) usage(); } } - - if (g_arg) { - if (!strcmp(g_arg, "raw")) - 
grouping = VSL_g_raw; - else if (!strcmp(g_arg, "vxid")) - grouping = VSL_g_vxid; - else if (!strcmp(g_arg, "request")) - grouping = VSL_g_request; - else if (!strcmp(g_arg, "session")) - grouping = VSL_g_session; - else - error(1, "Wrong -g argument: %s", g_arg); - } + assert(grouping >= 0 && grouping <= VSL_g_session); /* Create cursor */ if (VSM_Open(vsm)) diff --git a/include/vapi/vsl.h b/include/vapi/vsl.h index 2da4684..62cb69b 100644 --- a/include/vapi/vsl.h +++ b/include/vapi/vsl.h @@ -99,6 +99,17 @@ int VSL_Name2Tag(const char *name, int l); * -2: Multiple tags match substring */ +int VSLQ_Name2Grouping(const char *name, int l); + /* + * Convert string to grouping (= enum VSL_grouping_e) + * + * Return values: + * >=0: Grouping value + * -1: No grouping type matches + * -2: Multiple grouping types match substring + */ + + struct VSL_data *VSL_New(void); int VSL_Arg(struct VSL_data *vsl, int opt, const char *arg); /* diff --git a/lib/libvarnishapi/libvarnishapi.map b/lib/libvarnishapi/libvarnishapi.map index df2e644..1e8c513 100644 --- a/lib/libvarnishapi/libvarnishapi.map +++ b/lib/libvarnishapi/libvarnishapi.map @@ -110,5 +110,6 @@ LIBVARNISHAPI_1.3 { VSLQ_Delete; VSLQ_Dispatch; VSLQ_Flush; + VSLQ_Name2Grouping; # Variables: } LIBVARNISHAPI_1.0; diff --git a/lib/libvarnishapi/vsl_arg.c b/lib/libvarnishapi/vsl_arg.c index fcbc528..c0c964a 100644 --- a/lib/libvarnishapi/vsl_arg.c +++ b/lib/libvarnishapi/vsl_arg.c @@ -83,6 +83,36 @@ VSL_Name2Tag(const char *name, int l) return (n); } +static const char *vsl_grouping[] = { + [VSL_g_raw] = "raw", + [VSL_g_vxid] = "vxid", + [VSL_g_request] = "request", + [VSL_g_session] = "session", +}; + +int +VSLQ_Name2Grouping(const char *name, int l) +{ + int i, n; + + if (l == -1) + l = strlen(name); + n = -1; + for (i = 0; i < sizeof vsl_grouping / sizeof vsl_grouping[0]; i++) { + if (!strncasecmp(name, vsl_grouping[i], l)) { + if (strlen(vsl_grouping[i]) == l) { + /* Exact match */ + return (i); + } + if (n == -1) + n = 
i; + else + n = -2; + } + } + return (n); +} + static int vsl_ix_arg(struct VSL_data *vsl, int opt, const char *arg) { From martin at varnish-cache.org Thu Jun 13 10:41:23 2013 From: martin at varnish-cache.org (Martin Blix Grydeland) Date: Thu, 13 Jun 2013 12:41:23 +0200 Subject: [master] a09e2e0 Add vut.[ch] to hold functions common to the utilities. Message-ID: commit a09e2e0886cce266435b7b2e2ec9db61563acaa3 Author: Martin Blix Grydeland Date: Thu May 16 18:14:01 2013 +0200 Add vut.[ch] to hold functions common to the utilities. Move some parsing of options to these files. diff --git a/bin/varnishlog/Makefile.am b/bin/varnishlog/Makefile.am index b163259..77e04a8 100644 --- a/bin/varnishlog/Makefile.am +++ b/bin/varnishlog/Makefile.am @@ -8,6 +8,8 @@ dist_man_MANS = varnishlog.1 varnishlog_SOURCES = \ varnishlog.c \ + vut.c \ + vut.h \ $(top_builddir)/lib/libvarnish/vas.c \ $(top_builddir)/lib/libvarnish/flopen.c \ $(top_builddir)/lib/libvarnish/version.c \ diff --git a/bin/varnishlog/varnishlog.c b/bin/varnishlog/varnishlog.c index ee4b2bc..1cf1217 100644 --- a/bin/varnishlog/varnishlog.c +++ b/bin/varnishlog/varnishlog.c @@ -47,26 +47,7 @@ #include "vpf.h" #include "vsb.h" #include "vtim.h" - -#include "compat/daemon.h" - -static void error(int status, const char *fmt, ...) - __printflike(2, 3); - -static void -error(int status, const char *fmt, ...) 
-{ - va_list ap; - - AN(fmt); - va_start(ap, fmt); - vfprintf(stderr, fmt, ap); /* XXX: syslog on daemon */ - va_end(ap); - fprintf(stderr, "\n"); - - if (status) - exit(status); -} +#include "vut.h" static void usage(void) @@ -78,56 +59,46 @@ usage(void) int main(int argc, char * const *argv) { - char optchar; - int d_opt = 0; + char opt; + struct VUT *vut; struct VSL_data *vsl; struct VSM_data *vsm; struct VSL_cursor *c; struct VSLQ *q; - int grouping = VSL_g_vxid; int i; + vut = VUT_New(); + AN(vut); vsl = VSL_New(); AN(vsl); vsm = VSM_New(); AN(vsm); - while ((optchar = getopt(argc, argv, "dg:n:r:v")) != -1) { - switch (optchar) { - case 'd': - d_opt = 1; - break; - case 'g': - /* Grouping mode */ - grouping = VSLQ_Name2Grouping(optarg, -1); - if (grouping == -2) - error(1, "Ambiguous grouping type: %s", optarg); - else if (grouping < 0) - error(1, "Unknown grouping type: %s", optarg); - break; + while ((opt = getopt(argc, argv, "dg:n:r:v")) != -1) { + switch (opt) { case 'n': /* Instance name */ if (VSM_n_Arg(vsm, optarg) > 0) break; default: - if (!VSL_Arg(vsl, optchar, optarg)) + if (!VSL_Arg(vsl, opt, optarg) && + !VUT_Arg(vut, opt, optarg)) usage(); } } - assert(grouping >= 0 && grouping <= VSL_g_session); /* Create cursor */ if (VSM_Open(vsm)) - error(1, "VSM_Open: %s", VSM_Error(vsm)); - c = VSL_CursorVSM(vsl, vsm, !d_opt); + VUT_Error(1, "VSM_Open: %s", VSM_Error(vsm)); + c = VSL_CursorVSM(vsl, vsm, !vut->d_opt); if (c == NULL) - error(1, "VSL_CursorVSM: %s", VSL_Error(vsl)); + VUT_Error(1, "VSL_CursorVSM: %s", VSL_Error(vsl)); /* Create query */ - q = VSLQ_New(vsl, &c, grouping, argv[optind]); + q = VSLQ_New(vsl, &c, vut->g_arg, argv[optind]); if (q == NULL) - error(1, "VSLQ_New: %s", VSL_Error(vsl)); + VUT_Error(1, "VSLQ_New: %s", VSL_Error(vsl)); AZ(c); while (1) { @@ -142,7 +113,7 @@ main(int argc, char * const *argv) VSL_ResetError(vsl); continue; } - q = VSLQ_New(vsl, &c, grouping, argv[optind]); + q = VSLQ_New(vsl, &c, vut->g_arg, argv[optind]); 
AN(q); AZ(c); } @@ -161,14 +132,14 @@ main(int argc, char * const *argv) AZ(q); if (i == -2) { /* Abandoned */ - error(0, "Log abandoned - reopening"); + VUT_Error(0, "Log abandoned - reopening"); VSM_Close(vsm); } else if (i < -2) { /* Overrun */ - error(0, "Log overrun"); + VUT_Error(0, "Log overrun"); } } else { - error(1, "Unexpected: %d", i); + VUT_Error(1, "Unexpected: %d", i); } } @@ -179,6 +150,7 @@ main(int argc, char * const *argv) } VSL_Delete(vsl); VSM_Delete(vsm); + VUT_Delete(&vut); exit(0); } diff --git a/bin/varnishlog/vut.c b/bin/varnishlog/vut.c new file mode 100644 index 0000000..b6711ec --- /dev/null +++ b/bin/varnishlog/vut.c @@ -0,0 +1,112 @@ +/*- + * Copyright (c) 2006 Verdens Gang AS + * Copyright (c) 2006-2013 Varnish Software AS + * All rights reserved. + * + * Author: Martin Blix Grydeland + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. 
IN NO EVENT SHALL AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + * + * Common functions for the utilities + */ + +#include +#include +#include +#include +#include +#include +#include + +#include "compat/daemon.h" +#include "vapi/vsm.h" +#include "vapi/vsc.h" +#include "vapi/vsl.h" +#include "vas.h" +#include "miniobj.h" + +#include "vut.h" + +void +VUT_Error(int status, const char *fmt, ...) +{ + va_list ap; + + AN(fmt); + va_start(ap, fmt); + vfprintf(stderr, fmt, ap); + va_end(ap); + fprintf(stderr, "\n"); + + if (status) + exit(status); +} + +struct VUT* +VUT_New(void) +{ + struct VUT *vut; + + vut = calloc(1, sizeof *vut); + AN(vut); + vut->g_arg = VSL_g_vxid; + + return (vut); +} + +void +VUT_Delete(struct VUT **pvut) +{ + struct VUT *vut; + + AN(pvut); + vut = *pvut; + *pvut = NULL; + AN(vut); + + free(vut->r_arg); + + free(vut); +} + +int +VUT_g_Arg(struct VUT *vut, const char *arg) +{ + + vut->g_arg = VSLQ_Name2Grouping(arg, -1); + if (vut->g_arg == -2) + VUT_Error(1, "Ambigous grouping type: %s", arg); + else if (vut->g_arg < 0) + VUT_Error(1, "Unknown grouping type: %s", arg); + return (1); +} + +int +VUT_Arg(struct VUT *vut, int opt, const char *arg) +{ + switch (opt) { + case 'd': vut->d_opt = 1; return (1); + case 'g': return (VUT_g_Arg(vut, arg)); + case 'r': REPLACE(vut->r_arg, arg); return (1); + default: return (0); + } +} diff --git a/bin/varnishlog/vut.h b/bin/varnishlog/vut.h new file mode 100644 index 0000000..6d72755 --- /dev/null +++ b/bin/varnishlog/vut.h @@ -0,0 +1,49 @@ +/*- + * 
Copyright (c) 2006 Verdens Gang AS + * Copyright (c) 2006-2013 Varnish Software AS + * All rights reserved. + * + * Author: Martin Blix Grydeland + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + * + * Common functions for the utilities + */ + +#include "vdef.h" + +struct VUT { + int d_opt; + int g_arg; + char *r_arg; +}; + +void VUT_Error(int status, const char *fmt, ...) 
+ __printflike(2, 3); + +struct VUT *VUT_New(void); + +void VUT_Delete(struct VUT **pvut); + +int VUT_g_Arg(struct VUT *vut, const char *arg); + +int VUT_Arg(struct VUT *vut, int opt, const char *arg); From martin at varnish-cache.org Thu Jun 13 10:41:23 2013 From: martin at varnish-cache.org (Martin Blix Grydeland) Date: Thu, 13 Jun 2013 12:41:23 +0200 Subject: [master] 0d07107 Add VSL_Write* functions for writing binary log output. Message-ID: commit 0d07107e8cb19dfbe4faa8ef1ecb98d665e124f3 Author: Martin Blix Grydeland Date: Wed May 22 10:52:03 2013 +0200 Add VSL_Write* functions for writing binary log output. diff --git a/include/vapi/vsl.h b/include/vapi/vsl.h index 62cb69b..30507c0 100644 --- a/include/vapi/vsl.h +++ b/include/vapi/vsl.h @@ -279,6 +279,52 @@ int VSL_PrintTransactions(struct VSL_data *vsl, * !=0: Return value from either VSL_Next or VSL_Print */ +FILE *VSL_WriteOpen(struct VSL_data *vsl, const char *name, int append, + int unbuffered); + /* + * Open file name for writing using the VSL_Write* functions. If + * append is true, the file will be opened for appending. + * + * Arguments: + * vsl: The VSL data context + * name: The file name + * append: If true, the file will be appended instead of truncated + * unbuf: If true, use unbuffered mode + * + * Return values: + * NULL: Error - see VSL_Error + * non-NULL: Success + */ + + +int VSL_Write(struct VSL_data *vsl, const struct VSL_cursor *c, void *fo); + /* + * Write the currect record pointed to be c to the FILE* fo + * + * Return values: + * 0: Success + * -5: I/O error - see VSL_Error + */ + +int VSL_WriteAll(struct VSL_data *vsl, struct VSL_cursor *c, void *fo); + /* + * Calls VSL_Next on c until c is exhausted. In turn calls + * VSL_Write on all records where VSL_Match returns true. 
+ * + * Return values: + * 0: OK + * !=0: Return value from either VSL_Next or VSL_Write + */ + +int VSL_WriteTransactions(struct VSL_data *vsl, + struct VSL_transaction *ptrans[], void *fo); + /* + * Write all transactions in ptrans using VSL_WriteAll + * Return values: + * 0: OK + * !=0: Return value from either VSL_Next or VSL_Write + */ + struct VSLQ *VSLQ_New(struct VSL_data *vsl, struct VSL_cursor **cp, enum VSL_grouping_e grouping, const char *query); /* diff --git a/lib/libvarnishapi/libvarnishapi.map b/lib/libvarnishapi/libvarnishapi.map index 1e8c513..becda5d 100644 --- a/lib/libvarnishapi/libvarnishapi.map +++ b/lib/libvarnishapi/libvarnishapi.map @@ -106,6 +106,10 @@ LIBVARNISHAPI_1.3 { VSL_PrintTerse; VSL_PrintAll; VSL_PrintTransactions; + VSL_WriteOpen; + VSL_Write; + VSL_WriteAll; + VSL_WriteTransactions; VSLQ_New; VSLQ_Delete; VSLQ_Dispatch; diff --git a/lib/libvarnishapi/vsl.c b/lib/libvarnishapi/vsl.c index a9b1726..28d0e44 100644 --- a/lib/libvarnishapi/vsl.c +++ b/lib/libvarnishapi/vsl.c @@ -320,3 +320,71 @@ VSL_PrintTransactions(struct VSL_data *vsl, struct VSL_transaction *pt[], return (0); } + +FILE* +VSL_WriteOpen(struct VSL_data *vsl, const char *name, int append, int unbuf) +{ + const char head[] = VSL_FILE_ID; + FILE* f; + + f = fopen(name, append ? 
"a" : "w"); + if (f == NULL) { + vsl_diag(vsl, "%s", strerror(errno)); + return (NULL); + } + if (unbuf) + setbuf(f, NULL); + if (0 == ftell(f)) + fwrite(head, 1, sizeof head, f); + return (f); +} + +int +VSL_Write(struct VSL_data *vsl, const struct VSL_cursor *c, void *fo) +{ + size_t r; + + CHECK_OBJ_NOTNULL(vsl, VSL_MAGIC); + if (c == NULL || c->rec.ptr == NULL) + return (0); + if (fo == NULL) + fo = stdout; + r = fwrite(c->rec.ptr, sizeof *c->rec.ptr, + VSL_NEXT(c->rec.ptr) - c->rec.ptr, fo); + if (r == 0) + return (-5); + return (0); +} + +int +VSL_WriteAll(struct VSL_data *vsl, struct VSL_cursor *c, void *fo) +{ + int i; + + if (c == NULL) + return (0); + while (1) { + i = VSL_Next(c); + if (i <= 0) + return (i); + if (!VSL_Match(vsl, c)) + continue; + i = VSL_Write(vsl, c, fo); + if (i != 0) + return (i); + } +} + +int +VSL_WriteTransactions(struct VSL_data *vsl, struct VSL_transaction *pt[], + void *fo) +{ + struct VSL_transaction *t; + int i; + + if (pt == NULL) + return (0); + for (i = 0, t = pt[0]; i == 0 && t != NULL; t = *++pt) + i = VSL_WriteAll(vsl, t->c, fo); + return (i); +} From martin at varnish-cache.org Thu Jun 13 10:41:23 2013 From: martin at varnish-cache.org (Martin Blix Grydeland) Date: Thu, 13 Jun 2013 12:41:23 +0200 Subject: [master] 0bd8445 Fix up the binary file reading VSL cursor Message-ID: commit 0bd8445a93cb7abd2e26d09836f7225e9675686e Author: Martin Blix Grydeland Date: Wed May 22 10:53:29 2013 +0200 Fix up the binary file reading VSL cursor diff --git a/include/vapi/vsl_int.h b/include/vapi/vsl_int.h index 8765b84..419af70 100644 --- a/include/vapi/vsl_int.h +++ b/include/vapi/vsl_int.h @@ -80,6 +80,7 @@ struct VSL_head { #define VSL_LENMASK 0xffff #define VSL_WORDS(len) (((len) + 3) / 4) +#define VSL_BYTES(words) ((words) * 4) #define VSL_END(ptr, len) ((ptr) + 2 + VSL_WORDS(len)) #define VSL_NEXT(ptr) VSL_END(ptr, VSL_LEN(ptr)) #define VSL_LEN(ptr) ((ptr)[0] & VSL_LENMASK) diff --git a/lib/libvarnishapi/vsl_api.h 
b/lib/libvarnishapi/vsl_api.h index b2f9fb7..05987a6 100644 --- a/lib/libvarnishapi/vsl_api.h +++ b/lib/libvarnishapi/vsl_api.h @@ -33,7 +33,7 @@ #include "vqueue.h" #include "vapi/vsm.h" -#define VSL_FILE_HEAD "VSL" +#define VSL_FILE_ID "VSL" struct vslc_shmptr { uint32_t *ptr; diff --git a/lib/libvarnishapi/vsl_cursor.c b/lib/libvarnishapi/vsl_cursor.c index 9bee281..443a176 100644 --- a/lib/libvarnishapi/vsl_cursor.c +++ b/lib/libvarnishapi/vsl_cursor.c @@ -325,28 +325,30 @@ vslc_file_next(void *cursor) do { c->c.c.rec.ptr = NULL; - assert(c->buflen >= 2 * 4); - i = vslc_file_readn(c->fd, c->buf, 2 * 4); + assert(c->buflen >= VSL_BYTES(2)); + i = vslc_file_readn(c->fd, c->buf, VSL_BYTES(2)); if (i < 0) return (-4); /* I/O error */ if (i == 0) return (-1); /* EOF */ - assert(i == 2 * 4); - l = (2 + VSL_WORDS(VSL_LEN(c->buf))) * 4; + assert(i == VSL_BYTES(2)); + l = VSL_BYTES(2 + VSL_WORDS(VSL_LEN(c->buf))); if (c->buflen < l) { c->buf = realloc(c->buf, 2 * l); AN(c->buf); c->buflen = 2 * l; } - i = vslc_file_readn(c->fd, c->buf + 2, l - 2 * 4); - if (i < 0) - return (-4); /* I/O error */ - if (i == 0) - return (-1); /* EOF */ - assert(i == l - 2 * 4); + if (l > VSL_BYTES(2)) { + i = vslc_file_readn(c->fd, c->buf + 2, + l - VSL_BYTES(2)); + if (i < 0) + return (-4); /* I/O error */ + if (i == 0) + return (-1); /* EOF */ + assert(i == l - VSL_BYTES(2)); + } c->c.c.rec.ptr = c->buf; - } while (c->c.c.rec.ptr != NULL && - VSL_TAG(c->c.c.rec.ptr) == SLT__Batch); + } while (VSL_TAG(c->c.c.rec.ptr) == SLT__Batch); return (1); } @@ -371,18 +373,19 @@ VSL_CursorFile(struct VSL_data *vsl, const char *name) { struct vslc_file *c; int fd; - char buf[4]; + char buf[] = VSL_FILE_ID; ssize_t i; - if (!strcmp(name, "-")) { + if (!strcmp(name, "-")) + fd = STDIN_FILENO; + else { fd = open(name, O_RDONLY); if (fd < 0) { vsl_diag(vsl, "Could not open %s: %s\n", name, strerror(errno)); return (NULL); } - } else - fd = STDIN_FILENO; + } i = vslc_file_readn(fd, buf, sizeof buf); if (i 
<= 0) { @@ -393,7 +396,7 @@ VSL_CursorFile(struct VSL_data *vsl, const char *name) return (NULL); } assert(i == sizeof buf); - if (memcmp(buf, VSL_FILE_HEAD, sizeof buf)) { + if (memcmp(buf, VSL_FILE_ID, sizeof buf)) { if (fd > STDIN_FILENO) (void)close(fd); vsl_diag(vsl, "Not a VSL file: %s\n", name); From martin at varnish-cache.org Thu Jun 13 10:41:23 2013 From: martin at varnish-cache.org (Martin Blix Grydeland) Date: Thu, 13 Jun 2013 12:41:23 +0200 Subject: [master] b053835 Enable binary file reading and writing in varnishlog Message-ID: commit b0538359a354be059c11e5d3c9f373b74570a290 Author: Martin Blix Grydeland Date: Wed May 22 10:54:32 2013 +0200 Enable binary file reading and writing in varnishlog diff --git a/bin/varnishlog/varnishlog.c b/bin/varnishlog/varnishlog.c index 1cf1217..db9bcef 100644 --- a/bin/varnishlog/varnishlog.c +++ b/bin/varnishlog/varnishlog.c @@ -67,6 +67,10 @@ main(int argc, char * const *argv) struct VSL_cursor *c; struct VSLQ *q; int i; + int a_opt = 0; + const char *w_arg = NULL; + FILE *fo = stdout; + VSLQ_dispatch_f *func; vut = VUT_New(); AN(vut); @@ -75,12 +79,18 @@ main(int argc, char * const *argv) vsm = VSM_New(); AN(vsm); - while ((opt = getopt(argc, argv, "dg:n:r:v")) != -1) { + while ((opt = getopt(argc, argv, "adg:n:r:vw:")) != -1) { switch (opt) { + case 'a': + a_opt = 1; + break; case 'n': /* Instance name */ if (VSM_n_Arg(vsm, optarg) > 0) break; + case 'w': + w_arg = optarg; + break; default: if (!VSL_Arg(vsl, opt, optarg) && !VUT_Arg(vut, opt, optarg)) @@ -88,21 +98,36 @@ main(int argc, char * const *argv) } } + func = VSL_PrintTransactions; + if (w_arg) { + fo = VSL_WriteOpen(vsl, w_arg, a_opt); + if (fo == NULL) + VUT_Error(1, "-w: %s", VSL_Error(vsl)); + AZ(setvbuf(fo, NULL, _IONBF, 0)); + func = VSL_WriteTransactions; + } + AN(fo); + /* Create cursor */ - if (VSM_Open(vsm)) - VUT_Error(1, "VSM_Open: %s", VSM_Error(vsm)); - c = VSL_CursorVSM(vsl, vsm, !vut->d_opt); + if (vut->r_arg) + c = VSL_CursorFile(vsl, 
vut->r_arg); + else { + if (VSM_Open(vsm)) + VUT_Error(1, "VSM_Open: %s", VSM_Error(vsm)); + c = VSL_CursorVSM(vsl, vsm, !vut->d_opt); + } if (c == NULL) - VUT_Error(1, "VSL_CursorVSM: %s", VSL_Error(vsl)); + VUT_Error(1, "Can't open log: %s", VSL_Error(vsl)); /* Create query */ q = VSLQ_New(vsl, &c, vut->g_arg, argv[optind]); if (q == NULL) - VUT_Error(1, "VSLQ_New: %s", VSL_Error(vsl)); + VUT_Error(1, "Query error: %s", VSL_Error(vsl)); AZ(c); while (1) { while (q == NULL) { + AZ(vut->r_arg); VTIM_sleep(0.1); if (VSM_Open(vsm)) { VSM_ResetError(vsm); @@ -118,7 +143,7 @@ main(int argc, char * const *argv) AZ(c); } - i = VSLQ_Dispatch(q, VSL_PrintTransactions, stdout); + i = VSLQ_Dispatch(q, func, fo); if (i == 0) { /* Nothing to do but wait */ VTIM_sleep(0.01); @@ -127,7 +152,7 @@ main(int argc, char * const *argv) break; } else if (i <= -2) { /* XXX: Make continuation optional */ - VSLQ_Flush(q, VSL_PrintTransactions, stdout); + VSLQ_Flush(q, func, fo); VSLQ_Delete(&q); AZ(q); if (i == -2) { @@ -144,7 +169,7 @@ main(int argc, char * const *argv) } if (q != NULL) { - VSLQ_Flush(q, VSL_PrintTransactions, stdout); + VSLQ_Flush(q, func, fo); VSLQ_Delete(&q); AZ(q); } diff --git a/doc/sphinx/reference/varnishlog.rst b/doc/sphinx/reference/varnishlog.rst index 2cd4176..1210b24 100644 --- a/doc/sphinx/reference/varnishlog.rst +++ b/doc/sphinx/reference/varnishlog.rst @@ -32,8 +32,6 @@ The following options are available: When writing to a file, append to it rather than overwrite it. - XXX: Not yet implemented - -b Only show backend transactions. If neither -b nor -c is @@ -109,8 +107,6 @@ The following options are available: Read log entries from file instaed of shared memory - XXX: Not yet implemented - -s num Skip the first num log transactions (or log records if @@ -144,7 +140,7 @@ The following options are available: file, it will reopen the file, allowing the old one to be rotated away. 
- XXX: Not yet implemented + XXX: Log rotation not yet implemented -x tag From martin at varnish-cache.org Thu Jun 13 10:41:24 2013 From: martin at varnish-cache.org (Martin Blix Grydeland) Date: Thu, 13 Jun 2013 12:41:24 +0200 Subject: [master] cf3f0b3 Add a VUT_Main implementation that all the VSL utilities should be able to use. Message-ID: commit cf3f0b3cf782415a9dc6dedf792b42739751cfae Author: Martin Blix Grydeland Date: Thu May 23 10:08:54 2013 +0200 Add a VUT_Main implementation that all the VSL utilities should be able to use. Moved most functionality from varnishlog to VUT. diff --git a/bin/varnishlog/varnishlog.c b/bin/varnishlog/varnishlog.c index db9bcef..a3199f7 100644 --- a/bin/varnishlog/varnishlog.c +++ b/bin/varnishlog/varnishlog.c @@ -61,121 +61,22 @@ main(int argc, char * const *argv) { char opt; - struct VUT *vut; - struct VSL_data *vsl; - struct VSM_data *vsm; - struct VSL_cursor *c; - struct VSLQ *q; - int i; - int a_opt = 0; - const char *w_arg = NULL; - FILE *fo = stdout; - VSLQ_dispatch_f *func; + VUT_Init(); - vut = VUT_New(); - AN(vut); - vsl = VSL_New(); - AN(vsl); - vsm = VSM_New(); - AN(vsm); - - while ((opt = getopt(argc, argv, "adg:n:r:vw:")) != -1) { + while ((opt = getopt(argc, argv, "adg:n:r:uvw:")) != -1) { switch (opt) { - case 'a': - a_opt = 1; - break; - case 'n': - /* Instance name */ - if (VSM_n_Arg(vsm, optarg) > 0) - break; - case 'w': - w_arg = optarg; - break; default: - if (!VSL_Arg(vsl, opt, optarg) && - !VUT_Arg(vut, opt, optarg)) + if (!VUT_Arg(opt, optarg)) usage(); } } - func = VSL_PrintTransactions; - if (w_arg) { - fo = VSL_WriteOpen(vsl, w_arg, a_opt); - if (fo == NULL) - VUT_Error(1, "-w: %s", VSL_Error(vsl)); - AZ(setvbuf(fo, NULL, _IONBF, 0)); - func = VSL_WriteTransactions; - } - AN(fo); - - /* Create cursor */ - if (vut->r_arg) - c = VSL_CursorFile(vsl, vut->r_arg); - else { - if (VSM_Open(vsm)) - VUT_Error(1, "VSM_Open: %s", VSM_Error(vsm)); - c = VSL_CursorVSM(vsl, vsm, !vut->d_opt); - } - if (c == NULL) 
- VUT_Error(1, "Can't open log: %s", VSL_Error(vsl)); - - /* Create query */ - q = VSLQ_New(vsl, &c, vut->g_arg, argv[optind]); - if (q == NULL) - VUT_Error(1, "Query error: %s", VSL_Error(vsl)); - AZ(c); + if (optind < argc) + VUT.query = argv[optind]; - while (1) { - while (q == NULL) { - AZ(vut->r_arg); - VTIM_sleep(0.1); - if (VSM_Open(vsm)) { - VSM_ResetError(vsm); - continue; - } - c = VSL_CursorVSM(vsl, vsm, 1); - if (c == NULL) { - VSL_ResetError(vsl); - continue; - } - q = VSLQ_New(vsl, &c, vut->g_arg, argv[optind]); - AN(q); - AZ(c); - } - - i = VSLQ_Dispatch(q, func, fo); - if (i == 0) { - /* Nothing to do but wait */ - VTIM_sleep(0.01); - } else if (i == -1) { - /* EOF */ - break; - } else if (i <= -2) { - /* XXX: Make continuation optional */ - VSLQ_Flush(q, func, fo); - VSLQ_Delete(&q); - AZ(q); - if (i == -2) { - /* Abandoned */ - VUT_Error(0, "Log abandoned - reopening"); - VSM_Close(vsm); - } else if (i < -2) { - /* Overrun */ - VUT_Error(0, "Log overrun"); - } - } else { - VUT_Error(1, "Unexpected: %d", i); - } - } - - if (q != NULL) { - VSLQ_Flush(q, func, fo); - VSLQ_Delete(&q); - AZ(q); - } - VSL_Delete(vsl); - VSM_Delete(vsm); - VUT_Delete(&vut); + VUT_Setup(); + VUT_Main(NULL, NULL); + VUT_Fini(); exit(0); } diff --git a/bin/varnishlog/vut.c b/bin/varnishlog/vut.c index b6711ec..29ff56b 100644 --- a/bin/varnishlog/vut.c +++ b/bin/varnishlog/vut.c @@ -41,11 +41,14 @@ #include "vapi/vsm.h" #include "vapi/vsc.h" #include "vapi/vsl.h" +#include "vtim.h" #include "vas.h" #include "miniobj.h" #include "vut.h" +struct VUT VUT; + void VUT_Error(int status, const char *fmt, ...) { @@ -61,52 +64,210 @@ VUT_Error(int status, const char *fmt, ...) 
exit(status); } -struct VUT* -VUT_New(void) +int +VUT_g_Arg(const char *arg) { - struct VUT *vut; - vut = calloc(1, sizeof *vut); - AN(vut); - vut->g_arg = VSL_g_vxid; + VUT.g_arg = VSLQ_Name2Grouping(arg, -1); + if (VUT.g_arg == -2) + VUT_Error(1, "Ambigous grouping type: %s", arg); + else if (VUT.g_arg < 0) + VUT_Error(1, "Unknown grouping type: %s", arg); + return (1); +} + +int +VUT_Arg(int opt, const char *arg) +{ + switch (opt) { + case 'a': + /* Binary file append */ + VUT.a_opt = 1; + return (1); + case 'n': + /* Varnish instance */ + if (VUT.vsm == NULL) + VUT.vsm = VSM_New(); + AN(VUT.vsm); + if (VSM_n_Arg(VUT.vsm, arg) <= 0) + VUT_Error(1, "%s", VSM_Error(VUT.vsm)); + return (1); + case 'd': + /* Head */ + VUT.d_opt = 1; + return (1); + case 'g': + /* Grouping */ + return (VUT_g_Arg(arg)); + case 'r': + /* Binary file input */ + REPLACE(VUT.r_arg, arg); + return (1); + case 'u': + /* Unbuffered binary output */ + VUT.u_opt = 1; + return (1); + case 'w': + /* Binary file output */ + REPLACE(VUT.w_arg, arg); + return (1); + default: + AN(VUT.vsl); + return (VSL_Arg(VUT.vsl, opt, arg)); + } +} - return (vut); +void +VUT_Init(void) +{ + VUT.g_arg = VSL_g_vxid; + AZ(VUT.vsl); + VUT.vsl = VSL_New(); + AN(VUT.vsl); } void -VUT_Delete(struct VUT **pvut) +VUT_Setup(void) { - struct VUT *vut; + struct VSL_cursor *c; - AN(pvut); - vut = *pvut; - *pvut = NULL; - AN(vut); + AN(VUT.vsl); - free(vut->r_arg); + /* Input */ + if (VUT.r_arg && VUT.vsm) + VUT_Error(1, "Can't have both -n and -r options"); + if (VUT.r_arg) + c = VSL_CursorFile(VUT.vsl, VUT.r_arg); + else { + if (VUT.vsm == NULL) + /* Default uses VSM with n=hostname */ + VUT.vsm = VSM_New(); + AN(VUT.vsm); + if (VSM_Open(VUT.vsm)) + VUT_Error(1, "Can't open VSM file (%s)", + VSM_Error(VUT.vsm)); + c = VSL_CursorVSM(VUT.vsl, VUT.vsm, !VUT.d_opt); + } + if (c == NULL) + VUT_Error(1, "Can't open log (%s)", VSL_Error(VUT.vsl)); + + /* Output */ + if (VUT.w_arg) { + VUT.fo = VSL_WriteOpen(VUT.vsl, VUT.w_arg, 
VUT.a_opt, + VUT.u_opt); + if (VUT.fo == NULL) + VUT_Error(1, "Can't open output file (%s)", + VSL_Error(VUT.vsl)); + } - free(vut); + /* Create query */ + VUT.vslq = VSLQ_New(VUT.vsl, &c, VUT.g_arg, VUT.query); + if (VUT.vslq == NULL) + VUT_Error(1, "Query parse error (%s)", VSL_Error(VUT.vsl)); + AZ(c); } -int -VUT_g_Arg(struct VUT *vut, const char *arg) +void +VUT_Fini(void) { + free(VUT.r_arg); - vut->g_arg = VSLQ_Name2Grouping(arg, -1); - if (vut->g_arg == -2) - VUT_Error(1, "Ambigous grouping type: %s", arg); - else if (vut->g_arg < 0) - VUT_Error(1, "Unknown grouping type: %s", arg); - return (1); + if (VUT.vslq) + VSLQ_Delete(&VUT.vslq); + if (VUT.vsl) + VSL_Delete(VUT.vsl); + if (VUT.vsm) + VSM_Delete(VUT.vsm); + + memset(&VUT, 0, sizeof VUT); } int -VUT_Arg(struct VUT *vut, int opt, const char *arg) +VUT_Main(VSLQ_dispatch_f *func, void *priv) { - switch (opt) { - case 'd': vut->d_opt = 1; return (1); - case 'g': return (VUT_g_Arg(vut, arg)); - case 'r': REPLACE(vut->r_arg, arg); return (1); - default: return (0); + struct VSL_cursor *c; + int i; + + if (func == NULL) { + if (VUT.w_arg) + func = VSL_WriteTransactions; + else + func = VSL_PrintTransactions; + priv = VUT.fo; } + + while (1) { + while (VUT.vslq == NULL) { + AZ(VUT.r_arg); + AN(VUT.vsm); + VTIM_sleep(0.1); + if (VSM_Open(VUT.vsm)) { + VSM_ResetError(VUT.vsm); + continue; + } + c = VSL_CursorVSM(VUT.vsl, VUT.vsm, 1); + if (c == NULL) { + VSL_ResetError(VUT.vsl); + continue; + } + VUT.vslq = VSLQ_New(VUT.vsl, &c, VUT.g_arg, VUT.query); + AN(VUT.vslq); + AZ(c); + } + + i = VSLQ_Dispatch(VUT.vslq, func, priv); + if (i == 0) { + /* Nothing to do but wait */ + if (VUT.fo) + fflush(VUT.fo); + VTIM_sleep(0.01); + continue; + } + if (i == -1) { + /* EOF */ + break; + } + + if (VUT.vsm == NULL) + break; + + /* XXX: Make continuation optional */ + + VSLQ_Flush(VUT.vslq, func, priv); + VSLQ_Delete(&VUT.vslq); + AZ(VUT.vslq); + + if (i == -2) { + /* Abandoned */ + VUT_Error(0, "Log abandoned - 
reopening"); + VSM_Close(VUT.vsm); + } else if (i < -2) { + /* Overrun */ + VUT_Error(0, "Log overrun"); + } + + /* Reconnect VSM */ + while (VUT.vslq == NULL) { + AZ(VUT.r_arg); + AN(VUT.vsm); + VTIM_sleep(0.1); + if (VSM_Open(VUT.vsm)) { + VSM_ResetError(VUT.vsm); + continue; + } + c = VSL_CursorVSM(VUT.vsl, VUT.vsm, 1); + if (c == NULL) { + VSL_ResetError(VUT.vsl); + continue; + } + VUT.vslq = VSLQ_New(VUT.vsl, &c, VUT.g_arg, VUT.query); + AN(VUT.vslq); + AZ(c); + } + } + + if (VUT.vslq != NULL) + VSLQ_Flush(VUT.vslq, func, priv); + + return (i); } diff --git a/bin/varnishlog/vut.h b/bin/varnishlog/vut.h index 6d72755..dcb5a77 100644 --- a/bin/varnishlog/vut.h +++ b/bin/varnishlog/vut.h @@ -32,18 +32,35 @@ #include "vdef.h" struct VUT { - int d_opt; - int g_arg; - char *r_arg; + /* Options */ + int a_opt; + int d_opt; + int g_arg; + char *r_arg; + int u_opt; + char *w_arg; + const char *query; + + /* State */ + struct VSL_data *vsl; + struct VSM_data *vsm; + struct VSLQ *vslq; + FILE *fo; }; +extern struct VUT VUT; + void VUT_Error(int status, const char *fmt, ...) __printflike(2, 3); -struct VUT *VUT_New(void); +int VUT_g_Arg(const char *arg); + +int VUT_Arg(int opt, const char *arg); + +void VUT_Setup(void); -void VUT_Delete(struct VUT **pvut); +void VUT_Init(void); -int VUT_g_Arg(struct VUT *vut, const char *arg); +void VUT_Fini(void); -int VUT_Arg(struct VUT *vut, int opt, const char *arg); +int VUT_Main(VSLQ_dispatch_f *func, void *priv); From martin at varnish-cache.org Thu Jun 13 10:41:24 2013 From: martin at varnish-cache.org (Martin Blix Grydeland) Date: Thu, 13 Jun 2013 12:41:24 +0200 Subject: [master] 9e62b49 Document the varnish API options in a single place, using preprocessor tables. Message-ID: commit 9e62b49947fc9d8e9ab2e72bdff8ba02b526125e Author: Martin Blix Grydeland Date: Fri May 24 15:03:52 2013 +0200 Document the varnish API options in a single place, using preprocessor tables. Enhance the usage() output of varnishlog using the new tables. 
Add API utility header to extract data from the option headers. diff --git a/bin/varnishlog/Makefile.am b/bin/varnishlog/Makefile.am index 77e04a8..500d941 100644 --- a/bin/varnishlog/Makefile.am +++ b/bin/varnishlog/Makefile.am @@ -8,6 +8,7 @@ dist_man_MANS = varnishlog.1 varnishlog_SOURCES = \ varnishlog.c \ + varnishlog_options.h \ vut.c \ vut.h \ $(top_builddir)/lib/libvarnish/vas.c \ diff --git a/bin/varnishlog/varnishlog.c b/bin/varnishlog/varnishlog.c index a3199f7..13075fd 100644 --- a/bin/varnishlog/varnishlog.c +++ b/bin/varnishlog/varnishlog.c @@ -49,10 +49,20 @@ #include "vtim.h" #include "vut.h" +#define VOPT_OPTSTRING +#define VOPT_SYNOPSIS +#define VOPT_USAGE +#define VOPT_INC "varnishlog_options.h" +#include "vapi/voptget.h" + static void usage(void) { - fprintf(stderr, "usage: varnishlog ...\n"); + const char **opt; + fprintf(stderr, "Usage: varnishlog [query expression]\n\n"); + fprintf(stderr, "Options:\n"); + for (opt = vopt_usage; *opt != NULL; opt += 2) + fprintf(stderr, " %-25s %s\n", *opt, *(opt + 1)); exit(1); } @@ -63,7 +73,7 @@ main(int argc, char * const *argv) VUT_Init(); - while ((opt = getopt(argc, argv, "adg:n:r:uvw:")) != -1) { + while ((opt = getopt(argc, argv, vopt_optstring)) != -1) { switch (opt) { default: if (!VUT_Arg(opt, optarg)) diff --git a/bin/varnishlog/varnishlog_options.h b/bin/varnishlog/varnishlog_options.h new file mode 100644 index 0000000..801ca38 --- /dev/null +++ b/bin/varnishlog/varnishlog_options.h @@ -0,0 +1,42 @@ +/*- + * Copyright (c) 2006 Verdens Gang AS + * Copyright (c) 2006-2013 Varnish Software AS + * All rights reserved. + * + * Author: Martin Blix Grydeland + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. 
Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +#include "vapi/vapi_options.h" + +VSL_OPT_a +VSL_OPT_d +VSL_OPT_g +VSL_OPT_i +VSM_OPT_n +VSM_OPT_N +VSL_OPT_r +VSL_OPT_u +VSL_OPT_v +VSL_OPT_w +VSL_OPT_x diff --git a/include/Makefile.am b/include/Makefile.am index 3f33a3e..1ad0935 100644 --- a/include/Makefile.am +++ b/include/Makefile.am @@ -31,6 +31,8 @@ nobase_pkginclude_HEADERS = \ vapi/vsc_int.h \ vapi/vsl.h \ vapi/vsl_int.h \ + vapi/voptget.h \ + vapi/vapi_options.h \ vcli.h # Private headers diff --git a/include/vapi/vapi_options.h b/include/vapi/vapi_options.h new file mode 100644 index 0000000..6051308 --- /dev/null +++ b/include/vapi/vapi_options.h @@ -0,0 +1,107 @@ +/*- + * Copyright (c) 2006 Verdens Gang AS + * Copyright (c) 2006-2013 Varnish Software AS + * All rights reserved. + * + * Author: Martin Blix Grydeland + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. 
Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +/* VSM options */ + +#define VSM_OPT_n \ + VOPT("n:", "[-n name]", "Varnish instance name", \ + "Specify the name of the varnishd instance to get logs" \ + " from. If -n is not specified, the host name is used." \ + ) + +#define VSM_OPT_N \ + VOPT("N:", "[-N filename]", "VSM filename", \ + "Specify the filename of a stale VSM instance. When using" \ + " this option the abandonment checking is disabled." \ + ) + +/* VSL options */ + +#define VSL_OPT_a \ + VOPT("a", "[-a]", "Append binary file output", \ + "When writing binary output to a file, append to it rather" \ + " than overwrite it." \ + ) + +#define VSL_OPT_d \ + VOPT("d", "[-d]", "Process old log entries on startup", \ + "Start processing log records at the head of the log" \ + " instead of the tail." 
\ + ) + +#define VSL_OPT_g \ + VOPT("g:", "[-g {session|request|vxid|raw}]", "Grouping mode", \ + "The grouping of the log records. The default is to group" \ + " by request." \ + ) + +#define VSL_OPT_i \ + VOPT("i:", "[-i tag]", "Include tag", \ + "Output only this tag. Multiple -i options may be given." \ + "\n" \ + "If an -i option is the first of any -ix options, all tags" \ + " are disabled before -ix processing." \ + ) + +#define VSL_OPT_r \ + VOPT("r:", "[-r filename]", "Binary file input", \ + "Read log in binary file format from this file." \ + ) + +#define VSL_OPT_u \ + VOPT("u", "[-u]", "Binary file output unbuffered", \ + "Unbuffered binary file output mode." \ + ) + +#define VSL_OPT_v \ + VOPT("v", "[-v]", "Verbose record printing", \ + "Use verbose output on record set printing, giving the" \ + " VXID on every log line. Without this option, the VXID" \ + " will only be given on the header of that transaction." \ + ) + +#define VSL_OPT_w \ + VOPT("w:", "[-w filename]", "Binary output filename", \ + "Write log entries to this file instead of displaying" \ + " them. The file will be overwritten unless the -a option" \ + " was specified. If the application receives a SIGHUP" \ + " while writing to a file, it will reopen the file" \ + " allowing the old one to be rotated away.\n" \ + "\n" \ + "XXX: Log rotation not yet implemented" \ + ) + +#define VSL_OPT_x \ + VOPT("x:", "[-x tag]", "Exclude tag", \ + "Exclude log records of this tag. Multiple -x options" \ + " may be given.\n" \ + "\n" \ + "If an -x option is the first of any -ix options, all tags" \ + " are enabled for output before -ix processing." \ + ) diff --git a/include/vapi/voptget.h b/include/vapi/voptget.h new file mode 100644 index 0000000..443d316 --- /dev/null +++ b/include/vapi/voptget.h @@ -0,0 +1,81 @@ +/*- + * Copyright (c) 2006 Verdens Gang AS + * Copyright (c) 2006-2013 Varnish Software AS + * All rights reserved. 
+ * + * Author: Martin Blix Grydeland + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. 
+ */ + +/* + * Legend: VOPT(o,s,d,l) where + * o: Option string part + * s: Synopsis + * d: Description + * l: Long description + */ + +#ifdef VOPT_OPTSTRING +#define VOPT(o,s,d,l) o +const char vopt_optstring[] = +#include VOPT_INC + ; +#undef VOPT +#endif + +#ifdef VOPT_SYNOPSIS +#define VOPT(o,s,d,l) " " s +const char vopt_synopsis[] = +#include VOPT_INC + ; +#undef VOPT +#endif + +#ifdef VOPT_USAGE +#define VOPT(o,s,d,l) s, d, +const char *vopt_usage[] = { +#include VOPT_INC + NULL, NULL, +}; +#undef VOPT +#endif + +#ifndef VOPTGET_H +#define VOPTGET_H + +struct vopt_full { + const char *option; + const char *synopsis; + const char *desc; + const char *ldesc; +}; + +#endif + +#ifdef VOPT_FULL +#define VOPT(o,s,d,l) { o,s,d,l }, +const struct vopt_full vopt_full[] = { +#include VOPT_INC +}; +#undef VOPT +#endif From martin at varnish-cache.org Thu Jun 13 10:41:24 2013 From: martin at varnish-cache.org (Martin Blix Grydeland) Date: Thu, 13 Jun 2013 12:41:24 +0200 Subject: [master] 6635286 Make the option documentation be generated from the option tables. Message-ID: commit 66352862f15cf08fc4ac433293af00d0b81acb7a Author: Martin Blix Grydeland Date: Fri May 24 15:22:32 2013 +0200 Make the option documentation be generated from the option tables. 
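The voptget.h header above is a classic X-macro: each tool keeps its options in a `*_options.h` table of `VOPT(...)` entries, and voptget.h re-`#define`s `VOPT` before including that table, so the getopt string, the synopsis and the usage tables are all generated from one source. A minimal self-contained sketch of the same technique (the option table is inlined as a macro here instead of a separate `#include VOPT_INC` file, and all names are illustrative):

```c
#include <assert.h>
#include <string.h>

/* Hypothetical option table; in Varnish this lives in a separate
 * varnish*_options.h file pulled in via "#include VOPT_INC". */
#define MY_OPTS \
	VOPT("a",  "[-a]",      "Append output",  "Append rather than overwrite.") \
	VOPT("g:", "[-g mode]", "Grouping mode",  "Select the record grouping.")

/* First expansion: keep only the option strings; adjacent string
 * literals concatenate into a getopt(3)-style optstring. */
#define VOPT(o, s, d, l) o
static const char my_optstring[] = MY_OPTS;
#undef VOPT

/* Second expansion: the same table becomes a usage array. */
#define VOPT(o, s, d, l) { s, d },
static const struct { const char *synopsis, *desc; } my_usage[] = { MY_OPTS };
#undef VOPT
```

Because each `#define`/`#undef` pair reinterprets the identical table, adding one `VOPT` entry updates the optstring, the synopsis and the documentation together.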
diff --git a/bin/varnishlog/Makefile.am b/bin/varnishlog/Makefile.am index 500d941..f8a63af 100644 --- a/bin/varnishlog/Makefile.am +++ b/bin/varnishlog/Makefile.am @@ -23,9 +23,23 @@ varnishlog_LDADD = \ $(top_builddir)/lib/libvarnishapi/libvarnishapi.la \ ${RT_LIBS} ${LIBM} ${PTHREAD_LIBS} -varnishlog.1: $(top_srcdir)/doc/sphinx/reference/varnishlog.rst +noinst_PROGRAMS = varnishlog_opt2rst +varnishlog_opt2rst_CFLAGS = -DOPT2RST_INC="varnishlog_options.h" +varnishlog_opt2rst_SOURCES = \ + opt2rst.c + +BUILT_SOURCES = varnishlog_options.rst + +EXTRA_DIST = varnishlog_options.rst + +varnishlog_options.rst: varnishlog_opt2rst + ./varnishlog_opt2rst > $@ + +varnishlog.1: \ + $(top_srcdir)/doc/sphinx/reference/varnishlog.rst \ + varnishlog_options.rst if HAVE_RST2MAN - ${RST2MAN} $? $@ + ${RST2MAN} $< $@ else @echo "========================================" @echo "You need rst2man installed to make dist" diff --git a/bin/varnishlog/opt2rst.c b/bin/varnishlog/opt2rst.c new file mode 100644 index 0000000..ab854ef --- /dev/null +++ b/bin/varnishlog/opt2rst.c @@ -0,0 +1,89 @@ +/*- + * Copyright (c) 2006 Verdens Gang AS + * Copyright (c) 2006-2013 Varnish Software AS + * All rights reserved. + * + * Author: Martin Blix Grydeland + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. 
+ * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +#include +#include + +#ifndef OPT2RST_INC +#error "OPT2RST_INC undefined" +#endif + +#define STRINGIFY(x) #x +#define TOSTRING(x) STRINGIFY(x) + +#define VOPT_SYNOPSIS +#define VOPT_FULL +#define VOPT_INC TOSTRING(OPT2RST_INC) +#include "vapi/voptget.h" + +static void +print_nobrackets(const char *s) +{ + for (; *s; s++) { + if (strchr("[]", *s)) + continue; + printf("%c", *s); + } +} + +static void +print_tabbed(const char *string, int tabs) +{ + int i; + const char *c; + + for (c = string; *c; c++) { + if (c == string || *(c - 1) == '\n') + for (i = 0; i < tabs; i++) + printf("\t"); + printf("%c", *c); + } +} + +static void +print_opt(const struct vopt_full *opt) +{ + print_nobrackets(opt->synopsis); + printf("\n\n"); + print_tabbed(opt->ldesc, 1); + printf("\n\n"); +} + +int +main(int argc, char * const *argv) +{ + int i; + + (void)argc; + (void)argv; + for (i = 0; i < sizeof vopt_full / sizeof vopt_full[0]; i++) + print_opt(&vopt_full[i]); + + return (0); +} diff --git a/doc/sphinx/reference/varnishlog.rst b/doc/sphinx/reference/varnishlog.rst index 1210b24..3e761f8 100644 --- a/doc/sphinx/reference/varnishlog.rst +++ b/doc/sphinx/reference/varnishlog.rst @@ -28,9 +28,7 @@ OPTIONS The 
following options are available: --a - - When writing to a file, append to it rather than overwrite it. +.. include:: ../../../bin/varnishlog/varnishlog_options.rst -b @@ -58,24 +56,6 @@ The following options are available: XXX: Not yet implemented --d - - Process old log entries on startup. Nomally, varnishlog will - only process entries which are written to the log after it - starts. - --g {session|request|vxid|raw} - - The grouping of the log records. The default is to group by - request. - --i tag - - Output only this tag. Multiple -i options may be given. - - If an -i option is the first of any -ix options, all tags are - disabled for output before -ix processing. - -I [tag:]regex Output only records matching this regular expression. If tag @@ -91,22 +71,12 @@ The following options are available: XXX: Not yet implemented --n - - Specifies the name of the varnishd instance to get logs - from. If -n is not specified, the host name is used. - - -P file Write the process' PID to the specified file. XXX: Not yet implemented --r file - - Read log entries from file instaed of shared memory - -s num Skip the first num log transactions (or log records if @@ -120,35 +90,12 @@ The following options are available: XXX: Not yet implemented --v - - Use verbose output on set output, giving the VXID on every log - line. Without this option, the VXID will only be given on the - header of that transaction. - -V Display the version number and exit. XXX: Not yet implemented --w file - - Write log entries to file instead of displaying them. The - file will be overwritten unless the -a option was - specified. If varnishlog receives a SIGHUP while writing to a - file, it will reopen the file, allowing the old one to be - rotated away. - - XXX: Log rotation not yet implemented - --x tag - - Exclude log records of this tag. Multiple -x options may be - given. - - If an -x option is the first of any -ix options, all tags are - enabled for output before -ix processing. 
-X [tag:]regex From martin at varnish-cache.org Thu Jun 13 10:41:24 2013 From: martin at varnish-cache.org (Martin Blix Grydeland) Date: Thu, 13 Jun 2013 12:41:24 +0200 Subject: [master] 932217f Add a lib/libvarnishtools library to hold common files for the tools. Message-ID: commit 932217f75cd96e534805d09730d236e50b395f81 Author: Martin Blix Grydeland Date: Fri May 24 16:04:03 2013 +0200 Add a lib/libvarnishtools library to hold common files for the tools. Move vut.c and opt2rst.c there. diff --git a/bin/varnishlog/Makefile.am b/bin/varnishlog/Makefile.am index f8a63af..2380fb9 100644 --- a/bin/varnishlog/Makefile.am +++ b/bin/varnishlog/Makefile.am @@ -9,8 +9,7 @@ dist_man_MANS = varnishlog.1 varnishlog_SOURCES = \ varnishlog.c \ varnishlog_options.h \ - vut.c \ - vut.h \ + $(top_builddir)/lib/libvarnishtools/vut.c \ $(top_builddir)/lib/libvarnish/vas.c \ $(top_builddir)/lib/libvarnish/flopen.c \ $(top_builddir)/lib/libvarnish/version.c \ @@ -26,7 +25,7 @@ varnishlog_LDADD = \ noinst_PROGRAMS = varnishlog_opt2rst varnishlog_opt2rst_CFLAGS = -DOPT2RST_INC="varnishlog_options.h" varnishlog_opt2rst_SOURCES = \ - opt2rst.c + $(top_builddir)/lib/libvarnishtools/opt2rst.c BUILT_SOURCES = varnishlog_options.rst diff --git a/bin/varnishlog/opt2rst.c b/bin/varnishlog/opt2rst.c deleted file mode 100644 index ab854ef..0000000 --- a/bin/varnishlog/opt2rst.c +++ /dev/null @@ -1,89 +0,0 @@ -/*- - * Copyright (c) 2006 Verdens Gang AS - * Copyright (c) 2006-2013 Varnish Software AS - * All rights reserved. - * - * Author: Martin Blix Grydeland - * - * Redistribution and use in source and binary forms, with or without - * modification, are permitted provided that the following conditions - * are met: - * 1. Redistributions of source code must retain the above copyright - * notice, this list of conditions and the following disclaimer. - * 2. 
Redistributions in binary form must reproduce the above copyright - * notice, this list of conditions and the following disclaimer in the - * documentation and/or other materials provided with the distribution. - * - * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND - * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE - * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE - * ARE DISCLAIMED. IN NO EVENT SHALL AUTHOR OR CONTRIBUTORS BE LIABLE - * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL - * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS - * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) - * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT - * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY - * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF - * SUCH DAMAGE. - */ - -#include -#include - -#ifndef OPT2RST_INC -#error "OPT2RST_INC undefined" -#endif - -#define STRINGIFY(x) #x -#define TOSTRING(x) STRINGIFY(x) - -#define VOPT_SYNOPSIS -#define VOPT_FULL -#define VOPT_INC TOSTRING(OPT2RST_INC) -#include "vapi/voptget.h" - -static void -print_nobrackets(const char *s) -{ - for (; *s; s++) { - if (strchr("[]", *s)) - continue; - printf("%c", *s); - } -} - -static void -print_tabbed(const char *string, int tabs) -{ - int i; - const char *c; - - for (c = string; *c; c++) { - if (c == string || *(c - 1) == '\n') - for (i = 0; i < tabs; i++) - printf("\t"); - printf("%c", *c); - } -} - -static void -print_opt(const struct vopt_full *opt) -{ - print_nobrackets(opt->synopsis); - printf("\n\n"); - print_tabbed(opt->ldesc, 1); - printf("\n\n"); -} - -int -main(int argc, char * const *argv) -{ - int i; - - (void)argc; - (void)argv; - for (i = 0; i < sizeof vopt_full / sizeof vopt_full[0]; i++) - print_opt(&vopt_full[i]); - - return (0); -} diff --git 
a/bin/varnishlog/vut.c b/bin/varnishlog/vut.c deleted file mode 100644 index 29ff56b..0000000 --- a/bin/varnishlog/vut.c +++ /dev/null @@ -1,273 +0,0 @@ -/*- - * Copyright (c) 2006 Verdens Gang AS - * Copyright (c) 2006-2013 Varnish Software AS - * All rights reserved. - * - * Author: Martin Blix Grydeland - * - * Redistribution and use in source and binary forms, with or without - * modification, are permitted provided that the following conditions - * are met: - * 1. Redistributions of source code must retain the above copyright - * notice, this list of conditions and the following disclaimer. - * 2. Redistributions in binary form must reproduce the above copyright - * notice, this list of conditions and the following disclaimer in the - * documentation and/or other materials provided with the distribution. - * - * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND - * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE - * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE - * ARE DISCLAIMED. IN NO EVENT SHALL AUTHOR OR CONTRIBUTORS BE LIABLE - * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL - * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS - * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) - * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT - * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY - * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF - * SUCH DAMAGE. - * - * Common functions for the utilities - */ - -#include -#include -#include -#include -#include -#include -#include - -#include "compat/daemon.h" -#include "vapi/vsm.h" -#include "vapi/vsc.h" -#include "vapi/vsl.h" -#include "vtim.h" -#include "vas.h" -#include "miniobj.h" - -#include "vut.h" - -struct VUT VUT; - -void -VUT_Error(int status, const char *fmt, ...) 
-{ - va_list ap; - - AN(fmt); - va_start(ap, fmt); - vfprintf(stderr, fmt, ap); - va_end(ap); - fprintf(stderr, "\n"); - - if (status) - exit(status); -} - -int -VUT_g_Arg(const char *arg) -{ - - VUT.g_arg = VSLQ_Name2Grouping(arg, -1); - if (VUT.g_arg == -2) - VUT_Error(1, "Ambigous grouping type: %s", arg); - else if (VUT.g_arg < 0) - VUT_Error(1, "Unknown grouping type: %s", arg); - return (1); -} - -int -VUT_Arg(int opt, const char *arg) -{ - switch (opt) { - case 'a': - /* Binary file append */ - VUT.a_opt = 1; - return (1); - case 'n': - /* Varnish instance */ - if (VUT.vsm == NULL) - VUT.vsm = VSM_New(); - AN(VUT.vsm); - if (VSM_n_Arg(VUT.vsm, arg) <= 0) - VUT_Error(1, "%s", VSM_Error(VUT.vsm)); - return (1); - case 'd': - /* Head */ - VUT.d_opt = 1; - return (1); - case 'g': - /* Grouping */ - return (VUT_g_Arg(arg)); - case 'r': - /* Binary file input */ - REPLACE(VUT.r_arg, arg); - return (1); - case 'u': - /* Unbuffered binary output */ - VUT.u_opt = 1; - return (1); - case 'w': - /* Binary file output */ - REPLACE(VUT.w_arg, arg); - return (1); - default: - AN(VUT.vsl); - return (VSL_Arg(VUT.vsl, opt, arg)); - } -} - -void -VUT_Init(void) -{ - VUT.g_arg = VSL_g_vxid; - AZ(VUT.vsl); - VUT.vsl = VSL_New(); - AN(VUT.vsl); -} - -void -VUT_Setup(void) -{ - struct VSL_cursor *c; - - AN(VUT.vsl); - - /* Input */ - if (VUT.r_arg && VUT.vsm) - VUT_Error(1, "Can't have both -n and -r options"); - if (VUT.r_arg) - c = VSL_CursorFile(VUT.vsl, VUT.r_arg); - else { - if (VUT.vsm == NULL) - /* Default uses VSM with n=hostname */ - VUT.vsm = VSM_New(); - AN(VUT.vsm); - if (VSM_Open(VUT.vsm)) - VUT_Error(1, "Can't open VSM file (%s)", - VSM_Error(VUT.vsm)); - c = VSL_CursorVSM(VUT.vsl, VUT.vsm, !VUT.d_opt); - } - if (c == NULL) - VUT_Error(1, "Can't open log (%s)", VSL_Error(VUT.vsl)); - - /* Output */ - if (VUT.w_arg) { - VUT.fo = VSL_WriteOpen(VUT.vsl, VUT.w_arg, VUT.a_opt, - VUT.u_opt); - if (VUT.fo == NULL) - VUT_Error(1, "Can't open output file (%s)", - 
VSL_Error(VUT.vsl)); - } - - /* Create query */ - VUT.vslq = VSLQ_New(VUT.vsl, &c, VUT.g_arg, VUT.query); - if (VUT.vslq == NULL) - VUT_Error(1, "Query parse error (%s)", VSL_Error(VUT.vsl)); - AZ(c); -} - -void -VUT_Fini(void) -{ - free(VUT.r_arg); - - if (VUT.vslq) - VSLQ_Delete(&VUT.vslq); - if (VUT.vsl) - VSL_Delete(VUT.vsl); - if (VUT.vsm) - VSM_Delete(VUT.vsm); - - memset(&VUT, 0, sizeof VUT); -} - -int -VUT_Main(VSLQ_dispatch_f *func, void *priv) -{ - struct VSL_cursor *c; - int i; - - if (func == NULL) { - if (VUT.w_arg) - func = VSL_WriteTransactions; - else - func = VSL_PrintTransactions; - priv = VUT.fo; - } - - while (1) { - while (VUT.vslq == NULL) { - AZ(VUT.r_arg); - AN(VUT.vsm); - VTIM_sleep(0.1); - if (VSM_Open(VUT.vsm)) { - VSM_ResetError(VUT.vsm); - continue; - } - c = VSL_CursorVSM(VUT.vsl, VUT.vsm, 1); - if (c == NULL) { - VSL_ResetError(VUT.vsl); - continue; - } - VUT.vslq = VSLQ_New(VUT.vsl, &c, VUT.g_arg, VUT.query); - AN(VUT.vslq); - AZ(c); - } - - i = VSLQ_Dispatch(VUT.vslq, func, priv); - if (i == 0) { - /* Nothing to do but wait */ - if (VUT.fo) - fflush(VUT.fo); - VTIM_sleep(0.01); - continue; - } - if (i == -1) { - /* EOF */ - break; - } - - if (VUT.vsm == NULL) - break; - - /* XXX: Make continuation optional */ - - VSLQ_Flush(VUT.vslq, func, priv); - VSLQ_Delete(&VUT.vslq); - AZ(VUT.vslq); - - if (i == -2) { - /* Abandoned */ - VUT_Error(0, "Log abandoned - reopening"); - VSM_Close(VUT.vsm); - } else if (i < -2) { - /* Overrun */ - VUT_Error(0, "Log overrun"); - } - - /* Reconnect VSM */ - while (VUT.vslq == NULL) { - AZ(VUT.r_arg); - AN(VUT.vsm); - VTIM_sleep(0.1); - if (VSM_Open(VUT.vsm)) { - VSM_ResetError(VUT.vsm); - continue; - } - c = VSL_CursorVSM(VUT.vsl, VUT.vsm, 1); - if (c == NULL) { - VSL_ResetError(VUT.vsl); - continue; - } - VUT.vslq = VSLQ_New(VUT.vsl, &c, VUT.g_arg, VUT.query); - AN(VUT.vslq); - AZ(c); - } - } - - if (VUT.vslq != NULL) - VSLQ_Flush(VUT.vslq, func, priv); - - return (i); -} diff --git 
a/bin/varnishlog/vut.h b/bin/varnishlog/vut.h deleted file mode 100644 index dcb5a77..0000000 --- a/bin/varnishlog/vut.h +++ /dev/null @@ -1,66 +0,0 @@ -/*- - * Copyright (c) 2006 Verdens Gang AS - * Copyright (c) 2006-2013 Varnish Software AS - * All rights reserved. - * - * Author: Martin Blix Grydeland - * - * Redistribution and use in source and binary forms, with or without - * modification, are permitted provided that the following conditions - * are met: - * 1. Redistributions of source code must retain the above copyright - * notice, this list of conditions and the following disclaimer. - * 2. Redistributions in binary form must reproduce the above copyright - * notice, this list of conditions and the following disclaimer in the - * documentation and/or other materials provided with the distribution. - * - * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND - * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE - * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE - * ARE DISCLAIMED. IN NO EVENT SHALL AUTHOR OR CONTRIBUTORS BE LIABLE - * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL - * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS - * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) - * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT - * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY - * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF - * SUCH DAMAGE. - * - * Common functions for the utilities - */ - -#include "vdef.h" - -struct VUT { - /* Options */ - int a_opt; - int d_opt; - int g_arg; - char *r_arg; - int u_opt; - char *w_arg; - const char *query; - - /* State */ - struct VSL_data *vsl; - struct VSM_data *vsm; - struct VSLQ *vslq; - FILE *fo; -}; - -extern struct VUT VUT; - -void VUT_Error(int status, const char *fmt, ...) 
- __printflike(2, 3); - -int VUT_g_Arg(const char *arg); - -int VUT_Arg(int opt, const char *arg); - -void VUT_Setup(void); - -void VUT_Init(void); - -void VUT_Fini(void); - -int VUT_Main(VSLQ_dispatch_f *func, void *priv); diff --git a/configure.ac b/configure.ac index 3c9add4..f2efa37 100644 --- a/configure.ac +++ b/configure.ac @@ -588,6 +588,7 @@ AC_CONFIG_FILES([ lib/Makefile lib/libvarnish/Makefile lib/libvarnishapi/Makefile + lib/libvarnishtools/Makefile lib/libvarnishcompat/Makefile lib/libvcl/Makefile lib/libvgz/Makefile diff --git a/include/Makefile.am b/include/Makefile.am index 1ad0935..bc55f85 100644 --- a/include/Makefile.am +++ b/include/Makefile.am @@ -61,7 +61,8 @@ nobase_noinst_HEADERS = \ vss.h \ vtcp.h \ vtim.h \ - vtree.h + vtree.h \ + vut.h # Headers for use with vmods pkgdataincludedir = $(pkgdatadir)/include diff --git a/include/vut.h b/include/vut.h new file mode 100644 index 0000000..dcb5a77 --- /dev/null +++ b/include/vut.h @@ -0,0 +1,66 @@ +/*- + * Copyright (c) 2006 Verdens Gang AS + * Copyright (c) 2006-2013 Varnish Software AS + * All rights reserved. + * + * Author: Martin Blix Grydeland + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. 
IN NO EVENT SHALL AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + * + * Common functions for the utilities + */ + +#include "vdef.h" + +struct VUT { + /* Options */ + int a_opt; + int d_opt; + int g_arg; + char *r_arg; + int u_opt; + char *w_arg; + const char *query; + + /* State */ + struct VSL_data *vsl; + struct VSM_data *vsm; + struct VSLQ *vslq; + FILE *fo; +}; + +extern struct VUT VUT; + +void VUT_Error(int status, const char *fmt, ...) + __printflike(2, 3); + +int VUT_g_Arg(const char *arg); + +int VUT_Arg(int opt, const char *arg); + +void VUT_Setup(void); + +void VUT_Init(void); + +void VUT_Fini(void); + +int VUT_Main(VSLQ_dispatch_f *func, void *priv); diff --git a/lib/Makefile.am b/lib/Makefile.am index 77c5ac5..eb67436 100644 --- a/lib/Makefile.am +++ b/lib/Makefile.am @@ -4,6 +4,7 @@ SUBDIRS = \ libvarnishcompat \ libvarnish \ libvarnishapi \ + libvarnishtools \ libvcl \ libvgz \ libvmod_debug \ @@ -14,6 +15,7 @@ DIST_SUBDIRS = \ libvarnishcompat \ libvarnish \ libvarnishapi \ + libvarnishtools \ libvcl \ libvgz \ libvmod_debug \ diff --git a/lib/libvarnishtools/Makefile.am b/lib/libvarnishtools/Makefile.am new file mode 100644 index 0000000..1a14212 --- /dev/null +++ b/lib/libvarnishtools/Makefile.am @@ -0,0 +1,5 @@ +# + +EXTRA_DIST = \ + vut.c \ + opt2rst.c diff --git a/lib/libvarnishtools/opt2rst.c b/lib/libvarnishtools/opt2rst.c new file mode 100644 index 0000000..ab854ef --- /dev/null +++ b/lib/libvarnishtools/opt2rst.c @@ -0,0 +1,89 @@ +/*- + * Copyright (c) 2006 Verdens Gang AS + * 
Copyright (c) 2006-2013 Varnish Software AS + * All rights reserved. + * + * Author: Martin Blix Grydeland + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. 
+ */ + +#include +#include + +#ifndef OPT2RST_INC +#error "OPT2RST_INC undefined" +#endif + +#define STRINGIFY(x) #x +#define TOSTRING(x) STRINGIFY(x) + +#define VOPT_SYNOPSIS +#define VOPT_FULL +#define VOPT_INC TOSTRING(OPT2RST_INC) +#include "vapi/voptget.h" + +static void +print_nobrackets(const char *s) +{ + for (; *s; s++) { + if (strchr("[]", *s)) + continue; + printf("%c", *s); + } +} + +static void +print_tabbed(const char *string, int tabs) +{ + int i; + const char *c; + + for (c = string; *c; c++) { + if (c == string || *(c - 1) == '\n') + for (i = 0; i < tabs; i++) + printf("\t"); + printf("%c", *c); + } +} + +static void +print_opt(const struct vopt_full *opt) +{ + print_nobrackets(opt->synopsis); + printf("\n\n"); + print_tabbed(opt->ldesc, 1); + printf("\n\n"); +} + +int +main(int argc, char * const *argv) +{ + int i; + + (void)argc; + (void)argv; + for (i = 0; i < sizeof vopt_full / sizeof vopt_full[0]; i++) + print_opt(&vopt_full[i]); + + return (0); +} diff --git a/lib/libvarnishtools/vut.c b/lib/libvarnishtools/vut.c new file mode 100644 index 0000000..29ff56b --- /dev/null +++ b/lib/libvarnishtools/vut.c @@ -0,0 +1,273 @@ +/*- + * Copyright (c) 2006 Verdens Gang AS + * Copyright (c) 2006-2013 Varnish Software AS + * All rights reserved. + * + * Author: Martin Blix Grydeland + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. 
+ * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + * + * Common functions for the utilities + */ + +#include +#include +#include +#include +#include +#include +#include + +#include "compat/daemon.h" +#include "vapi/vsm.h" +#include "vapi/vsc.h" +#include "vapi/vsl.h" +#include "vtim.h" +#include "vas.h" +#include "miniobj.h" + +#include "vut.h" + +struct VUT VUT; + +void +VUT_Error(int status, const char *fmt, ...) 
+{ + va_list ap; + + AN(fmt); + va_start(ap, fmt); + vfprintf(stderr, fmt, ap); + va_end(ap); + fprintf(stderr, "\n"); + + if (status) + exit(status); +} + +int +VUT_g_Arg(const char *arg) +{ + + VUT.g_arg = VSLQ_Name2Grouping(arg, -1); + if (VUT.g_arg == -2) + VUT_Error(1, "Ambigous grouping type: %s", arg); + else if (VUT.g_arg < 0) + VUT_Error(1, "Unknown grouping type: %s", arg); + return (1); +} + +int +VUT_Arg(int opt, const char *arg) +{ + switch (opt) { + case 'a': + /* Binary file append */ + VUT.a_opt = 1; + return (1); + case 'n': + /* Varnish instance */ + if (VUT.vsm == NULL) + VUT.vsm = VSM_New(); + AN(VUT.vsm); + if (VSM_n_Arg(VUT.vsm, arg) <= 0) + VUT_Error(1, "%s", VSM_Error(VUT.vsm)); + return (1); + case 'd': + /* Head */ + VUT.d_opt = 1; + return (1); + case 'g': + /* Grouping */ + return (VUT_g_Arg(arg)); + case 'r': + /* Binary file input */ + REPLACE(VUT.r_arg, arg); + return (1); + case 'u': + /* Unbuffered binary output */ + VUT.u_opt = 1; + return (1); + case 'w': + /* Binary file output */ + REPLACE(VUT.w_arg, arg); + return (1); + default: + AN(VUT.vsl); + return (VSL_Arg(VUT.vsl, opt, arg)); + } +} + +void +VUT_Init(void) +{ + VUT.g_arg = VSL_g_vxid; + AZ(VUT.vsl); + VUT.vsl = VSL_New(); + AN(VUT.vsl); +} + +void +VUT_Setup(void) +{ + struct VSL_cursor *c; + + AN(VUT.vsl); + + /* Input */ + if (VUT.r_arg && VUT.vsm) + VUT_Error(1, "Can't have both -n and -r options"); + if (VUT.r_arg) + c = VSL_CursorFile(VUT.vsl, VUT.r_arg); + else { + if (VUT.vsm == NULL) + /* Default uses VSM with n=hostname */ + VUT.vsm = VSM_New(); + AN(VUT.vsm); + if (VSM_Open(VUT.vsm)) + VUT_Error(1, "Can't open VSM file (%s)", + VSM_Error(VUT.vsm)); + c = VSL_CursorVSM(VUT.vsl, VUT.vsm, !VUT.d_opt); + } + if (c == NULL) + VUT_Error(1, "Can't open log (%s)", VSL_Error(VUT.vsl)); + + /* Output */ + if (VUT.w_arg) { + VUT.fo = VSL_WriteOpen(VUT.vsl, VUT.w_arg, VUT.a_opt, + VUT.u_opt); + if (VUT.fo == NULL) + VUT_Error(1, "Can't open output file (%s)", + 
VSL_Error(VUT.vsl)); + } + + /* Create query */ + VUT.vslq = VSLQ_New(VUT.vsl, &c, VUT.g_arg, VUT.query); + if (VUT.vslq == NULL) + VUT_Error(1, "Query parse error (%s)", VSL_Error(VUT.vsl)); + AZ(c); +} + +void +VUT_Fini(void) +{ + free(VUT.r_arg); + + if (VUT.vslq) + VSLQ_Delete(&VUT.vslq); + if (VUT.vsl) + VSL_Delete(VUT.vsl); + if (VUT.vsm) + VSM_Delete(VUT.vsm); + + memset(&VUT, 0, sizeof VUT); +} + +int +VUT_Main(VSLQ_dispatch_f *func, void *priv) +{ + struct VSL_cursor *c; + int i; + + if (func == NULL) { + if (VUT.w_arg) + func = VSL_WriteTransactions; + else + func = VSL_PrintTransactions; + priv = VUT.fo; + } + + while (1) { + while (VUT.vslq == NULL) { + AZ(VUT.r_arg); + AN(VUT.vsm); + VTIM_sleep(0.1); + if (VSM_Open(VUT.vsm)) { + VSM_ResetError(VUT.vsm); + continue; + } + c = VSL_CursorVSM(VUT.vsl, VUT.vsm, 1); + if (c == NULL) { + VSL_ResetError(VUT.vsl); + continue; + } + VUT.vslq = VSLQ_New(VUT.vsl, &c, VUT.g_arg, VUT.query); + AN(VUT.vslq); + AZ(c); + } + + i = VSLQ_Dispatch(VUT.vslq, func, priv); + if (i == 0) { + /* Nothing to do but wait */ + if (VUT.fo) + fflush(VUT.fo); + VTIM_sleep(0.01); + continue; + } + if (i == -1) { + /* EOF */ + break; + } + + if (VUT.vsm == NULL) + break; + + /* XXX: Make continuation optional */ + + VSLQ_Flush(VUT.vslq, func, priv); + VSLQ_Delete(&VUT.vslq); + AZ(VUT.vslq); + + if (i == -2) { + /* Abandoned */ + VUT_Error(0, "Log abandoned - reopening"); + VSM_Close(VUT.vsm); + } else if (i < -2) { + /* Overrun */ + VUT_Error(0, "Log overrun"); + } + + /* Reconnect VSM */ + while (VUT.vslq == NULL) { + AZ(VUT.r_arg); + AN(VUT.vsm); + VTIM_sleep(0.1); + if (VSM_Open(VUT.vsm)) { + VSM_ResetError(VUT.vsm); + continue; + } + c = VSL_CursorVSM(VUT.vsl, VUT.vsm, 1); + if (c == NULL) { + VSL_ResetError(VUT.vsl); + continue; + } + VUT.vslq = VSLQ_New(VUT.vsl, &c, VUT.g_arg, VUT.query); + AN(VUT.vslq); + AZ(c); + } + } + + if (VUT.vslq != NULL) + VSLQ_Flush(VUT.vslq, func, priv); + + return (i); +} From martin at 
varnish-cache.org Thu Jun 13 10:41:24 2013 From: martin at varnish-cache.org (Martin Blix Grydeland) Date: Thu, 13 Jun 2013 12:41:24 +0200 Subject: [master] c521864 Let the synopsis be generated from the option table files Message-ID: commit c521864292f2119632c8b834650144d41c61ff2b Author: Martin Blix Grydeland Date: Wed May 29 11:42:24 2013 +0200 Let the synopsis be generated from the option table files diff --git a/bin/varnishlog/Makefile.am b/bin/varnishlog/Makefile.am index 2380fb9..d9971fc 100644 --- a/bin/varnishlog/Makefile.am +++ b/bin/varnishlog/Makefile.am @@ -25,18 +25,23 @@ varnishlog_LDADD = \ noinst_PROGRAMS = varnishlog_opt2rst varnishlog_opt2rst_CFLAGS = -DOPT2RST_INC="varnishlog_options.h" varnishlog_opt2rst_SOURCES = \ + varnishlog_options.h \ $(top_builddir)/lib/libvarnishtools/opt2rst.c -BUILT_SOURCES = varnishlog_options.rst +BUILT_SOURCES = varnishlog_options.rst varnishlog_synopsis.rst -EXTRA_DIST = varnishlog_options.rst +EXTRA_DIST = varnishlog_options.rst varnishlog_synopsis.rst +MAINTAINERCLEANFILES = $(EXTRA_DIST) -varnishlog_options.rst: varnishlog_opt2rst - ./varnishlog_opt2rst > $@ +varnishlog_options.rst: varnishlog_options.h | varnishlog_opt2rst + ./varnishlog_opt2rst options > $@ +varnishlog_synopsis.rst: varnishlog_options.h | varnishlog_opt2rst + ./varnishlog_opt2rst synopsis > $@ varnishlog.1: \ $(top_srcdir)/doc/sphinx/reference/varnishlog.rst \ - varnishlog_options.rst + varnishlog_options.rst \ + varnishlog_synopsis.rst if HAVE_RST2MAN ${RST2MAN} $< $@ else diff --git a/doc/sphinx/reference/varnishlog.rst b/doc/sphinx/reference/varnishlog.rst index 3e761f8..a41e56d 100644 --- a/doc/sphinx/reference/varnishlog.rst +++ b/doc/sphinx/reference/varnishlog.rst @@ -19,9 +19,8 @@ Display Varnish logs SYNOPSIS ======== -varnishlog [-a] [-b] [-c] [-C] [-d] [-D] [-i tag] [-I [tag:]regex] [-k -keep] [-n varnish_name] [-P file] [-r file] [--raw] [-s num] [-S] [-u] -[-v] [-V] [-w file] [-x tag] [-X [tag:]regex] +.. 
include:: ../../../bin/varnishlog/varnishlog_synopsis.rst +varnishlog |synopsis| OPTIONS ======= diff --git a/lib/libvarnishtools/opt2rst.c b/lib/libvarnishtools/opt2rst.c index ab854ef..3ad60db 100644 --- a/lib/libvarnishtools/opt2rst.c +++ b/lib/libvarnishtools/opt2rst.c @@ -27,6 +27,7 @@ * SUCH DAMAGE. */ +#include #include #include @@ -75,15 +76,27 @@ print_opt(const struct vopt_full *opt) printf("\n\n"); } +static void +usage(void) +{ + fprintf(stderr, "Usage: opt2rst {synopsis|options}\n"); + exit(1); +} + int main(int argc, char * const *argv) { int i; - (void)argc; - (void)argv; - for (i = 0; i < sizeof vopt_full / sizeof vopt_full[0]; i++) - print_opt(&vopt_full[i]); + if (argc != 2) + usage(); + if (!strcmp(argv[1], "synopsis")) + printf(".. |synopsis| replace:: %s\n", vopt_synopsis); + else if (!strcmp(argv[1], "options")) + for (i = 0; i < sizeof vopt_full / sizeof vopt_full[0]; i++) + print_opt(&vopt_full[i]); + else + usage(); return (0); } From martin at varnish-cache.org Thu Jun 13 10:41:24 2013 From: martin at varnish-cache.org (Martin Blix Grydeland) Date: Thu, 13 Jun 2013 12:41:24 +0200 Subject: [master] d72139a Add -P (PID-file) and -D (daemon mode) options to the utilities libraries. Message-ID: commit d72139a6ab5c801f1898f08e3f192a4cd9cc441c Author: Martin Blix Grydeland Date: Mon May 27 16:29:59 2013 +0200 Add -P (PID-file) and -D (daemon mode) options to the utilities libraries. Make varnishlog use these options. 
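The PID-file handling this commit introduces follows a common Unix pattern: write the process ID to a file at startup and remove the file from an atexit() handler on normal termination. The sketch below shows that pattern standalone, using plain stdio instead of the vpf API the commit actually calls (VPF_Open/VPF_Write/VPF_Remove); `write_pidfile` and `pid_path` are illustrative names, not part of Varnish. The real vpf code additionally locks the PID file (varnishlog links in flopen.c for this) so two instances cannot share one file, a detail omitted here::

```c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

static const char *pid_path;

/* atexit() hook: remove the PID file on normal exit */
static void
remove_pidfile(void)
{
	if (pid_path != NULL) {
		(void)unlink(pid_path);
		pid_path = NULL;
	}
}

/* Write our PID to 'path' and arrange for cleanup at exit.
 * Returns 0 on success, -1 on I/O error. */
static int
write_pidfile(const char *path)
{
	FILE *f;

	f = fopen(path, "w");
	if (f == NULL)
		return (-1);
	(void)fprintf(f, "%ld\n", (long)getpid());
	if (fclose(f) != 0)
		return (-1);
	pid_path = path;
	return (atexit(remove_pidfile));
}
```

Note that the hook only fires on exit()/return-from-main; a SIGKILL leaves a stale file behind, which is why the utilities also install signal handlers for graceful shutdown (a later commit in this series).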
diff --git a/bin/varnishlog/varnishlog_options.h b/bin/varnishlog/varnishlog_options.h index 801ca38..0eb4821 100644 --- a/bin/varnishlog/varnishlog_options.h +++ b/bin/varnishlog/varnishlog_options.h @@ -28,13 +28,16 @@ */ #include "vapi/vapi_options.h" +#include "vut_options.h" VSL_OPT_a VSL_OPT_d +VUT_OPT_D VSL_OPT_g VSL_OPT_i VSM_OPT_n VSM_OPT_N +VUT_OPT_P VSL_OPT_r VSL_OPT_u VSL_OPT_v diff --git a/include/vut.h b/include/vut.h index dcb5a77..a016994 100644 --- a/include/vut.h +++ b/include/vut.h @@ -35,7 +35,9 @@ struct VUT { /* Options */ int a_opt; int d_opt; + int D_opt; int g_arg; + char *P_arg; char *r_arg; int u_opt; char *w_arg; @@ -46,6 +48,7 @@ struct VUT { struct VSM_data *vsm; struct VSLQ *vslq; FILE *fo; + struct vpf_fh *pfh; }; extern struct VUT VUT; diff --git a/include/vut_options.h b/include/vut_options.h new file mode 100644 index 0000000..3034b41 --- /dev/null +++ b/include/vut_options.h @@ -0,0 +1,40 @@ +/*- + * Copyright (c) 2006 Verdens Gang AS + * Copyright (c) 2006-2013 Varnish Software AS + * All rights reserved. + * + * Author: Martin Blix Grydeland + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. 
IN NO EVENT SHALL AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +/* VUT options */ + +#define VUT_OPT_D \ + VOPT("D", "[-D]", "Daemonize", \ + "Daemonize." \ + ) + +#define VUT_OPT_P \ + VOPT("P:", "[-P file]", "PID file", \ + "Write the process' PID to the specified file." \ + ) diff --git a/lib/libvarnishtools/vut.c b/lib/libvarnishtools/vut.c index 29ff56b..f4f99c0 100644 --- a/lib/libvarnishtools/vut.c +++ b/lib/libvarnishtools/vut.c @@ -29,6 +29,8 @@ * Common functions for the utilities */ +#include "config.h" + #include #include #include @@ -38,6 +40,7 @@ #include #include "compat/daemon.h" +#include "vpf.h" #include "vapi/vsm.h" #include "vapi/vsc.h" #include "vapi/vsl.h" @@ -49,6 +52,15 @@ struct VUT VUT; +static void +vut_vpf_remove(void) +{ + if (VUT.pfh) { + VPF_Remove(VUT.pfh); + VUT.pfh = NULL; + } +} + void VUT_Error(int status, const char *fmt, ...) 
{ @@ -84,6 +96,17 @@ VUT_Arg(int opt, const char *arg) /* Binary file append */ VUT.a_opt = 1; return (1); + case 'd': + /* Head */ + VUT.d_opt = 1; + return (1); + case 'D': + /* Daemon mode */ + VUT.D_opt = 1; + return (1); + case 'g': + /* Grouping */ + return (VUT_g_Arg(arg)); case 'n': /* Varnish instance */ if (VUT.vsm == NULL) @@ -92,13 +115,10 @@ VUT_Arg(int opt, const char *arg) if (VSM_n_Arg(VUT.vsm, arg) <= 0) VUT_Error(1, "%s", VSM_Error(VUT.vsm)); return (1); - case 'd': - /* Head */ - VUT.d_opt = 1; + case 'P': + /* PID file */ + REPLACE(VUT.P_arg, arg); return (1); - case 'g': - /* Grouping */ - return (VUT_g_Arg(arg)); case 'r': /* Binary file input */ REPLACE(VUT.r_arg, arg); @@ -165,12 +185,34 @@ VUT_Setup(void) if (VUT.vslq == NULL) VUT_Error(1, "Query parse error (%s)", VSL_Error(VUT.vsl)); AZ(c); + + /* Open PID file */ + if (VUT.P_arg) { + AZ(VUT.pfh); + VUT.pfh = VPF_Open(VUT.P_arg, 0644, NULL); + if (VUT.pfh == NULL) + VUT_Error(1, "%s: %s", VUT.P_arg, strerror(errno)); + } + + /* Daemon mode */ + if (VUT.D_opt && varnish_daemon(0, 0) == -1) + VUT_Error(1, "Daemon mode: %s", strerror(errno)); + + /* Write PID and setup exit handler */ + if (VUT.pfh != NULL) { + VPF_Write(VUT.pfh); + AZ(atexit(&vut_vpf_remove)); + } } void VUT_Fini(void) { free(VUT.r_arg); + free(VUT.P_arg); + + vut_vpf_remove(); + AZ(VUT.pfh); if (VUT.vslq) VSLQ_Delete(&VUT.vslq); From martin at varnish-cache.org Thu Jun 13 10:41:24 2013 From: martin at varnish-cache.org (Martin Blix Grydeland) Date: Thu, 13 Jun 2013 12:41:24 +0200 Subject: [master] 7a4da8d Rotate log on SIGHUP Message-ID: commit 7a4da8d1d21c0aff9ced3ff77bee608d2e7b8932 Author: Martin Blix Grydeland Date: Mon May 27 16:54:26 2013 +0200 Rotate log on SIGHUP diff --git a/include/vapi/vapi_options.h b/include/vapi/vapi_options.h index 6051308..71ef712 100644 --- a/include/vapi/vapi_options.h +++ b/include/vapi/vapi_options.h @@ -92,9 +92,7 @@ " them. 
The file will be overwritten unless the -a option" \ " was specified. If the application receives a SIGHUP" \ " while writing to a file, it will reopen the file" \ - " allowing the old one to be rotated away.\n" \ - "\n" \ - "XXX: Log rotation not yet implemented" \ + " allowing the old one to be rotated away." \ ) #define VSL_OPT_x \ diff --git a/lib/libvarnishtools/vut.c b/lib/libvarnishtools/vut.c index 05e7ff5..9682c7d 100644 --- a/lib/libvarnishtools/vut.c +++ b/lib/libvarnishtools/vut.c @@ -259,6 +259,18 @@ VUT_Main(VSLQ_dispatch_f *func, void *priv) } while (!VUT.sigint) { + if (VUT.w_arg && VUT.sighup) { + /* Rotate log */ + VUT.sighup = 0; + AN(VUT.fo); + fclose(VUT.fo); + VUT.fo = VSL_WriteOpen(VUT.vsl, VUT.w_arg, 0, + VUT.u_opt); + if (VUT.fo == NULL) + VUT_Error(1, "Can't open output file (%s)", + VSL_Error(VUT.vsl)); + } + if (VUT.vslq == NULL) { AZ(VUT.r_arg); AN(VUT.vsm); From martin at varnish-cache.org Thu Jun 13 10:41:24 2013 From: martin at varnish-cache.org (Martin Blix Grydeland) Date: Thu, 13 Jun 2013 12:41:24 +0200 Subject: [master] 0b2ae6b Add signal handlers to utilities library for graceful shutdown Message-ID: commit 0b2ae6b06614cab268393ad6d506dcfbe9a16314 Author: Martin Blix Grydeland Date: Mon May 27 16:31:28 2013 +0200 Add signal handlers to utilities library for graceful shutdown diff --git a/include/vut.h b/include/vut.h index a016994..0fdf8de 100644 --- a/include/vut.h +++ b/include/vut.h @@ -49,6 +49,8 @@ struct VUT { struct VSLQ *vslq; FILE *fo; struct vpf_fh *pfh; + int sighup; + int sigint; }; extern struct VUT VUT; diff --git a/lib/libvarnishtools/vut.c b/lib/libvarnishtools/vut.c index f4f99c0..05e7ff5 100644 --- a/lib/libvarnishtools/vut.c +++ b/lib/libvarnishtools/vut.c @@ -38,6 +38,7 @@ #include #include #include +#include #include "compat/daemon.h" #include "vpf.h" @@ -61,6 +62,20 @@ vut_vpf_remove(void) } } +static void +vut_sighup(int sig) +{ + (void)sig; + VUT.sighup = 1; +} + +static void +vut_sigint(int sig) +{ + 
(void)sig; + VUT.sigint = 1; +} + void VUT_Error(int status, const char *fmt, ...) { @@ -186,6 +201,11 @@ VUT_Setup(void) VUT_Error(1, "Query parse error (%s)", VSL_Error(VUT.vsl)); AZ(c); + /* Signal handlers */ + (void)signal(SIGHUP, vut_sighup); + (void)signal(SIGINT, vut_sigint); + (void)signal(SIGTERM, vut_sigint); + /* Open PID file */ if (VUT.P_arg) { AZ(VUT.pfh); @@ -238,8 +258,8 @@ VUT_Main(VSLQ_dispatch_f *func, void *priv) priv = VUT.fo; } - while (1) { - while (VUT.vslq == NULL) { + while (!VUT.sigint) { + if (VUT.vslq == NULL) { AZ(VUT.r_arg); AN(VUT.vsm); VTIM_sleep(0.1); From martin at varnish-cache.org Thu Jun 13 10:41:24 2013 From: martin at varnish-cache.org (Martin Blix Grydeland) Date: Thu, 13 Jun 2013 12:41:24 +0200 Subject: [master] 90919f4 Remove -D and -P "not yet implemented" messages from varnishlog.rst Message-ID: commit 90919f4ef16f9f1696d6bfdf6961fff17de3124b Author: Martin Blix Grydeland Date: Tue May 28 12:33:57 2013 +0200 Remove -D and -P "not yet implemented" messages from varnishlog.rst diff --git a/doc/sphinx/reference/varnishlog.rst b/doc/sphinx/reference/varnishlog.rst index a41e56d..5625b19 100644 --- a/doc/sphinx/reference/varnishlog.rst +++ b/doc/sphinx/reference/varnishlog.rst @@ -49,12 +49,6 @@ The following options are available: XXX: Not yet implemented --D - - Daemonize. - - XXX: Not yet implemented - -I [tag:]regex Output only records matching this regular expression. If tag @@ -70,12 +64,6 @@ The following options are available: XXX: Not yet implemented --P file - - Write the process' PID to the specified file. 
- - XXX: Not yet implemented - -s num Skip the first num log transactions (or log records if From martin at varnish-cache.org Thu Jun 13 10:41:24 2013 From: martin at varnish-cache.org (Martin Blix Grydeland) Date: Thu, 13 Jun 2013 12:41:24 +0200 Subject: [master] 4617471 Remove -u "not yet implemented" from varnishlog.rst Message-ID: commit 46174712c56e1a3e00942e0d4b24d3797d3cb7ad Author: Martin Blix Grydeland Date: Tue May 28 12:34:18 2013 +0200 Remove -u "not yet implemented" from varnishlog.rst diff --git a/doc/sphinx/reference/varnishlog.rst b/doc/sphinx/reference/varnishlog.rst index 5625b19..d7cc290 100644 --- a/doc/sphinx/reference/varnishlog.rst +++ b/doc/sphinx/reference/varnishlog.rst @@ -71,12 +71,6 @@ The following options are available: XXX: Not yet implemented --u - - Unbuffered output. - - XXX: Not yet implemented - -V Display the version number and exit. From martin at varnish-cache.org Thu Jun 13 10:41:24 2013 From: martin at varnish-cache.org (Martin Blix Grydeland) Date: Thu, 13 Jun 2013 12:41:24 +0200 Subject: [master] 2aeb0ae Add vut_options to dist Message-ID: commit 2aeb0aec46d2ac6a0c44ebc8fa950cebed74220a Author: Martin Blix Grydeland Date: Wed May 29 15:34:47 2013 +0200 Add vut_options to dist diff --git a/include/Makefile.am b/include/Makefile.am index bc55f85..92cc1bd 100644 --- a/include/Makefile.am +++ b/include/Makefile.am @@ -62,7 +62,8 @@ nobase_noinst_HEADERS = \ vtcp.h \ vtim.h \ vtree.h \ - vut.h + vut.h \ + vut_options.h # Headers for use with vmods pkgdataincludedir = $(pkgdatadir)/include From martin at varnish-cache.org Thu Jun 13 10:41:24 2013 From: martin at varnish-cache.org (Martin Blix Grydeland) Date: Thu, 13 Jun 2013 12:41:24 +0200 Subject: [master] bcd9ef1 Implement -I and -X varnishlog options Message-ID: commit bcd9ef1e9a961e0d9f2d9d70ef204bf239d60e97 Author: Martin Blix Grydeland Date: Wed May 29 11:09:06 2013 +0200 Implement -I and -X varnishlog options diff --git a/bin/varnishlog/varnishlog_options.h 
b/bin/varnishlog/varnishlog_options.h index 0eb4821..5c09c58 100644 --- a/bin/varnishlog/varnishlog_options.h +++ b/bin/varnishlog/varnishlog_options.h @@ -35,6 +35,7 @@ VSL_OPT_d VUT_OPT_D VSL_OPT_g VSL_OPT_i +VSL_OPT_I VSM_OPT_n VSM_OPT_N VUT_OPT_P @@ -43,3 +44,4 @@ VSL_OPT_u VSL_OPT_v VSL_OPT_w VSL_OPT_x +VSL_OPT_X diff --git a/doc/sphinx/reference/varnishlog.rst b/doc/sphinx/reference/varnishlog.rst index d7cc290..6c1fd9d 100644 --- a/doc/sphinx/reference/varnishlog.rst +++ b/doc/sphinx/reference/varnishlog.rst @@ -49,14 +49,6 @@ The following options are available: XXX: Not yet implemented --I [tag:]regex - - Output only records matching this regular expression. If tag - is given, limit the regex matching to records of that - tag. Multiple -I options may be given. - - XXX: Not yet implemented - -k num Only show the first num log transactions (or log records @@ -78,15 +70,6 @@ The following options are available: XXX: Not yet implemented --X [tag:]regex - - Do not output log records matching this regex. If tag is - given, limit the regex matching to records of that tag. - Multiple -X options may be given. - - XXX: Not yet implemented - - DESCRIPTION =========== diff --git a/include/vapi/vapi_options.h b/include/vapi/vapi_options.h index 71ef712..a6f316a 100644 --- a/include/vapi/vapi_options.h +++ b/include/vapi/vapi_options.h @@ -43,6 +43,10 @@ /* VSL options */ +#define VSL_iI_PS \ + "If a tag include option is the first of any tag selection" \ + " options, all tags are first marked excluded." + #define VSL_OPT_a \ VOPT("a", "[-a]", "Append binary file output", \ "When writing binary output to a file, append to it rather" \ @@ -63,10 +67,19 @@ #define VSL_OPT_i \ VOPT("i:", "[-i tag]", "Include tag", \ - "Output only this tag. Multiple -i options may be given." \ + "Include log records of this tag in output. 
Multiple -i" \ + " options may be given.\n" \ + "\n" \ + VSL_iI_PS \ + ) + +#define VSL_OPT_I \ + VOPT("I:", "[-I [tag:]regex]", "Include by regex", \ + "Include by regex matching. Output only records matching" \ + " tag and regular expression. Applies to any tag if tag" \ + " is * or empty.\n" \ "\n" \ - "If an -i option is the first of any -ix options, all tags" \ - " are disabled before -ix processing." \ + VSL_iI_PS \ ) #define VSL_OPT_r \ @@ -97,9 +110,13 @@ #define VSL_OPT_x \ VOPT("x:", "[-x tag]", "Exclude tag", \ - "Exclude log records of this tag. Multiple -x options" \ - " may be given.\n" \ - "\n" \ - "If an -x option is the first of any -ix options, all tags" \ - " are enabled for output before -ix processing." \ + "Exclude log records of this tag in output. Multiple -x" \ + " options may be given." \ + ) + +#define VSL_OPT_X \ + VOPT("X:", "[-X [tag:]regex]", "Exclude by regex", \ + "Exclude by regex matching. Do not output records matching" \ + " tag and regular expression. Applies to any tag if tag" \ + " is * or empty." 
\ ) diff --git a/lib/libvarnishapi/vsl.c b/lib/libvarnishapi/vsl.c index 28d0e44..f263862 100644 --- a/lib/libvarnishapi/vsl.c +++ b/lib/libvarnishapi/vsl.c @@ -94,10 +94,27 @@ VSL_New(void) vsl->vbm_select = vbit_init(256); vsl->vbm_supress = vbit_init(256); + VTAILQ_INIT(&vsl->vslf_select); + VTAILQ_INIT(&vsl->vslf_suppress); return (vsl); } +static void +vsl_IX_free(vslf_list *list) +{ + struct vslf *vslf; + + while (!VTAILQ_EMPTY(list)) { + vslf = VTAILQ_FIRST(list); + CHECK_OBJ_NOTNULL(vslf, VSLF_MAGIC); + VTAILQ_REMOVE(list, vslf, list); + AN(vslf->vre); + VRE_free(&vslf->vre); + AZ(vslf->vre); + } +} + void VSL_Delete(struct VSL_data *vsl) { @@ -106,6 +123,8 @@ VSL_Delete(struct VSL_data *vsl) vbit_destroy(vsl->vbm_select); vbit_destroy(vsl->vbm_supress); + vsl_IX_free(&vsl->vslf_select); + vsl_IX_free(&vsl->vslf_suppress); VSL_ResetError(vsl); FREE_OBJ(vsl); } @@ -134,6 +153,29 @@ VSL_ResetError(struct VSL_data *vsl) vsl->diag = NULL; } +static int +vsl_match_IX(struct VSL_data *vsl, vslf_list *list, const struct VSL_cursor *c) +{ + enum VSL_tag_e tag; + const char *cdata; + int len; + const struct vslf *vslf; + + (void)vsl; + tag = VSL_TAG(c->rec.ptr); + cdata = VSL_CDATA(c->rec.ptr); + len = VSL_LEN(c->rec.ptr); + + VTAILQ_FOREACH(vslf, list, list) { + CHECK_OBJ_NOTNULL(vslf, VSLF_MAGIC); + if (vslf->tag >= 0 && vslf->tag != tag) + continue; + if (VRE_exec(vslf->vre, cdata, len, 0, 0, NULL, 0, NULL) >= 0) + return (1); + } + return (0); +} + int VSL_Match(struct VSL_data *vsl, const struct VSL_cursor *c) { @@ -145,8 +187,14 @@ VSL_Match(struct VSL_data *vsl, const struct VSL_cursor *c) tag = VSL_TAG(c->rec.ptr); if (tag <= SLT__Bogus || tag >= SLT__Reserved) return (0); - if (vbit_test(vsl->vbm_select, tag)) + if (!VTAILQ_EMPTY(&vsl->vslf_select) && + vsl_match_IX(vsl, &vsl->vslf_select, c)) + return (1); + else if (vbit_test(vsl->vbm_select, tag)) return (1); + else if (!VTAILQ_EMPTY(&vsl->vslf_suppress) && + vsl_match_IX(vsl, &vsl->vslf_suppress, c)) + 
return (0); else if (vbit_test(vsl->vbm_supress, tag)) return (0); @@ -268,7 +316,9 @@ VSL_PrintTransactions(struct VSL_data *vsl, struct VSL_transaction *pt[], int delim = 0; int verbose; - (void)vsl; + CHECK_OBJ_NOTNULL(vsl, VSL_MAGIC); + if (fo == NULL) + fo = stdout; if (pt[0] == NULL) return (0); diff --git a/lib/libvarnishapi/vsl_api.h b/lib/libvarnishapi/vsl_api.h index 05987a6..989effd 100644 --- a/lib/libvarnishapi/vsl_api.h +++ b/lib/libvarnishapi/vsl_api.h @@ -31,6 +31,7 @@ #include "vdef.h" #include "vqueue.h" +#include "vre.h" #include "vapi/vsm.h" #define VSL_FILE_ID "VSL" @@ -67,6 +68,17 @@ struct vslc { const struct vslc_tbl *tbl; }; +struct vslf { + unsigned magic; +#define VSLF_MAGIC 0x08650B39 + VTAILQ_ENTRY(vslf) list; + + int tag; + vre_t *vre; +}; + +typedef VTAILQ_HEAD(,vslf) vslf_list; + struct VSL_data { unsigned magic; #undef VSL_MAGIC @@ -75,12 +87,16 @@ struct VSL_data { struct vsb *diag; unsigned flags; -#define F_SEEN_ix (1 << 0) +#define F_SEEN_ixIX (1 << 0) /* Bitmaps of -ix selected tags */ struct vbitmap *vbm_select; struct vbitmap *vbm_supress; + /* Lists of -IX filters */ + vslf_list vslf_select; + vslf_list vslf_suppress; + int v_opt; }; diff --git a/lib/libvarnishapi/vsl_arg.c b/lib/libvarnishapi/vsl_arg.c index c0c964a..22af6a4 100644 --- a/lib/libvarnishapi/vsl_arg.c +++ b/lib/libvarnishapi/vsl_arg.c @@ -120,11 +120,7 @@ vsl_ix_arg(struct VSL_data *vsl, int opt, const char *arg) const char *b, *e; CHECK_OBJ_NOTNULL(vsl, VSL_MAGIC); - /* If first option is 'i', set all bits for supression */ - if (opt == 'i' && !(vsl->flags & F_SEEN_ix)) - for (i = 0; i < 256; i++) - vbit_set(vsl->vbm_supress, i); - vsl->flags |= F_SEEN_ix; + vsl->flags |= F_SEEN_ixIX; for (b = arg; *b; b = e) { while (isspace(*b)) @@ -156,11 +152,80 @@ vsl_ix_arg(struct VSL_data *vsl, int opt, const char *arg) return (1); } +static int +vsl_IX_arg(struct VSL_data *vsl, int opt, const char *arg) +{ + int i, l, off; + const char *b, *e, *err; + vre_t *vre; + 
struct vslf *vslf; + + CHECK_OBJ_NOTNULL(vsl, VSL_MAGIC); + vsl->flags |= F_SEEN_ixIX; + + l = 0; + b = arg; + e = strchr(b, ':'); + if (e) { + while (isspace(*b)) + b++; + l = e - b; + while (l > 0 && isspace(b[l - 1])) + l--; + } + if (l > 0 && strncmp(b, "*", l)) + i = VSL_Name2Tag(b, l); + else + i = -3; + if (i == -2) + return (vsl_diag(vsl, + "-%c: \"%*.*s\" matches multiple tags\n", + (char)opt, l, l, b)); + else if (i == -1) + return (vsl_diag(vsl, + "-%c: Could not match \"%*.*s\" to any tag\n", + (char)opt, l, l, b)); + assert(i >= -3); + + if (e) + b = e + 1; + vre = VRE_compile(b, 0, &err, &off); + if (vre == NULL) + return (vsl_diag(vsl, "-%c: Regex error at position %d (%s)\n", + (char)opt, off, err)); + + ALLOC_OBJ(vslf, VSLF_MAGIC); + if (vslf == NULL) { + VRE_free(&vre); + return (vsl_diag(vsl, "Out of memory")); + } + vslf->tag = i; + vslf->vre = vre; + + if (opt == 'I') + VTAILQ_INSERT_TAIL(&vsl->vslf_select, vslf, list); + else { + assert(opt == 'X'); + VTAILQ_INSERT_TAIL(&vsl->vslf_suppress, vslf, list); + } + + return (1); +} + int VSL_Arg(struct VSL_data *vsl, int opt, const char *arg) { + int i; + + CHECK_OBJ_NOTNULL(vsl, VSL_MAGIC); + /* If first option is 'i', set all bits for supression */ + if ((opt == 'i' || opt == 'I') && !(vsl->flags & F_SEEN_ixIX)) + for (i = 0; i < 256; i++) + vbit_set(vsl->vbm_supress, i); + switch (opt) { - case 'i': case'x': return (vsl_ix_arg(vsl, opt, arg)); + case 'i': case 'x': return (vsl_ix_arg(vsl, opt, arg)); + case 'I': case 'X': return (vsl_IX_arg(vsl, opt, arg)); case 'v': vsl->v_opt = 1; return (1); default: return (0); diff --git a/lib/libvarnishtools/vut.c b/lib/libvarnishtools/vut.c index 9682c7d..d3e5abd 100644 --- a/lib/libvarnishtools/vut.c +++ b/lib/libvarnishtools/vut.c @@ -106,6 +106,8 @@ VUT_g_Arg(const char *arg) int VUT_Arg(int opt, const char *arg) { + int i; + switch (opt) { case 'a': /* Binary file append */ @@ -148,7 +150,10 @@ VUT_Arg(int opt, const char *arg) return (1); default: 
AN(VUT.vsl); - return (VSL_Arg(VUT.vsl, opt, arg)); + i = VSL_Arg(VUT.vsl, opt, arg); + if (i < 0) + VUT_Error(1, "%s", VSL_Error(VUT.vsl)); + return (i); + } } From martin at varnish-cache.org Thu Jun 13 10:41:24 2013 From: martin at varnish-cache.org (Martin Blix Grydeland) Date: Thu, 13 Jun 2013 12:41:24 +0200 Subject: [master] ee74961 Fix option RST generation dependent on rst2man to not make use of GNU make order-only prerequisites. Message-ID: commit ee74961527e07870a6055944c886619ff7eee224 Author: Martin Blix Grydeland Date: Wed May 29 15:51:55 2013 +0200 Fix option RST generation dependent on rst2man to not make use of GNU make order-only prerequisites. diff --git a/bin/varnishlog/Makefile.am b/bin/varnishlog/Makefile.am index d9971fc..e5afa45 100644 --- a/bin/varnishlog/Makefile.am +++ b/bin/varnishlog/Makefile.am @@ -9,6 +9,7 @@ dist_man_MANS = varnishlog.1 varnishlog_SOURCES = \ varnishlog.c \ varnishlog_options.h \ + varnishlog_options.c \ $(top_builddir)/lib/libvarnishtools/vut.c \ $(top_builddir)/lib/libvarnish/vas.c \ $(top_builddir)/lib/libvarnish/flopen.c \ @@ -23,21 +24,24 @@ varnishlog_LDADD = \ ${RT_LIBS} ${LIBM} ${PTHREAD_LIBS} noinst_PROGRAMS = varnishlog_opt2rst -varnishlog_opt2rst_CFLAGS = -DOPT2RST_INC="varnishlog_options.h" varnishlog_opt2rst_SOURCES = \ varnishlog_options.h \ + varnishlog_options.c \ $(top_builddir)/lib/libvarnishtools/opt2rst.c BUILT_SOURCES = varnishlog_options.rst varnishlog_synopsis.rst - -EXTRA_DIST = varnishlog_options.rst varnishlog_synopsis.rst +EXTRA_DIST = $(BUILT_SOURCES) MAINTAINERCLEANFILES = $(EXTRA_DIST) -varnishlog_options.rst: varnishlog_options.h | varnishlog_opt2rst +varnishlog_options.rst: ./varnishlog_opt2rst options > $@ -varnishlog_synopsis.rst: varnishlog_options.h | varnishlog_opt2rst +varnishlog_synopsis.rst: ./varnishlog_opt2rst synopsis > $@ +if HAVE_RST2MAN +varnishlog_options.rst varnishlog_synopsis.rst: varnishlog_opt2rst +endif + varnishlog.1: \
$(top_srcdir)/doc/sphinx/reference/varnishlog.rst \ varnishlog_options.rst \ diff --git a/bin/varnishlog/varnishlog.c b/bin/varnishlog/varnishlog.c index 13075fd..d3d9e29 100644 --- a/bin/varnishlog/varnishlog.c +++ b/bin/varnishlog/varnishlog.c @@ -42,6 +42,7 @@ #include "vapi/vsm.h" #include "vapi/vsl.h" +#include "vapi/voptget.h" #include "vas.h" #include "vcs.h" #include "vpf.h" @@ -49,12 +50,6 @@ #include "vtim.h" #include "vut.h" -#define VOPT_OPTSTRING -#define VOPT_SYNOPSIS -#define VOPT_USAGE -#define VOPT_INC "varnishlog_options.h" -#include "vapi/voptget.h" - static void usage(void) { diff --git a/bin/varnishlog/varnishlog_options.c b/bin/varnishlog/varnishlog_options.c new file mode 100644 index 0000000..4b5037d --- /dev/null +++ b/bin/varnishlog/varnishlog_options.c @@ -0,0 +1,35 @@ +/*- + * Copyright (c) 2006 Verdens Gang AS + * Copyright (c) 2006-2013 Varnish Software AS + * All rights reserved. + * + * Author: Martin Blix Grydeland + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. 
IN NO EVENT SHALL AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + * + * Option definitions for varnishlog + */ + +#include +#define VOPT_DEFINITION +#define VOPT_INC "varnishlog_options.h" +#include "vapi/voptget.h" diff --git a/include/vapi/voptget.h b/include/vapi/voptget.h index 443d316..4f30848 100644 --- a/include/vapi/voptget.h +++ b/include/vapi/voptget.h @@ -35,47 +35,48 @@ * l: Long description */ -#ifdef VOPT_OPTSTRING +extern const char vopt_optstring[]; +extern const char vopt_synopsis[]; +extern const char *vopt_usage[]; +struct vopt_list { + const char *option; + const char *synopsis; + const char *desc; + const char *ldesc; +}; +extern const struct vopt_list vopt_list[]; +extern unsigned vopt_list_n; + +#ifdef VOPT_DEFINITION + +#ifndef VOPT_INC +#error "VOPT_INC undefined" +#endif + #define VOPT(o,s,d,l) o const char vopt_optstring[] = #include VOPT_INC ; #undef VOPT -#endif -#ifdef VOPT_SYNOPSIS #define VOPT(o,s,d,l) " " s const char vopt_synopsis[] = #include VOPT_INC ; #undef VOPT -#endif -#ifdef VOPT_USAGE #define VOPT(o,s,d,l) s, d, const char *vopt_usage[] = { #include VOPT_INC NULL, NULL, }; #undef VOPT -#endif -#ifndef VOPTGET_H -#define VOPTGET_H - -struct vopt_full { - const char *option; - const char *synopsis; - const char *desc; - const char *ldesc; -}; - -#endif - -#ifdef VOPT_FULL #define VOPT(o,s,d,l) { o,s,d,l }, -const struct vopt_full vopt_full[] = { +const struct vopt_list vopt_list[] = { #include VOPT_INC }; #undef VOPT -#endif +unsigned vopt_list_n = sizeof vopt_list / 
sizeof vopt_list[0]; + +#endif /* VOPT_DEFINITION */ diff --git a/lib/libvarnishtools/opt2rst.c b/lib/libvarnishtools/opt2rst.c index 3ad60db..1ed08e0 100644 --- a/lib/libvarnishtools/opt2rst.c +++ b/lib/libvarnishtools/opt2rst.c @@ -31,16 +31,6 @@ #include #include -#ifndef OPT2RST_INC -#error "OPT2RST_INC undefined" -#endif - -#define STRINGIFY(x) #x -#define TOSTRING(x) STRINGIFY(x) - -#define VOPT_SYNOPSIS -#define VOPT_FULL -#define VOPT_INC TOSTRING(OPT2RST_INC) #include "vapi/voptget.h" static void @@ -68,7 +58,7 @@ print_tabbed(const char *string, int tabs) } static void -print_opt(const struct vopt_full *opt) +print_opt(const struct vopt_list *opt) { print_nobrackets(opt->synopsis); printf("\n\n"); @@ -93,8 +83,8 @@ main(int argc, char * const *argv) if (!strcmp(argv[1], "synopsis")) printf(".. |synopsis| replace:: %s\n", vopt_synopsis); else if (!strcmp(argv[1], "options")) - for (i = 0; i < sizeof vopt_full / sizeof vopt_full[0]; i++) - print_opt(&vopt_full[i]); + for (i = 0; i < vopt_list_n; i++) + print_opt(&vopt_list[i]); else usage(); From martin at varnish-cache.org Thu Jun 13 10:41:24 2013 From: martin at varnish-cache.org (Martin Blix Grydeland) Date: Thu, 13 Jun 2013 12:41:24 +0200 Subject: [master] 013daf9 Fix query test reversal bug Message-ID: commit 013daf9841c12024dc634ffdfd1bacfa4f8d6088 Author: Martin Blix Grydeland Date: Tue Jun 11 15:35:51 2013 +0200 Fix query test reversal bug diff --git a/lib/libvarnishapi/vsl_dispatch.c b/lib/libvarnishapi/vsl_dispatch.c index 8256352..79fb02a 100644 --- a/lib/libvarnishapi/vsl_dispatch.c +++ b/lib/libvarnishapi/vsl_dispatch.c @@ -749,7 +749,7 @@ vslq_callback(struct VSLQ *vslq, struct vtx *vtx, VSLQ_dispatch_f *func, ptrans[i] = NULL; /* Query test goes here */ - if (vslq->query != NULL && vslq_runquery(vslq->query, ptrans)) + if (vslq->query != NULL && !vslq_runquery(vslq->query, ptrans)) return (0); /* Callback */ @@ -859,7 +859,7 @@ vslq_raw(struct VSLQ *vslq, VSLQ_dispatch_f *func, void *priv) 
trans.vxid = VSL_ID(c->rec.ptr); /* Query check goes here */ - if (vslq->query != NULL && vslq_runquery(vslq->query, ptrans)) + if (vslq->query != NULL && !vslq_runquery(vslq->query, ptrans)) continue; /* Callback */ From martin at varnish-cache.org Thu Jun 13 10:41:24 2013 From: martin at varnish-cache.org (Martin Blix Grydeland) Date: Thu, 13 Jun 2013 12:41:24 +0200 Subject: [master] 673a855 Fix error message print on undefined macro expansion in varnishtest. It would print to end of line instead of parsed macro name length. Message-ID: commit 673a8555d859a3785597480c0adc7d7ec98bd4dd Author: Martin Blix Grydeland Date: Wed Jun 5 11:11:35 2013 +0200 Fix error message print on undefined macro expansion in varnishtest. It would print to end of line instead of parsed macro name length. diff --git a/bin/varnishtest/vtc.c b/bin/varnishtest/vtc.c index a2c0571..aeb4296 100644 --- a/bin/varnishtest/vtc.c +++ b/bin/varnishtest/vtc.c @@ -196,7 +196,8 @@ macro_expand(struct vtclog *vl, const char *text) m = macro_get(p, q); if (m == NULL) { VSB_delete(vsb); - vtc_log(vl, 0, "Macro ${%s} not found", p); + vtc_log(vl, 0, "Macro ${%.*s} not found", (int)(q - p), + p); return (NULL); } VSB_printf(vsb, "%s", m); From tfheen at varnish-cache.org Thu Jun 13 12:34:28 2013 From: tfheen at varnish-cache.org (Tollef Fog Heen) Date: Thu, 13 Jun 2013 14:34:28 +0200 Subject: [3.0] 9f83e8f Compile fix for Solaris Message-ID: commit 9f83e8fdbd2a54047e4fa909758260db59a5e4b6 Author: Tollef Fog Heen Date: Thu Jun 13 14:34:21 2013 +0200 Compile fix for Solaris diff --git a/bin/varnishd/cache_waiter_ports.c b/bin/varnishd/cache_waiter_ports.c index 6973080..a46e35a 100644 --- a/bin/varnishd/cache_waiter_ports.c +++ b/bin/varnishd/cache_waiter_ports.c @@ -250,7 +250,7 @@ static void vca_ports_pass(struct sess *sp) { int r; - r = port_send(vws->dport, 0, TRUST_ME(sp)); + r = port_send(solaris_dport, 0, TRUST_ME(sp)); if (r == -1 && errno == EAGAIN) { VSC_C_main->sess_pipe_overflow++; 
vca_close_session(sp, "session pipe overflow"); From martin at varnish-cache.org Thu Jun 13 12:47:47 2013 From: martin at varnish-cache.org (Martin Blix Grydeland) Date: Thu, 13 Jun 2013 14:47:47 +0200 Subject: [master] 6b2a3a3 Fix rst option generation Message-ID: commit 6b2a3a3909275a6cc0dbeccbb5356c732f176c9c Author: Martin Blix Grydeland Date: Thu Jun 13 14:47:26 2013 +0200 Fix rst option generation diff --git a/include/vapi/vapi_options.h b/include/vapi/vapi_options.h index a6f316a..df3dc83 100644 --- a/include/vapi/vapi_options.h +++ b/include/vapi/vapi_options.h @@ -60,7 +60,7 @@ ) #define VSL_OPT_g \ - VOPT("g:", "[-g {session|request|vxid|raw}]", "Grouping mode", \ + VOPT("g:", "[-g ]", "Grouping mode", \ "The grouping of the log records. The default is to group" \ " by request." \ ) @@ -74,7 +74,7 @@ ) #define VSL_OPT_I \ - VOPT("I:", "[-I [tag:]regex]", "Include by regex", \ + VOPT("I:", "[-I <[tag:]regex>]", "Include by regex", \ "Include by regex matching. Output only records matching" \ " tag and regular expression. Applies to any tag if tag" \ " is * or empty.\n" \ @@ -115,7 +115,7 @@ ) #define VSL_OPT_X \ - VOPT("X:", "[-X [tag:]regex]", "Exclude by regex", \ + VOPT("X:", "[-X <[tag:]regex>]", "Exclude by regex", \ "Exclude by regex matching. Do not output records matching" \ " tag and regular expression. Applies to any tag if tag" \ " is * or empty." 
\ diff --git a/lib/libvarnishtools/opt2rst.c b/lib/libvarnishtools/opt2rst.c index 1ed08e0..68d5200 100644 --- a/lib/libvarnishtools/opt2rst.c +++ b/lib/libvarnishtools/opt2rst.c @@ -30,17 +30,29 @@ #include #include #include +#include #include "vapi/voptget.h" static void print_nobrackets(const char *s) { - for (; *s; s++) { - if (strchr("[]", *s)) - continue; - printf("%c", *s); + const char *e; + + /* Remove whitespace */ + while (isspace(*s)) + s++; + e = s + strlen(s); + while (e > s && isspace(e[-1])) + e--; + + /* Remove outer layer brackets if present */ + if (e > s && *s == '[' && e[-1] == ']') { + s++; + e--; } + + printf("%.*s", (int)(e - s), s); } static void From martin at varnish-cache.org Thu Jun 13 14:01:18 2013 From: martin at varnish-cache.org (Martin Blix Grydeland) Date: Thu, 13 Jun 2013 16:01:18 +0200 Subject: [master] 3719a3b Use top_srcdir instead of top_builddir in varnishlog Makefile.am when referring to source files. Message-ID: commit 3719a3b08c539ca337cc54b2bed1dfc083d77498 Author: Martin Blix Grydeland Date: Thu Jun 13 16:00:41 2013 +0200 Use top_srcdir instead of top_builddir in varnishlog Makefile.am when referring to source files. 
diff --git a/bin/varnishlog/Makefile.am b/bin/varnishlog/Makefile.am index e5afa45..5483e78 100644 --- a/bin/varnishlog/Makefile.am +++ b/bin/varnishlog/Makefile.am @@ -10,13 +10,13 @@ varnishlog_SOURCES = \ varnishlog.c \ varnishlog_options.h \ varnishlog_options.c \ - $(top_builddir)/lib/libvarnishtools/vut.c \ - $(top_builddir)/lib/libvarnish/vas.c \ - $(top_builddir)/lib/libvarnish/flopen.c \ - $(top_builddir)/lib/libvarnish/version.c \ - $(top_builddir)/lib/libvarnish/vsb.c \ - $(top_builddir)/lib/libvarnish/vpf.c \ - $(top_builddir)/lib/libvarnish/vtim.c + $(top_srcdir)/lib/libvarnishtools/vut.c \ + $(top_srcdir)/lib/libvarnish/vas.c \ + $(top_srcdir)/lib/libvarnish/flopen.c \ + $(top_srcdir)/lib/libvarnish/version.c \ + $(top_srcdir)/lib/libvarnish/vsb.c \ + $(top_srcdir)/lib/libvarnish/vpf.c \ + $(top_srcdir)/lib/libvarnish/vtim.c varnishlog_LDADD = \ $(top_builddir)/lib/libvarnishcompat/libvarnishcompat.la \ @@ -27,7 +27,7 @@ noinst_PROGRAMS = varnishlog_opt2rst varnishlog_opt2rst_SOURCES = \ varnishlog_options.h \ varnishlog_options.c \ - $(top_builddir)/lib/libvarnishtools/opt2rst.c + $(top_srcdir)/lib/libvarnishtools/opt2rst.c BUILT_SOURCES = varnishlog_options.rst varnishlog_synopsis.rst EXTRA_DIST = $(BUILT_SOURCES) From perbu at varnish-cache.org Thu Jun 13 19:23:25 2013 From: perbu at varnish-cache.org (Per Buer) Date: Thu, 13 Jun 2013 21:23:25 +0200 Subject: [master] 5bb415c More on crashing varnishes. Message-ID: commit 5bb415cc58ed8f52db8b6802a9b85e9218b54ecc Author: Per Buer Date: Thu Jun 13 21:19:45 2013 +0200 More on crashing varnishes. diff --git a/doc/sphinx/users-guide/troubleshooting.rst b/doc/sphinx/users-guide/troubleshooting.rst index 1735c2b..c399958 100644 --- a/doc/sphinx/users-guide/troubleshooting.rst +++ b/doc/sphinx/users-guide/troubleshooting.rst @@ -51,36 +51,51 @@ of Varnish. If this doesn't help try strace or truss or come find us on IRC. 
-Varnish is crashing
-------------------
+Varnish is crashing - panics
+----------------------------
+
+When Varnish goes bust the child process crashes. Most of the
+crashes are caught by one of the many consistency checks spread around
+the Varnish source code. When Varnish hits one of these the caching
+process will crash itself in a controlled manner, leaving a nice
+stack trace with the mother process.
+
+You can inspect any panic messages by typing panic.show in the CLI.
+
+| panic.show
+| Last panic at: Tue, 15 Mar 2011 13:09:05 GMT
+| Assert error in ESI_Deliver(), cache_esi_deliver.c line 354:
+| Condition(i == Z_OK || i == Z_STREAM_END) not true.
+| thread = (cache-worker)
+| ident = Linux,2.6.32-28-generic,x86_64,-sfile,-smalloc,-hcritbit,epoll
+| Backtrace:
+| 0x42cbe8: pan_ic+b8
+| 0x41f778: ESI_Deliver+438
+| 0x42f838: RES_WriteObj+248
+| 0x416a70: cnt_deliver+230
+| 0x4178fd: CNT_Session+31d
+| (..)
+
+The crash might be due to misconfiguration or a bug. If you suspect it
+is a bug you can use the output in a bug report.
+
+Varnish is crashing - segfaults
+-------------------------------
+
+Sometimes a bug escapes the consistency checks and Varnish gets hit
+with a segmentation fault. When this happens in the child process it
+is logged, the core is dumped and the child process starts up again.
+
+A core dump is usually due to a bug in Varnish. However, in order to
+debug a segfault the developers need you to provide a fair bit of
+data.
+
+ * Make sure you have Varnish installed with symbols
+ * Make sure core dumps are enabled (ulimit)
+
+Once you have the core, open it with gdb and issue the command "bt"
+to get a stack trace of the thread that caused the segfault.

-When varnish goes bust the child processes crashes. Usually the mother
-process will manage this by restarting the child process again. Any
-errors will be logged in syslog.
It might look like this:: - - Mar 8 13:23:38 smoke varnishd[15670]: Child (15671) not responding to CLI, killing it. - Mar 8 13:23:43 smoke varnishd[15670]: last message repeated 2 times - Mar 8 13:23:43 smoke varnishd[15670]: Child (15671) died signal=3 - Mar 8 13:23:43 smoke varnishd[15670]: Child cleanup complete - Mar 8 13:23:43 smoke varnishd[15670]: child (15697) Started - -In this situation the mother process assumes that the cache died and -killed it off. - -In certain situation the child process might crash itself. This might -happen because internal integrity checks fail as a result of a bug. - -In these situations the child will start back up again right away but -the cache will be cleared. A panic is logged with the mother -process. You can inspect the stack trace with the CLI command -panic.show. - -Some of these situations might be caused by bugs, other by -misconfigations. Often we see varnish running out of session -workspace, which will result in the child aborting its execution. - -In a rare event you might also see a segmentation fault or bus -error. These are either bugs, kernel- or hardware failures. 
Varnish gives me Guru meditation -------------------------------- From perbu at varnish-cache.org Thu Jun 13 19:23:25 2013 From: perbu at varnish-cache.org (Per Buer) Date: Thu, 13 Jun 2013 21:23:25 +0200 Subject: [master] 3df0543 rh/centos config location Message-ID: commit 3df05439cb2123ac97c5776395e5f9ab4bc71e19 Author: Per Buer Date: Thu Jun 13 21:20:32 2013 +0200 rh/centos config location diff --git a/doc/sphinx/tutorial/putting_varnish_on_port_80.rst b/doc/sphinx/tutorial/putting_varnish_on_port_80.rst index 05b8d17..db3c7fb 100644 --- a/doc/sphinx/tutorial/putting_varnish_on_port_80.rst +++ b/doc/sphinx/tutorial/putting_varnish_on_port_80.rst @@ -36,7 +36,7 @@ Red Hat EL / Centos ~~~~~~~~~~~~~~~~~~~ On Red Hat EL / Centos -On Red Hat/Centos it is +On Red Hat/Centos it is /etc/sysconfig/varnish Restarting Varnish From phk at varnish-cache.org Fri Jun 14 08:38:46 2013 From: phk at varnish-cache.org (Poul-Henning Kamp) Date: Fri, 14 Jun 2013 10:38:46 +0200 Subject: [master] 2afeca2 Make bmake happy Message-ID: commit 2afeca25f454bb9c07f305bf25294486c96da77e Author: Poul-Henning Kamp Date: Fri Jun 14 08:38:34 2013 +0000 Make bmake happy diff --git a/bin/varnishlog/Makefile.am b/bin/varnishlog/Makefile.am index 5483e78..05dbcbe 100644 --- a/bin/varnishlog/Makefile.am +++ b/bin/varnishlog/Makefile.am @@ -47,7 +47,7 @@ varnishlog.1: \ varnishlog_options.rst \ varnishlog_synopsis.rst if HAVE_RST2MAN - ${RST2MAN} $< $@ + ${RST2MAN} $? $@ else @echo "========================================" @echo "You need rst2man installed to make dist" From phk at varnish-cache.org Fri Jun 14 08:43:07 2013 From: phk at varnish-cache.org (Poul-Henning Kamp) Date: Fri, 14 Jun 2013 10:43:07 +0200 Subject: [master] 31b1260 Also try to make rst2man happy. Message-ID: commit 31b1260cd0124c6fe2316a0383594b85cda0e562 Author: Poul-Henning Kamp Date: Fri Jun 14 08:42:48 2013 +0000 Also try to make rst2man happy. 
diff --git a/bin/varnishlog/Makefile.am b/bin/varnishlog/Makefile.am index 05dbcbe..7c10b8f 100644 --- a/bin/varnishlog/Makefile.am +++ b/bin/varnishlog/Makefile.am @@ -47,7 +47,7 @@ varnishlog.1: \ varnishlog_options.rst \ varnishlog_synopsis.rst if HAVE_RST2MAN - ${RST2MAN} $? $@ + cat $? | ${RST2MAN} - $@ else @echo "========================================" @echo "You need rst2man installed to make dist" From phk at varnish-cache.org Fri Jun 14 10:15:52 2013 From: phk at varnish-cache.org (Poul-Henning Kamp) Date: Fri, 14 Jun 2013 12:15:52 +0200 Subject: [master] 5c7f22e Specify the filename directly, there seems to be no common macro between bmake and gmake that can do this for us. Message-ID: commit 5c7f22edd2b06bda9ab40db7011bf0ed2edf4cf3 Author: Poul-Henning Kamp Date: Fri Jun 14 10:15:26 2013 +0000 Specify the filename directly, there seems to be no common macro between bmake and gmake that can do this for us. diff --git a/bin/varnishlog/Makefile.am b/bin/varnishlog/Makefile.am index 7c10b8f..a757b61 100644 --- a/bin/varnishlog/Makefile.am +++ b/bin/varnishlog/Makefile.am @@ -47,7 +47,7 @@ varnishlog.1: \ varnishlog_options.rst \ varnishlog_synopsis.rst if HAVE_RST2MAN - cat $? 
| ${RST2MAN} - $@ + ${RST2MAN} $(top_srcdir)/doc/sphinx/reference/varnishlog.rst $@ else @echo "========================================" @echo "You need rst2man installed to make dist" From martin at varnish-cache.org Fri Jun 14 11:59:19 2013 From: martin at varnish-cache.org (Martin Blix Grydeland) Date: Fri, 14 Jun 2013 13:59:19 +0200 Subject: [master] b979040 Fix 'make distcheck' on freebsd Message-ID: commit b9790408c99eb980ff6bff200f716b1a65643de6 Author: Martin Blix Grydeland Date: Fri Jun 14 13:58:56 2013 +0200 Fix 'make distcheck' on freebsd diff --git a/doc/sphinx/Makefile.am b/doc/sphinx/Makefile.am index 579b6c8..9818e81 100644 --- a/doc/sphinx/Makefile.am +++ b/doc/sphinx/Makefile.am @@ -161,7 +161,8 @@ EXTRA_DIST = \ users-guide/vcl-saint-and-grace.rst \ users-guide/vcl-syntax.rst \ users-guide/vcl-variables.rst \ - users-guide/websockets.rst + users-guide/websockets.rst \ + include/vcl-backends.rst dist-hook: diff --git a/lib/libvcl/generate.py b/lib/libvcl/generate.py index 40fd1e1..c4a8a49 100755 --- a/lib/libvcl/generate.py +++ b/lib/libvcl/generate.py @@ -447,7 +447,7 @@ vcltypes = { 'STRING_LIST': "void*", } -fi = open(buildroot + "/include/vrt.h") +fi = open(srcroot + "/include/vrt.h") for i in fi: j = i.split(); From martin at varnish-cache.org Fri Jun 14 14:11:30 2013 From: martin at varnish-cache.org (Martin Blix Grydeland) Date: Fri, 14 Jun 2013 16:11:30 +0200 Subject: [master] 63f7dfa Add logexpect to varnishtest. Message-ID: commit 63f7dfa0b52d09f8da86c97078a2e496a4f59b83 Author: Martin Blix Grydeland Date: Wed Jun 5 11:13:06 2013 +0200 Add logexpect to varnishtest. This allows checking order and contents of VSL records in varnishtest. 
diff --git a/bin/varnishtest/Makefile.am b/bin/varnishtest/Makefile.am index b26e3cb..2ea62f5 100644 --- a/bin/varnishtest/Makefile.am +++ b/bin/varnishtest/Makefile.am @@ -30,7 +30,8 @@ varnishtest_SOURCES = \ vtc_log.c \ vtc_sema.c \ vtc_server.c \ - vtc_varnish.c + vtc_varnish.c \ + vtc_logexp.c varnishtest_LDADD = \ $(top_builddir)/lib/libvarnish/libvarnish.la \ diff --git a/bin/varnishtest/tests/README b/bin/varnishtest/tests/README index dc08fc5..c181fcc 100644 --- a/bin/varnishtest/tests/README +++ b/bin/varnishtest/tests/README @@ -18,6 +18,7 @@ Naming scheme id ~ [c] --> Complex functionality tests id ~ [e] --> ESI tests id ~ [g] --> GZIP tests + id ~ [l] --> VSL tests id ~ [m] --> VMOD tests id ~ [p] --> Persistent tests id ~ [r] --> Regression tests, same number as ticket diff --git a/bin/varnishtest/tests/l00000.vtc b/bin/varnishtest/tests/l00000.vtc new file mode 100644 index 0000000..90d9a7a --- /dev/null +++ b/bin/varnishtest/tests/l00000.vtc @@ -0,0 +1,50 @@ +varnishtest "test logexpect" + +server s1 { + rxreq + txresp +} -start + +varnish v1 -vcl+backend { +} -start + +logexpect l1 -v v1 -g session { + expect 0 1000 Begin sess + expect 0 = SessOpen + expect 0 = Link "req 1001" + expect 0 = SessClose + expect 0 = End + + expect 0 * Begin "req 1000" + expect 0 = ReqMethod GET + expect 0 = ReqURL / + expect 0 = ReqProtocol HTTP/1.1 + expect * = ReqHeader "Foo: bar" + expect * = Link bereq + expect * = ReqEnd + expect 0 = End + + expect 0 1002 Begin "bereq 1001" + expect * = End +} -start + +# Check with a query (this selects only the backend request) +logexpect l2 -v v1 -g vxid -q "bereq 1001" { + expect 0 1002 Begin + expect * = End +} -start + +client c1 { + txreq -hdr "Foo: bar" + rxresp + expect resp.status == 200 +} -run + +logexpect l1 -wait +logexpect l2 -wait + +# Check -d arg +logexpect l1 -d 1 { + expect 0 1000 Begin sess + expect * = SessClose +} -run diff --git a/bin/varnishtest/vtc.c b/bin/varnishtest/vtc.c index aeb4296..bea1298 100644 
--- a/bin/varnishtest/vtc.c +++ b/bin/varnishtest/vtc.c @@ -535,6 +535,7 @@ static const struct cmds cmds[] = { { "sema", cmd_sema }, { "random", cmd_random }, { "feature", cmd_feature }, + { "logexpect", cmd_logexp }, { NULL, NULL } }; diff --git a/bin/varnishtest/vtc.h b/bin/varnishtest/vtc.h index 333a1df..85763df 100644 --- a/bin/varnishtest/vtc.h +++ b/bin/varnishtest/vtc.h @@ -63,6 +63,7 @@ cmd_f cmd_server; cmd_f cmd_client; cmd_f cmd_varnish; cmd_f cmd_sema; +cmd_f cmd_logexp; extern volatile sig_atomic_t vtc_error; /* Error, bail out */ extern int vtc_stop; /* Abandon current test, no error */ diff --git a/bin/varnishtest/vtc_logexp.c b/bin/varnishtest/vtc_logexp.c new file mode 100644 index 0000000..fc58cae --- /dev/null +++ b/bin/varnishtest/vtc_logexp.c @@ -0,0 +1,579 @@ +/*- + * Copyright (c) 2008-2013 Varnish Software AS + * All rights reserved. + * + * Author: Martin Blix Grydeland + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. 
IN NO EVENT SHALL AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +/* + * Synopsis: + * -v + * -d <0|1> (head/tail mode) + * -g + * -q + * + * logexpect lN -v [-g ] [-d 0|1] [-q query] { + * expect + * } + * + * skip: [uint|*] Max number of record to skip + * vxid: [uint|*|=] vxid to match + * tag: [tagname|*|=] Tag to match against + * regex: regular expression to match against (optional) + * *: Match anything + * =: Match value of last examined log record + */ + +#include "config.h" + +#include +#include +#include +#include +#include + +#include "vapi/vsm.h" +#include "vapi/vsl.h" +#include "vtim.h" +#include "vqueue.h" +#include "miniobj.h" +#include "vas.h" +#include "vre.h" + +#include "vtc.h" + +#define LE_ANY (-1) +#define LE_LAST (-2) + +struct logexp_test { + unsigned magic; +#define LOGEXP_TEST_MAGIC 0x6F62B350 + VTAILQ_ENTRY(logexp_test) list; + + struct vsb *str; + int vxid; + int tag; + vre_t *vre; + int skip_max; +}; + +struct logexp { + unsigned magic; +#define LOGEXP_MAGIC 0xE81D9F1B + VTAILQ_ENTRY(logexp) list; + + char *name; + struct vtclog *vl; + char run; + char *spec; + VTAILQ_HEAD(,logexp_test) tests; + + struct logexp_test *test; + int skip_cnt; + int vxid_last; + int tag_last; + + int d_arg; + int g_arg; + char *query; + + struct VSM_data *vsm; + struct VSL_data *vsl; + struct VSLQ *vslq; + pthread_t tp; +}; + +#define VSL_SLEEP_USEC (50*1000) + +static VTAILQ_HEAD(, logexp) logexps = + VTAILQ_HEAD_INITIALIZER(logexps); + +static void +logexp_delete_tests(struct logexp *le) +{ + 
struct logexp_test *test; + + CHECK_OBJ_NOTNULL(le, LOGEXP_MAGIC); + while ((test = VTAILQ_FIRST(&le->tests))) { + CHECK_OBJ_NOTNULL(test, LOGEXP_TEST_MAGIC); + VTAILQ_REMOVE(&le->tests, test, list); + VSB_delete(test->str); + if (test->vre) + VRE_free(&test->vre); + FREE_OBJ(test); + } +} + +static void +logexp_delete(struct logexp *le) +{ + CHECK_OBJ_NOTNULL(le, LOGEXP_MAGIC); + AZ(le->run); + AZ(le->vsl); + AZ(le->vslq); + logexp_delete_tests(le); + free(le->name); + free(le->query); + VSM_Delete(le->vsm); + FREE_OBJ(le); +} + +static struct logexp * +logexp_new(const char *name) +{ + struct logexp *le; + + AN(name); + ALLOC_OBJ(le, LOGEXP_MAGIC); + AN(le); + REPLACE(le->name, name); + le->vl = vtc_logopen(name); + VTAILQ_INIT(&le->tests); + + le->d_arg = 0; + le->g_arg = VSL_g_vxid; + le->vsm = VSM_New(); + AN(le->vsm); + + VTAILQ_INSERT_TAIL(&logexps, le, list); + return (le); +} + +static void +logexp_next(struct logexp *le) +{ + CHECK_OBJ_NOTNULL(le, LOGEXP_MAGIC); + + if (le->test) { + CHECK_OBJ_NOTNULL(le->test, LOGEXP_TEST_MAGIC); + le->test = VTAILQ_NEXT(le->test, list); + } else + le->test = VTAILQ_FIRST(&le->tests); + + CHECK_OBJ_ORNULL(le->test, LOGEXP_TEST_MAGIC); + if (le->test) + vtc_log(le->vl, 3, "tst| %s", VSB_data(le->test->str)); +} + +static int +logexp_dispatch(struct VSL_data *vsl, struct VSL_transaction *pt[], void *priv) +{ + struct logexp *le; + struct VSL_transaction *t; + int i; + int ok, skip; + int vxid, tag, type, len, lvl; + const char *legend, *data; + + (void)vsl; + CAST_OBJ_NOTNULL(le, priv, LOGEXP_MAGIC); + + for (i = 0; (t = pt[i]); i++) { + while (1 == VSL_Next(t->c)) { + CHECK_OBJ_NOTNULL(le->test, LOGEXP_TEST_MAGIC); + AN(t->c->rec.ptr); + vxid = VSL_ID(t->c->rec.ptr); + tag = VSL_TAG(t->c->rec.ptr); + data = VSL_CDATA(t->c->rec.ptr); + len = VSL_LEN(t->c->rec.ptr); + + if (tag == SLT__Batch) + continue; + + ok = 1; + if (le->test->vxid == LE_LAST) { + if (le->vxid_last != vxid) + ok = 0; + } else if (le->test->vxid >= 0) { 
+ if (le->test->vxid != vxid) + ok = 0; + } + if (le->test->tag == LE_LAST) { + if (le->tag_last != tag) + ok = 0; + } else if (le->test->tag >= 0) { + if (le->test->tag != tag) + ok = 0; + } + if (le->test->vre && + VRE_ERROR_NOMATCH == VRE_exec(le->test->vre, data, + len, 0, 0, NULL, 0, NULL)) + ok = 0; + + skip = 0; + if (!ok && (le->test->skip_max == LE_ANY || + le->test->skip_max > le->skip_cnt)) + skip = 1; + + if (ok) { + lvl = 4; + legend = "ok"; + } else if (skip) { + lvl = 4; + legend = "skp"; + } else { + lvl = 0; + legend = "err"; + } + type = VSL_CLIENT(t->c->rec.ptr) ? 'c' : + VSL_BACKEND(t->c->rec.ptr) ? 'b' : '-'; + + vtc_log(le->vl, lvl, "%3s| %10u %-15s %c %.*s", + legend, vxid, VSL_tags[tag], type, len, data); + + if (ok) { + le->skip_cnt = 0; + logexp_next(le); + if (le->test == NULL) + /* End of test script */ + return (0); + } + if (skip) + le->skip_cnt++; + + le->vxid_last = vxid; + le->tag_last = tag; + } + } + + return (0); +} + +static void * +logexp_thread(void *priv) +{ + struct logexp *le; + int i; + + CAST_OBJ_NOTNULL(le, priv, LOGEXP_MAGIC); + AN(le->run); + AN(le->vsm); + AN(le->vslq); + + i = 0; + AZ(le->test); + logexp_next(le); + while (le->test) { + i = VSLQ_Dispatch(le->vslq, &logexp_dispatch, le); + if (i < 0) + vtc_log(le->vl, 0, "dispatch: %d", i); + if (i == 0 && le->test) + VTIM_sleep(0.01); + } + vtc_log(le->vl, 4, "end of test script"); + + return (NULL); +} + +static void +logexp_close(struct logexp *le) +{ + + CHECK_OBJ_NOTNULL(le, LOGEXP_MAGIC); + AN(le->vsm); + if (le->vslq) + VSLQ_Delete(&le->vslq); + AZ(le->vslq); + if (le->vsl) { + VSL_Delete(le->vsl); + le->vsl = NULL; + } + VSM_Close(le->vsm); +} + +static void +logexp_start(struct logexp *le) +{ + struct VSL_cursor *c; + + CHECK_OBJ_NOTNULL(le, LOGEXP_MAGIC); + AZ(le->vsl); + AZ(le->vslq); + + if (VSM_Open(le->vsm)) { + vtc_log(le->vl, 0, "VSM_Open: %s", VSM_Error(le->vsm)); + return; + } + le->vsl = VSL_New(); + AN(le->vsl); + c = VSL_CursorVSM(le->vsl, 
le->vsm, !le->d_arg); + if (c == NULL) { + vtc_log(le->vl, 0, "VSL_CursorVSM: %s", VSL_Error(le->vsl)); + logexp_close(le); + return; + } + le->vslq = VSLQ_New(le->vsl, &c, le->g_arg, le->query); + if (le->vslq == NULL) { + VSL_DeleteCursor(c); + vtc_log(le->vl, 0, "VSLQ_New: %s", VSL_Error(le->vsl)); + AZ(le->vslq); + logexp_close(le); + return; + } + AZ(c); + + le->test = NULL; + le->skip_cnt = 0; + le->vxid_last = le->tag_last = -1; + le->run = 1; + AZ(pthread_create(&le->tp, NULL, logexp_thread, le)); +} + +static void +logexp_wait(struct logexp *le) +{ + void *res; + + CHECK_OBJ_NOTNULL(le, LOGEXP_MAGIC); + vtc_log(le->vl, 2, "Waiting for logexp"); + AZ(pthread_join(le->tp, &res)); + logexp_close(le); + if (res != NULL && !vtc_stop) + vtc_log(le->vl, 0, "logexp returned \"%p\"", (char *)res); + le->run = 0; +} + +static void +cmd_logexp_expect(CMD_ARGS) +{ + struct logexp *le; + int skip_max; + int vxid; + int tag; + vre_t *vre; + const char *err; + int pos; + struct logexp_test *test; + char *end; + + (void)cmd; + CAST_OBJ_NOTNULL(le, priv, LOGEXP_MAGIC); + if (av[1] == NULL || av[2] == NULL || av[3] == NULL) { + vtc_log(vl, 0, "Syntax error"); + return; + } + + if (!strcmp(av[1], "*")) + skip_max = LE_ANY; + else { + skip_max = (int)strtol(av[1], &end, 10); + if (*end != '\0' || skip_max < 0) { + vtc_log(vl, 0, "Not a positive integer: '%s'", av[1]); + return; + } + } + if (!strcmp(av[2], "*")) + vxid = LE_ANY; + else if (!strcmp(av[2], "=")) + vxid = LE_LAST; + else { + vxid = (int)strtol(av[2], &end, 10); + if (*end != '\0' || vxid < 0) { + vtc_log(vl, 0, "Not a positive integer: '%s'", av[2]); + return; + } + } + if (!strcmp(av[3], "*")) + tag = LE_ANY; + else if (!strcmp(av[3], "=")) + tag = LE_LAST; + else { + tag = VSL_Name2Tag(av[3], strlen(av[3])); + if (tag < 0) { + vtc_log(vl, 0, "Unknown tag name: '%s'", av[3]); + return; + } + } + vre = NULL; + if (av[4]) { + vre = VRE_compile(av[4], 0, &err, &pos); + if (vre == NULL) { + vtc_log(vl, 0, "Regex 
error (%s): '%s' pos %d", + err, av[4], pos); + return; + } + } + + ALLOC_OBJ(test, LOGEXP_TEST_MAGIC); + AN(test); + test->str = VSB_new_auto(); + AN(test->str); + AZ(VSB_printf(test->str, "%s %s %s %s ", av[0], av[1], av[2], av[3])); + if (av[4]) + VSB_quote(test->str, av[4], -1, 0); + AZ(VSB_finish(test->str)); + test->skip_max = skip_max; + test->vxid = vxid; + test->tag = tag; + test->vre = vre; + VTAILQ_INSERT_TAIL(&le->tests, test, list); + vtc_log(vl, 4, "%s", VSB_data(test->str)); +} + +static const struct cmds logexp_cmds[] = { + { "expect", cmd_logexp_expect }, + { NULL, NULL }, +}; + +static void +logexp_spec(struct logexp *le, const char *spec) +{ + char *s; + + CHECK_OBJ_NOTNULL(le, LOGEXP_MAGIC); + + logexp_delete_tests(le); + + s = strdup(spec); + AN(s); + parse_string(s, logexp_cmds, le, le->vl); + free(s); +} + +void +cmd_logexp(CMD_ARGS) +{ + struct logexp *le, *le2; + const char tmpdir[] = "${tmpdir}"; + struct vsb *vsb, *vsb2; + + (void)priv; + (void)cmd; + (void)vl; + + if (av == NULL) { + /* Reset and free */ + VTAILQ_FOREACH_SAFE(le, &logexps, list, le2) { + CHECK_OBJ_NOTNULL(le, LOGEXP_MAGIC); + VTAILQ_REMOVE(&logexps, le, list); + if (le->run) { + (void)pthread_cancel(le->tp); + logexp_wait(le); + } + logexp_delete(le); + } + return; + } + + assert(!strcmp(av[0], "logexpect")); + av++; + + VTAILQ_FOREACH(le, &logexps, list) { + if (!strcmp(le->name, av[0])) + break; + } + if (le == NULL) + le = logexp_new(av[0]); + av++; + + for (; *av != NULL; av++) { + if (vtc_error) + break; + if (!strcmp(*av, "-wait")) { + if (!le->run) { + vtc_log(le->vl, 0, "logexp not -started '%s'", + *av); + return; + } + logexp_wait(le); + continue; + } + + /* + * We do an implict -wait if people muck about with a + * running logexp. 
+ */ + if (le->run) + logexp_wait(le); + AZ(le->run); + + if (!strcmp(*av, "-v")) { + if (av[1] == NULL) { + vtc_log(le->vl, 0, "Missing -v argument"); + return; + } + vsb = VSB_new_auto(); + AZ(VSB_printf(vsb, "%s/%s", tmpdir, av[1])); + AZ(VSB_finish(vsb)); + vsb2 = macro_expand(le->vl, VSB_data(vsb)); + VSB_delete(vsb); + if (vsb2 == NULL) + return; + if (VSM_n_Arg(le->vsm, VSB_data(vsb2)) <= 0) { + vtc_log(le->vl, 0, "-v argument error: %s", + VSM_Error(le->vsm)); + return; + } + VSB_delete(vsb2); + av++; + continue; + } + if (!strcmp(*av, "-d")) { + if (av[1] == NULL) { + vtc_log(le->vl, 0, "Missing -d argument"); + return; + } + le->d_arg = atoi(av[1]); + av++; + continue; + } + if (!strcmp(*av, "-g")) { + if (av[1] == NULL) { + vtc_log(le->vl, 0, "Missing -g argument"); + return; + } + le->g_arg = VSLQ_Name2Grouping(av[1], strlen(av[1])); + if (le->g_arg < 0) { + vtc_log(le->vl, 0, "Unknown grouping '%s'", + av[1]); + return; + } + av++; + continue; + } + if (!strcmp(*av, "-q")) { + if (av[1] == NULL) { + vtc_log(le->vl, 0, "Missing -q argument"); + return; + } + REPLACE(le->query, av[1]); + av++; + continue; + } + if (!strcmp(*av, "-start")) { + logexp_start(le); + continue; + } + if (!strcmp(*av, "-run")) { + logexp_start(le); + logexp_wait(le); + continue; + } + if (**av == '-') { + vtc_log(le->vl, 0, "Unknown logexp argument: %s", *av); + return; + } + logexp_spec(le, *av); + } +} From phk at varnish-cache.org Fri Jun 14 20:23:39 2013 From: phk at varnish-cache.org (Poul-Henning Kamp) Date: Fri, 14 Jun 2013 22:23:39 +0200 Subject: [master] c9d1478 Be a bit more helpful with the VCC error messages with the names which are restricted to valid C language names. Message-ID: commit c9d1478f0cce0891896bcaf03bdf67ae5bd81594 Author: Poul-Henning Kamp Date: Fri Jun 14 20:22:38 2013 +0000 Be a bit more helpful with the VCC error messages with the names which are restricted to valid C language names. 
Fixes #1310 diff --git a/bin/varnishtest/tests/m00010.vtc b/bin/varnishtest/tests/m00010.vtc index 63ba40d..c4c1b3d 100644 --- a/bin/varnishtest/tests/m00010.vtc +++ b/bin/varnishtest/tests/m00010.vtc @@ -22,6 +22,13 @@ server s4 { txresp -body "4444" } -start +varnish v1 -errvcl {Names of VCL objects cannot contain '-'} { + import directors from "${topbuild}/lib/libvmod_directors/.libs/libvmod_directors.so" ; + backend b1 { .host = "127.0.0.1"; .port = "8080";} + sub vcl_init { + new rr1-xx = directors.round_robin(); + } +} varnish v1 -vcl+backend { diff --git a/bin/varnishtest/tests/v00020.vtc b/bin/varnishtest/tests/v00020.vtc index 2977c85..2cf3f94 100644 --- a/bin/varnishtest/tests/v00020.vtc +++ b/bin/varnishtest/tests/v00020.vtc @@ -247,3 +247,31 @@ varnish v1 -vcl { } } } + +varnish v1 -errvcl {Names of VCL sub's cannot contain '-'} { + backend b { .host = "127.0.0.1"; } + sub foo-bar { + } + sub vcl_recv { + call foo-bar; + } +} + +varnish v1 -errvcl {VCL sub's named 'vcl*' are reserved names.} { + backend b { .host = "127.0.0.1"; } + sub vcl_bar { + } + sub vcl_recv { + call vcl_bar; + } +} + +varnish v1 -errvcl {Names of VCL acl's cannot contain '-'} { + backend b { .host = "127.0.0.1"; } + acl foo-bar { + } + sub vcl_recv { + if (client.ip ~ foo.bar) { + } + } +} diff --git a/lib/libvcl/vcc_acl.c b/lib/libvcl/vcc_acl.c index eb3bace..fa8a7d7 100644 --- a/lib/libvcl/vcc_acl.c +++ b/lib/libvcl/vcc_acl.c @@ -472,6 +472,12 @@ vcc_Acl(struct vcc *tl) VTAILQ_INIT(&tl->acl); ExpectErr(tl, ID); + if (!vcc_isCid(tl->t)) { + VSB_printf(tl->sb, + "Names of VCL acl's cannot contain '-'\n"); + vcc_ErrWhere(tl, tl->t); + return; + } an = tl->t; vcc_NextToken(tl); diff --git a/lib/libvcl/vcc_action.c b/lib/libvcl/vcc_action.c index f53248b..e6a3f6c 100644 --- a/lib/libvcl/vcc_action.c +++ b/lib/libvcl/vcc_action.c @@ -171,6 +171,12 @@ parse_new(struct vcc *tl) vcc_NextToken(tl); ExpectErr(tl, ID); + if (!vcc_isCid(tl->t)) { + VSB_printf(tl->sb, + "Names of VCL objects 
cannot contain '-'\n"); + vcc_ErrWhere(tl, tl->t); + return; + } sy1 = VCC_FindSymbol(tl, tl->t, SYM_NONE); XXXAZ(sy1); diff --git a/lib/libvcl/vcc_compile.c b/lib/libvcl/vcc_compile.c index b04236f..c34ffc4 100644 --- a/lib/libvcl/vcc_compile.c +++ b/lib/libvcl/vcc_compile.c @@ -133,6 +133,10 @@ IsMethod(const struct token *t) if (vcc_IdIs(t, m->name)) return (m - method_tab); } + if ((t->b[0] == 'v'|| t->b[0] == 'V') && + (t->b[1] == 'c'|| t->b[1] == 'C') && + (t->b[2] == 'l'|| t->b[2] == 'L')) + return (-2); return (-1); } diff --git a/lib/libvcl/vcc_compile.h b/lib/libvcl/vcc_compile.h index fb3a969..64b5995 100644 --- a/lib/libvcl/vcc_compile.h +++ b/lib/libvcl/vcc_compile.h @@ -49,10 +49,12 @@ #endif struct vsb; +struct token; #define isident1(c) (isalpha(c)) #define isident(c) (isalpha(c) || isdigit(c) || (c) == '_' || (c) == '-') #define isvar(c) (isident(c) || (c) == '.') +int vcc_isCid(const struct token *t); unsigned vcl_fixed_token(const char *p, const char **q); extern const char * const vcl_tnames[256]; void vcl_output_lang_h(struct vsb *sb); diff --git a/lib/libvcl/vcc_parse.c b/lib/libvcl/vcc_parse.c index ff70d64..e334846 100644 --- a/lib/libvcl/vcc_parse.c +++ b/lib/libvcl/vcc_parse.c @@ -212,9 +212,23 @@ vcc_Function(struct vcc *tl) vcc_NextToken(tl); ExpectErr(tl, ID); + if (!vcc_isCid(tl->t)) { + VSB_printf(tl->sb, + "Names of VCL sub's cannot contain '-'\n"); + vcc_ErrWhere(tl, tl->t); + return; + } m = IsMethod(tl->t); - if (m != -1) { + if (m == -2) { + VSB_printf(tl->sb, + "VCL sub's named 'vcl*' are reserved names.\n"); + vcc_ErrWhere(tl, tl->t); + VSB_printf(tl->sb, "Valid vcl_* methods are:\n"); + for (i = 0; method_tab[i].name != NULL; i++) + VSB_printf(tl->sb, "\t%s\n", method_tab[i].name); + return; + } else if (m != -1) { assert(m < VCL_MET_MAX); tl->fb = tl->fm[m]; if (tl->mprocs[m] == NULL) { diff --git a/lib/libvcl/vcc_token.c b/lib/libvcl/vcc_token.c index 577b784..6201fea 100644 --- a/lib/libvcl/vcc_token.c +++ 
b/lib/libvcl/vcc_token.c @@ -293,7 +293,7 @@ vcc_IdIs(const struct token *t, const char *p) * Check that we have a C-identifier */ -static int +int vcc_isCid(const struct token *t) { const char *q; From phk at varnish-cache.org Sun Jun 16 09:58:48 2013 From: phk at varnish-cache.org (Poul-Henning Kamp) Date: Sun, 16 Jun 2013 11:58:48 +0200 Subject: [master] 0b7c06f Start injecting the fetch state-engine into the fetch-code Message-ID: commit 0b7c06fb13620ccc974ed6cadcc8d60815115453 Author: Poul-Henning Kamp Date: Sun Jun 16 09:58:14 2013 +0000 Start injecting the fetch state-engine into the fetch-code diff --git a/bin/varnishd/cache/cache.h b/bin/varnishd/cache/cache.h index 77ef0d5..53a397f 100644 --- a/bin/varnishd/cache/cache.h +++ b/bin/varnishd/cache/cache.h @@ -153,6 +153,12 @@ enum req_step { #undef REQ_STEP }; +enum fetch_step { +#define FETCH_STEP(l, U, arg) F_STP_##U, +#include "tbl/steps.h" +#undef FETCH_STEP +}; + /*--------------------------------------------------------------------*/ struct lock { void *priv; }; // Opaque @@ -495,6 +501,7 @@ struct busyobj { #define BUSYOBJ_MAGIC 0x23b95567 struct lock mtx; char *end; + enum fetch_step step; /* * All fields from refcount and down are zeroed when the busyobj diff --git a/bin/varnishd/cache/cache_fetch.c b/bin/varnishd/cache/cache_fetch.c index 9acea11..ae74bd9 100644 --- a/bin/varnishd/cache/cache_fetch.c +++ b/bin/varnishd/cache/cache_fetch.c @@ -771,21 +771,15 @@ vbf_proc_resp(struct worker *wrk, struct busyobj *bo) } -struct vbf_secret_handshake { - unsigned magic; -#define VBF_SECRET_HANDSHAKE_MAGIC 0x98c95172 - struct busyobj *bo; - struct req **reqp; -}; +/*-------------------------------------------------------------------- + */ -static void -vbf_fetch_thread(struct worker *wrk, void *priv) +static enum fetch_step +vbf_stp_fetch(struct worker *wrk, struct busyobj *bo, struct req **reqp) { - struct vbf_secret_handshake *vsh; - struct busyobj *bo; - struct req *req; int i; struct http *hp, *hp2; + 
struct req *req; char *b; uint16_t nhttp; unsigned l; @@ -795,13 +789,10 @@ vbf_fetch_thread(struct worker *wrk, void *priv) CHECK_OBJ_NOTNULL(wrk, WORKER_MAGIC); - CAST_OBJ_NOTNULL(vsh, priv, VBF_SECRET_HANDSHAKE_MAGIC); - AN(vsh->reqp); - req = *vsh->reqp; + AN(reqp); + req = *reqp; CHECK_OBJ_NOTNULL(req, REQ_MAGIC); - - bo = vsh->bo; - THR_SetBusyobj(bo); + CHECK_OBJ_NOTNULL(bo, BUSYOBJ_MAGIC); vbf_make_bereq(wrk, req, bo); xxxassert (wrk->handling == VCL_RET_FETCH); @@ -810,10 +801,8 @@ vbf_fetch_thread(struct worker *wrk, void *priv) if (!bo->do_pass) { AN(req); - AN(vsh); req = NULL; - *vsh->reqp = NULL; - vsh = NULL; + *reqp = NULL; } i = vbf_fetch_hdr(wrk, bo, req); @@ -829,13 +818,10 @@ vbf_fetch_thread(struct worker *wrk, void *priv) if (bo->do_pass) { AN(req); - AN(vsh); req = NULL; - *vsh->reqp = NULL; - vsh = NULL; + *reqp = NULL; } AZ(req); - AZ(vsh); if (i) { wrk->handling = VCL_RET_ERROR; @@ -865,8 +851,7 @@ vbf_fetch_thread(struct worker *wrk, void *priv) case VCL_RET_ERROR: bo->state = BOS_FAILED; VBO_DerefBusyObj(wrk, &bo); // XXX ? - THR_SetBusyobj(NULL); - return; + return (F_STP_DONE); case VCL_RET_RESTART: INCOMPL(); default: @@ -897,10 +882,7 @@ vbf_fetch_thread(struct worker *wrk, void *priv) AZ(HSH_Deref(&wrk->stats, bo->fetch_objcore, NULL)); bo->fetch_objcore = NULL; VDI_CloseFd(&bo->vbc); - bo->state = BOS_FAILED; -VSL_Flush(bo->vsl, 0); - THR_SetBusyobj(NULL); - return; + return (F_STP_ABANDON); } else /* No vary */ AZ(vary); @@ -934,10 +916,7 @@ VSL_Flush(bo->vsl, 0); AZ(HSH_Deref(&wrk->stats, bo->fetch_objcore, NULL)); bo->fetch_objcore = NULL; VDI_CloseFd(&bo->vbc); - bo->state = BOS_FAILED; - VBO_DerefBusyObj(wrk, &bo); // XXX ? - THR_SetBusyobj(NULL); - return; + return (F_STP_ABANDON); } CHECK_OBJ_NOTNULL(obj, OBJECT_MAGIC); @@ -998,17 +977,83 @@ VSL_Flush(bo->vsl, 0); if (bo->state == BOS_FAILED) { /* handle early failures */ - VBO_DerefBusyObj(wrk, &bo); // XXX ? 
(void)HSH_Deref(&wrk->stats, NULL, &obj); - THR_SetBusyobj(NULL); - return; + return (F_STP_ABANDON); } VBO_DerefBusyObj(wrk, &bo); // XXX ? + return (F_STP_DONE); +} + +/*-------------------------------------------------------------------- + */ + +static enum fetch_step +vbf_stp_abandon(struct worker *wrk, struct busyobj *bo) +{ + CHECK_OBJ_NOTNULL(wrk, WORKER_MAGIC); + CHECK_OBJ_NOTNULL(bo, BUSYOBJ_MAGIC); + + bo->state = BOS_FAILED; + VBO_DerefBusyObj(wrk, &bo); // XXX ? + return (F_STP_DONE); +} + +/*-------------------------------------------------------------------- + */ + +static enum fetch_step +vbf_stp_done(void) +{ + WRONG("Just plain wrong"); +} + +/*-------------------------------------------------------------------- + */ + +struct vbf_secret_handshake { + unsigned magic; +#define VBF_SECRET_HANDSHAKE_MAGIC 0x98c95172 + struct busyobj *bo; + struct req **reqp; +}; + +static void +vbf_fetch_thread(struct worker *wrk, void *priv) +{ + struct vbf_secret_handshake *vsh; + struct busyobj *bo; + struct req **reqp; + + CHECK_OBJ_NOTNULL(wrk, WORKER_MAGIC); + CAST_OBJ_NOTNULL(vsh, priv, VBF_SECRET_HANDSHAKE_MAGIC); + AN(vsh->reqp); + reqp = vsh->reqp; + CHECK_OBJ_NOTNULL((*vsh->reqp), REQ_MAGIC); + + bo = vsh->bo; + THR_SetBusyobj(bo); + bo->step = F_STP_FETCH; + + while (bo->step != F_STP_DONE) { + switch(bo->step) { +#define FETCH_STEP(l, U, arg) \ + case F_STP_##U: \ + bo->step = vbf_stp_##l arg; \ + break; +#include "tbl/steps.h" +#undef FETCH_STEP + default: + WRONG("Illegal fetch_step"); + } + } assert(WRW_IsReleased(wrk)); THR_SetBusyobj(NULL); } +/*-------------------------------------------------------------------- + */ + void VBF_Fetch(struct worker *wrk, struct req *req) { diff --git a/include/tbl/steps.h b/include/tbl/steps.h index 6c3f3f6..32af092 100644 --- a/include/tbl/steps.h +++ b/include/tbl/steps.h @@ -48,4 +48,11 @@ REQ_STEP(prepresp, PREPRESP, (wrk, req)) REQ_STEP(deliver, DELIVER, (wrk, req)) REQ_STEP(error, ERROR, (wrk, req)) #endif + 
+#ifdef FETCH_STEP +FETCH_STEP(fetch, FETCH, (wrk, bo, reqp)) +FETCH_STEP(abandon, ABANDON, (wrk, bo)) +FETCH_STEP(done, DONE, ()) +#endif + /*lint -restore */ From phk at varnish-cache.org Sun Jun 16 13:46:06 2013 From: phk at varnish-cache.org (Poul-Henning Kamp) Date: Sun, 16 Jun 2013 15:46:06 +0200 Subject: [master] bd40f50 Style nit Message-ID: commit bd40f50d9d4768c194564d3f14f93d2cf7016652 Author: Poul-Henning Kamp Date: Sun Jun 16 12:38:13 2013 +0000 Style nit diff --git a/bin/varnishd/cache/cache_rfc2616.c b/bin/varnishd/cache/cache_rfc2616.c index d0d0d87..5b887a2 100644 --- a/bin/varnishd/cache/cache_rfc2616.c +++ b/bin/varnishd/cache/cache_rfc2616.c @@ -240,8 +240,7 @@ RFC2616_Body(struct busyobj *bo, struct dstat *stats) return (BS_ERROR); } - if (http_GetHdr(hp, H_Content_Length, - &bo->h_content_length)) { + if (http_GetHdr(hp, H_Content_Length, &bo->h_content_length)) { stats->fetch_length++; return (BS_LENGTH); } From phk at varnish-cache.org Sun Jun 16 13:46:06 2013 From: phk at varnish-cache.org (Poul-Henning Kamp) Date: Sun, 16 Jun 2013 15:46:06 +0200 Subject: [master] cfbc169 Don't attempt to fetch nothing Message-ID: commit cfbc1691c5d5fe6e8efdfa25552f90bf1fb82a9c Author: Poul-Henning Kamp Date: Sun Jun 16 12:44:18 2013 +0000 Don't attempt to fetch nothing diff --git a/bin/varnishd/cache/cache_fetch.c b/bin/varnishd/cache/cache_fetch.c index ae74bd9..b65c60a 100644 --- a/bin/varnishd/cache/cache_fetch.c +++ b/bin/varnishd/cache/cache_fetch.c @@ -524,8 +524,8 @@ vbf_fetch_body(struct worker *wrk, void *priv) case BS_LENGTH: cl = vbf_fetch_number(bo->h_content_length, 10); - bo->vfp->begin(bo, cl > 0 ? 
cl : 0); - if (bo->state == BOS_FETCHING) + bo->vfp->begin(bo, cl); + if (bo->state == BOS_FETCHING && cl > 0) cls = vbf_fetch_straight(bo, htc, cl); mklen = 1; if (bo->vfp->end(bo)) From phk at varnish-cache.org Sun Jun 16 13:46:06 2013 From: phk at varnish-cache.org (Poul-Henning Kamp) Date: Sun, 16 Jun 2013 15:46:06 +0200 Subject: [master] d124132 Synthesize a 503 when we fail to fetch, and pass it on. Message-ID: commit d1241323f95e45a30d93b7dde13bcb4d2e5d1246 Author: Poul-Henning Kamp Date: Sun Jun 16 13:42:31 2013 +0000 Synthesize a 503 when we fail to fetch, and pass it on. diff --git a/bin/varnishd/cache/cache_fetch.c b/bin/varnishd/cache/cache_fetch.c index b65c60a..4ef7991 100644 --- a/bin/varnishd/cache/cache_fetch.c +++ b/bin/varnishd/cache/cache_fetch.c @@ -486,7 +486,8 @@ vbf_fetch_body(struct worker *wrk, void *priv) CHECK_OBJ_NOTNULL(wrk, WORKER_MAGIC); CAST_OBJ_NOTNULL(bo, priv, BUSYOBJ_MAGIC); - CHECK_OBJ_NOTNULL(bo->vbc, VBC_MAGIC); + htc = &bo->htc; + CHECK_OBJ_ORNULL(bo->vbc, VBC_MAGIC); obj = bo->fetch_obj; CHECK_OBJ_NOTNULL(obj, OBJECT_MAGIC); CHECK_OBJ_NOTNULL(obj->http, HTTP_MAGIC); @@ -500,7 +501,6 @@ vbf_fetch_body(struct worker *wrk, void *priv) AZ(bo->stats); bo->stats = &wrk->stats; - htc = &bo->htc; if (bo->vfp == NULL) bo->vfp = &vfp_nop; @@ -513,7 +513,7 @@ vbf_fetch_body(struct worker *wrk, void *priv) /* XXX: pick up estimate from objdr ? */ cl = 0; - cls = 0; + cls = bo->should_close; switch (htc->body_status) { case BS_NONE: mklen = 0; @@ -526,7 +526,7 @@ vbf_fetch_body(struct worker *wrk, void *priv) bo->vfp->begin(bo, cl); if (bo->state == BOS_FETCHING && cl > 0) - cls = vbf_fetch_straight(bo, htc, cl); + cls |= vbf_fetch_straight(bo, htc, cl); mklen = 1; if (bo->vfp->end(bo)) assert(bo->state == BOS_FAILED); @@ -534,7 +534,7 @@ vbf_fetch_body(struct worker *wrk, void *priv) case BS_CHUNKED: bo->vfp->begin(bo, cl > 0 ? 
cl : 0); if (bo->state == BOS_FETCHING) - cls = vbf_fetch_chunked(bo, htc); + cls |= vbf_fetch_chunked(bo, htc); mklen = 1; if (bo->vfp->end(bo)) assert(bo->state == BOS_FAILED); @@ -549,11 +549,10 @@ vbf_fetch_body(struct worker *wrk, void *priv) assert(bo->state == BOS_FAILED); break; case BS_ERROR: - cls = VBF_Error(bo, "error incompatible Transfer-Encoding"); + cls |= VBF_Error(bo, "error incompatible Transfer-Encoding"); mklen = 0; break; default: - cls = 0; mklen = 0; INCOMPL(); } @@ -574,18 +573,21 @@ vbf_fetch_body(struct worker *wrk, void *priv) http_Teardown(bo->bereq); http_Teardown(bo->beresp); + if (bo->vbc != NULL) { + if (cls) + VDI_CloseFd(&bo->vbc); + else + VDI_RecycleFd(&bo->vbc); + } + if (bo->state == BOS_FAILED) { wrk->stats.fetch_failed++; - VDI_CloseFd(&bo->vbc); obj->len = 0; EXP_Clr(&obj->exp); EXP_Rearm(obj); } else { assert(bo->state == BOS_FETCHING); - if (cls == 0 && bo->should_close) - cls = 1; - VSLb(bo->vsl, SLT_Length, "%zd", obj->len); { @@ -609,12 +611,6 @@ vbf_fetch_body(struct worker *wrk, void *priv) "Content-Length: %zd", obj->len); } - if (cls) - VDI_CloseFd(&bo->vbc); - else - VDI_RecycleFd(&bo->vbc); - - /* XXX: Atomic assignment, needs volatile/membar ? 
*/ bo->state = BOS_FINISHED; } @@ -627,15 +623,14 @@ vbf_fetch_body(struct worker *wrk, void *priv) * Copy req->bereq and run it by VCL::vcl_backend_fetch{} */ -static void -vbf_make_bereq(struct worker *wrk, const struct req *req, struct busyobj *bo) +static enum fetch_step +vbf_stp_mkbereq(struct worker *wrk, struct busyobj *bo, const struct req *req) { int i; CHECK_OBJ_NOTNULL(wrk, WORKER_MAGIC); CHECK_OBJ_NOTNULL(req, REQ_MAGIC); CHECK_OBJ_NOTNULL(bo, BUSYOBJ_MAGIC); - CHECK_OBJ_NOTNULL(bo->fetch_objcore, OBJCORE_MAGIC); AN(bo->director); AZ(bo->vbc); @@ -666,18 +661,54 @@ vbf_make_bereq(struct worker *wrk, const struct req *req, struct busyobj *bo) http_PrintfHeader(bo->bereq, "X-Varnish: %u", bo->vsl->wid & VSL_IDENTMASK); + return (F_STP_FETCHHDR); } - /*-------------------------------------------------------------------- */ -static void -vbf_proc_resp(struct worker *wrk, struct busyobj *bo) +static enum fetch_step +vbf_stp_fetchhdr(struct worker *wrk, struct busyobj *bo, struct req **reqp) { int i; + CHECK_OBJ_NOTNULL(wrk, WORKER_MAGIC); + AN(reqp); + CHECK_OBJ_NOTNULL((*reqp), REQ_MAGIC); CHECK_OBJ_NOTNULL(bo, BUSYOBJ_MAGIC); + xxxassert (wrk->handling == VCL_RET_FETCH); + + HTTP_Setup(bo->beresp, bo->ws, bo->vsl, HTTP_Beresp); + + if (!bo->do_pass) + *reqp = NULL; + + i = vbf_fetch_hdr(wrk, bo, *reqp); + /* + * If we recycle a backend connection, there is a finite chance + * that the backend closed it before we get a request to it. + * Do a single retry in that case. 
+ */ + if (i == 1) { + VSC_C_main->backend_retry++; + i = vbf_fetch_hdr(wrk, bo, *reqp); + } + + if (bo->do_pass) + *reqp = NULL; + + if (i) { + AZ(bo->vbc); + bo->err_code = 503; + http_SetH(bo->beresp, HTTP_HDR_PROTO, "HTTP/1.1"); + http_SetResp(bo->beresp, + "HTTP/1.1", 503, "Backend fetch failed"); + http_SetHeader(bo->beresp, "Content-Length: 0"); + http_SetHeader(bo->beresp, "Connection: close"); + } else { + AN(bo->vbc); + } + /* * These two headers can be spread over multiple actual headers * and we rely on their content outside of VCL, so collect them @@ -727,6 +758,7 @@ vbf_proc_resp(struct worker *wrk, struct busyobj *bo) * no Content-Encoding --> object is not gzip'ed. * anything else --> do nothing wrt gzip * + * XXX: BS_NONE/cl==0 should avoid gzip/gunzip */ /* We do nothing unless the param is set */ @@ -769,17 +801,18 @@ vbf_proc_resp(struct worker *wrk, struct busyobj *bo) else if (bo->is_gzip) bo->vfp = &vfp_testgzip; + if (wrk->handling != VCL_RET_DELIVER) + return (F_STP_NOTYET); + return (F_STP_FETCH); } /*-------------------------------------------------------------------- */ static enum fetch_step -vbf_stp_fetch(struct worker *wrk, struct busyobj *bo, struct req **reqp) +vbf_stp_fetch(struct worker *wrk, struct busyobj *bo) { - int i; struct http *hp, *hp2; - struct req *req; char *b; uint16_t nhttp; unsigned l; @@ -789,48 +822,10 @@ vbf_stp_fetch(struct worker *wrk, struct busyobj *bo, struct req **reqp) CHECK_OBJ_NOTNULL(wrk, WORKER_MAGIC); - AN(reqp); - req = *reqp; - CHECK_OBJ_NOTNULL(req, REQ_MAGIC); CHECK_OBJ_NOTNULL(bo, BUSYOBJ_MAGIC); - vbf_make_bereq(wrk, req, bo); - xxxassert (wrk->handling == VCL_RET_FETCH); - - HTTP_Setup(bo->beresp, bo->ws, bo->vsl, HTTP_Beresp); - - if (!bo->do_pass) { - AN(req); - req = NULL; - *reqp = NULL; - } - - i = vbf_fetch_hdr(wrk, bo, req); - /* - * If we recycle a backend connection, there is a finite chance - * that the backend closed it before we get a request to it. 
- * Do a single retry in that case. - */ - if (i == 1) { - VSC_C_main->backend_retry++; - i = vbf_fetch_hdr(wrk, bo, req); - } - - if (bo->do_pass) { - AN(req); - req = NULL; - *reqp = NULL; - } - AZ(req); - - if (i) { - wrk->handling = VCL_RET_ERROR; - bo->err_code = 503; - } else { - vbf_proc_resp(wrk, bo); - if (wrk->handling != VCL_RET_DELIVER) - VDI_CloseFd(&bo->vbc); - } + if (wrk->handling != VCL_RET_DELIVER) + VDI_CloseFd(&bo->vbc); if (wrk->handling != VCL_RET_DELIVER) { /* Clean up partial fetch */ @@ -1003,6 +998,12 @@ vbf_stp_abandon(struct worker *wrk, struct busyobj *bo) */ static enum fetch_step +vbf_stp_notyet(void) +{ + WRONG("Patience, grashopper, patience..."); +} + +static enum fetch_step vbf_stp_done(void) { WRONG("Just plain wrong"); @@ -1033,7 +1034,7 @@ vbf_fetch_thread(struct worker *wrk, void *priv) bo = vsh->bo; THR_SetBusyobj(bo); - bo->step = F_STP_FETCH; + bo->step = F_STP_MKBEREQ; while (bo->step != F_STP_DONE) { switch(bo->step) { diff --git a/bin/varnishtest/tests/b00015.vtc b/bin/varnishtest/tests/b00015.vtc index ff39f42..3626284 100644 --- a/bin/varnishtest/tests/b00015.vtc +++ b/bin/varnishtest/tests/b00015.vtc @@ -15,36 +15,51 @@ client c1 { expect resp.http.X-varnish == "1001" } -run +delay .1 + client c1 { txreq -url "/" rxresp expect resp.status == 503 - expect resp.http.X-varnish == "1005" + expect resp.http.X-varnish == "1004" } -run -# Then check that an cacheable error from the backend is +delay .1 + +# Then check that a cacheable error from the backend is + +varnish v1 -cliok "ban req.url ~ .*" server s1 { rxreq txresp -status 302 } -start -varnish v1 -vcl+backend { } +varnish v1 -vcl+backend { + sub vcl_backend_response { + set beresp.http.ttl = beresp.ttl; + set beresp.http.uncacheable = beresp.uncacheable; + } +} client c1 { txreq -url "/" rxresp expect resp.status == 302 - expect resp.http.X-varnish == "1009" + expect resp.http.X-varnish == "1007" } -run +delay .1 + client c1 { txreq -url "/" rxresp expect 
resp.status == 302 - expect resp.http.X-varnish == "1012 1010" + expect resp.http.X-varnish == "1010 1008" } -run +delay .1 + # Then check that a non-cacheable error from the backend can be server s1 { @@ -64,12 +79,16 @@ client c1 { txreq -url "/2" rxresp expect resp.status == 502 - expect resp.http.X-varnish == "1014" + expect resp.http.X-varnish == "1012" } -run +delay .1 + client c1 { txreq -url "/2" rxresp expect resp.status == 502 - expect resp.http.X-varnish == "1017 1015" + expect resp.http.X-varnish == "1015 1013" } -run + +delay .1 diff --git a/bin/varnishtest/tests/c00024.vtc b/bin/varnishtest/tests/c00024.vtc index 4b1bdac..31ebc8b 100644 --- a/bin/varnishtest/tests/c00024.vtc +++ b/bin/varnishtest/tests/c00024.vtc @@ -5,21 +5,10 @@ server s1 { txresp } -start -server s2 { -} -start - -varnish v1 -vcl { - backend bad { - .host = "${s2_addr}"; - .port = "${s2_port}"; - } - backend good { - .host = "${s1_addr}"; - .port = "${s1_port}"; - } +varnish v1 -vcl+backend { sub vcl_recv { - if (req.restarts > 0) { - set req.backend = good; + if (req.restarts == 0) { + return (error(701, "FOO")); } } sub vcl_error { diff --git a/include/tbl/steps.h b/include/tbl/steps.h index 32af092..5f9b3b2 100644 --- a/include/tbl/steps.h +++ b/include/tbl/steps.h @@ -50,8 +50,11 @@ REQ_STEP(error, ERROR, (wrk, req)) #endif #ifdef FETCH_STEP -FETCH_STEP(fetch, FETCH, (wrk, bo, reqp)) +FETCH_STEP(mkbereq, MKBEREQ, (wrk, bo, *reqp)) +FETCH_STEP(fetchhdr, FETCHHDR, (wrk, bo, reqp)) +FETCH_STEP(fetch, FETCH, (wrk, bo)) FETCH_STEP(abandon, ABANDON, (wrk, bo)) +FETCH_STEP(notyet, NOTYET, ()) FETCH_STEP(done, DONE, ()) #endif From phk at varnish-cache.org Mon Jun 17 09:33:16 2013 From: phk at varnish-cache.org (Poul-Henning Kamp) Date: Mon, 17 Jun 2013 11:33:16 +0200 Subject: [master] 1252ddf Minor polishing Message-ID: commit 1252ddf2bdcdf18f573a19bb1555e1ed835f5895 Author: Poul-Henning Kamp Date: Mon Jun 17 09:33:07 2013 +0000 Minor polishing diff --git 
a/bin/varnishd/cache/cache_fetch.c b/bin/varnishd/cache/cache_fetch.c index 4ef7991..c3a58b6 100644 --- a/bin/varnishd/cache/cache_fetch.c +++ b/bin/varnishd/cache/cache_fetch.c @@ -474,7 +474,7 @@ vbf_fetch_hdr(struct worker *wrk, struct busyobj *bo, struct req *req) */ static void -vbf_fetch_body(struct worker *wrk, void *priv) +vbf_fetch_body(struct worker *wrk, struct busyobj *bo) { int cls; struct storage *st; @@ -482,10 +482,9 @@ vbf_fetch_body(struct worker *wrk, void *priv) ssize_t cl; struct http_conn *htc; struct object *obj; - struct busyobj *bo; CHECK_OBJ_NOTNULL(wrk, WORKER_MAGIC); - CAST_OBJ_NOTNULL(bo, priv, BUSYOBJ_MAGIC); + CHECK_OBJ_NOTNULL(bo, BUSYOBJ_MAGIC); htc = &bo->htc; CHECK_OBJ_ORNULL(bo->vbc, VBC_MAGIC); obj = bo->fetch_obj; @@ -661,6 +660,7 @@ vbf_stp_mkbereq(struct worker *wrk, struct busyobj *bo, const struct req *req) http_PrintfHeader(bo->bereq, "X-Varnish: %u", bo->vsl->wid & VSL_IDENTMASK); + /* XXX: Missing ABANDON */ return (F_STP_FETCHHDR); } /*-------------------------------------------------------------------- @@ -801,9 +801,10 @@ vbf_stp_fetchhdr(struct worker *wrk, struct busyobj *bo, struct req **reqp) else if (bo->is_gzip) bo->vfp = &vfp_testgzip; - if (wrk->handling != VCL_RET_DELIVER) - return (F_STP_NOTYET); - return (F_STP_FETCH); + if (wrk->handling == VCL_RET_DELIVER) + return (F_STP_FETCH); + + return (F_STP_NOTYET); } /*-------------------------------------------------------------------- @@ -824,6 +825,8 @@ vbf_stp_fetch(struct worker *wrk, struct busyobj *bo) CHECK_OBJ_NOTNULL(wrk, WORKER_MAGIC); CHECK_OBJ_NOTNULL(bo, BUSYOBJ_MAGIC); + assert(wrk->handling == VCL_RET_DELIVER); +#if 0 if (wrk->handling != VCL_RET_DELIVER) VDI_CloseFd(&bo->vbc); @@ -853,6 +856,7 @@ vbf_stp_fetch(struct worker *wrk, struct busyobj *bo) WRONG("Illegal action in vcl_fetch{}"); } } +#endif if (bo->fetch_objcore->objhead == NULL) AN(bo->do_pass); From phk at varnish-cache.org Mon Jun 17 09:59:31 2013 From: phk at varnish-cache.org 
(Poul-Henning Kamp) Date: Mon, 17 Jun 2013 11:59:31 +0200 Subject: [master] 599f2e3 Split the HTTP1 specific parts of the backend fetch code into a separate source file. Message-ID: commit 599f2e3f9fcc3f89883680a9e278b4dc124c5584 Author: Poul-Henning Kamp Date: Mon Jun 17 09:59:07 2013 +0000 Split the HTTP1 specific parts of the backend fetch code into a separate source file. diff --git a/bin/varnishd/Makefile.am b/bin/varnishd/Makefile.am index de6b28a..0c9fa5d 100644 --- a/bin/varnishd/Makefile.am +++ b/bin/varnishd/Makefile.am @@ -27,6 +27,7 @@ varnishd_SOURCES = \ cache/cache_gzip.c \ cache/cache_hash.c \ cache/cache_http.c \ + cache/cache_http1_fetch.c \ cache/cache_http1_fsm.c \ cache/cache_http1_proto.c \ cache/cache_lck.c \ diff --git a/bin/varnishd/cache/cache.h b/bin/varnishd/cache/cache.h index 53a397f..644d15c 100644 --- a/bin/varnishd/cache/cache.h +++ b/bin/varnishd/cache/cache.h @@ -784,6 +784,10 @@ void VBO_DerefBusyObj(struct worker *wrk, struct busyobj **busyobj); void VBO_Free(struct busyobj **vbo); void VBO_extend(const struct busyobj *, ssize_t); +/* cache_http1_fetch.c [V1F] */ +int V1F_fetch_hdr(struct worker *wrk, struct busyobj *bo, struct req *req); +void V1F_fetch_body(struct worker *wrk, struct busyobj *bo); + /* cache_http1_fsm.c [HTTP1] */ typedef int (req_body_iter_f)(struct req *, void *priv, void *ptr, size_t); void HTTP1_Session(struct worker *, struct req *); diff --git a/bin/varnishd/cache/cache_fetch.c b/bin/varnishd/cache/cache_fetch.c index c3a58b6..431406e 100644 --- a/bin/varnishd/cache/cache_fetch.c +++ b/bin/varnishd/cache/cache_fetch.c @@ -41,8 +41,6 @@ #include "cache_backend.h" #include "vcli_priv.h" #include "vcl.h" -#include "vct.h" -#include "vtcp.h" #include "vtim.h" static unsigned fetchfrag; @@ -211,414 +209,6 @@ VBF_GetStorage(struct busyobj *bo, ssize_t sz) } /*-------------------------------------------------------------------- - * Convert a string to a size_t safely - */ - -static ssize_t 
-vbf_fetch_number(const char *nbr, int radix) -{ - uintmax_t cll; - ssize_t cl; - char *q; - - if (*nbr == '\0') - return (-1); - cll = strtoumax(nbr, &q, radix); - if (q == NULL || *q != '\0') - return (-1); - - cl = (ssize_t)cll; - if((uintmax_t)cl != cll) /* Protect against bogusly large values */ - return (-1); - return (cl); -} - -/*--------------------------------------------------------------------*/ - -static int -vbf_fetch_straight(struct busyobj *bo, struct http_conn *htc, ssize_t cl) -{ - int i; - - assert(htc->body_status == BS_LENGTH); - - if (cl < 0) { - return (VBF_Error(bo, "straight length field bogus")); - } else if (cl == 0) - return (0); - - i = bo->vfp->bytes(bo, htc, cl); - if (i <= 0) - return (VBF_Error(bo, "straight insufficient bytes")); - return (0); -} - -/*-------------------------------------------------------------------- - * Read a chunked HTTP object. - * - * XXX: Reading one byte at a time is pretty pessimal. - */ - -static int -vbf_fetch_chunked(struct busyobj *bo, struct http_conn *htc) -{ - int i; - char buf[20]; /* XXX: 20 is arbitrary */ - unsigned u; - ssize_t cl; - - assert(htc->body_status == BS_CHUNKED); - do { - /* Skip leading whitespace */ - do { - if (HTTP1_Read(htc, buf, 1) <= 0) - return (VBF_Error(bo, "chunked read err")); - } while (vct_islws(buf[0])); - - if (!vct_ishex(buf[0])) - return (VBF_Error(bo, "chunked header non-hex")); - - /* Collect hex digits, skipping leading zeros */ - for (u = 1; u < sizeof buf; u++) { - do { - if (HTTP1_Read(htc, buf + u, 1) <= 0) - return (VBF_Error(bo, - "chunked read err")); - } while (u == 1 && buf[0] == '0' && buf[u] == '0'); - if (!vct_ishex(buf[u])) - break; - } - - if (u >= sizeof buf) - return (VBF_Error(bo,"chunked header too long")); - - /* Skip trailing white space */ - while(vct_islws(buf[u]) && buf[u] != '\n') - if (HTTP1_Read(htc, buf + u, 1) <= 0) - return (VBF_Error(bo, "chunked read err")); - - if (buf[u] != '\n') - return (VBF_Error(bo,"chunked header no NL")); 
- - buf[u] = '\0'; - cl = vbf_fetch_number(buf, 16); - if (cl < 0) - return (VBF_Error(bo,"chunked header number syntax")); - - if (cl > 0 && bo->vfp->bytes(bo, htc, cl) <= 0) - return (VBF_Error(bo, "chunked read err")); - - i = HTTP1_Read(htc, buf, 1); - if (i <= 0) - return (VBF_Error(bo, "chunked read err")); - if (buf[0] == '\r' && HTTP1_Read( htc, buf, 1) <= 0) - return (VBF_Error(bo, "chunked read err")); - if (buf[0] != '\n') - return (VBF_Error(bo,"chunked tail no NL")); - } while (cl > 0); - return (0); -} - -/*--------------------------------------------------------------------*/ - -static void -vbf_fetch_eof(struct busyobj *bo, struct http_conn *htc) -{ - - assert(htc->body_status == BS_EOF); - if (bo->vfp->bytes(bo, htc, SSIZE_MAX) < 0) - (void)VBF_Error(bo,"eof socket fail"); -} - -/*-------------------------------------------------------------------- - * Pass the request body to the backend - */ - -static int __match_proto__(req_body_iter_f) -vbf_iter_req_body(struct req *req, void *priv, void *ptr, size_t l) -{ - struct worker *wrk; - - CHECK_OBJ_NOTNULL(req, REQ_MAGIC); - CAST_OBJ_NOTNULL(wrk, priv, WORKER_MAGIC); - - if (l > 0) { - (void)WRW_Write(wrk, ptr, l); - if (WRW_Flush(wrk)) - return (-1); - } - return (0); -} - -/*-------------------------------------------------------------------- - * Send request, and receive the HTTP protocol response, but not the - * response body. - * - * Return value: - * -1 failure, not retryable - * 0 success - * 1 failure which can be retried. 
- */ - -static int -vbf_fetch_hdr(struct worker *wrk, struct busyobj *bo, struct req *req) -{ - struct vbc *vc; - struct http *hp; - enum htc_status_e hs; - int retry = -1; - int i, first; - struct http_conn *htc; - - CHECK_OBJ_NOTNULL(wrk, WORKER_MAGIC); - CHECK_OBJ_ORNULL(req, REQ_MAGIC); - CHECK_OBJ_NOTNULL(bo, BUSYOBJ_MAGIC); - htc = &bo->htc; - - AN(bo->director); - - hp = bo->bereq; - - bo->vbc = VDI_GetFd(NULL, bo); - if (bo->vbc == NULL) { - VSLb(bo->vsl, SLT_FetchError, "no backend connection"); - return (-1); - } - vc = bo->vbc; - if (vc->recycled) - retry = 1; - - /* - * Now that we know our backend, we can set a default Host: - * header if one is necessary. This cannot be done in the VCL - * because the backend may be chosen by a director. - */ - if (!http_GetHdr(bo->bereq, H_Host, NULL)) - VDI_AddHostHeader(bo->bereq, vc); - - (void)VTCP_blocking(vc->fd); /* XXX: we should timeout instead */ - WRW_Reserve(wrk, &vc->fd, bo->vsl, bo->t_fetch); - (void)HTTP1_Write(wrk, hp, 0); /* XXX: stats ? */ - - /* Deal with any message-body the request might (still) have */ - i = 0; - - if (req != NULL) { - i = HTTP1_IterateReqBody(req, vbf_iter_req_body, wrk); - if (req->req_body_status == REQ_BODY_DONE) - retry = -1; - } - - if (WRW_FlushRelease(wrk) || i != 0) { - VSLb(bo->vsl, SLT_FetchError, "backend write error: %d (%s)", - errno, strerror(errno)); - VDI_CloseFd(&bo->vbc); - /* XXX: other cleanup ? */ - return (retry); - } - - /* XXX is this the right place? */ - VSC_C_main->backend_req++; - - /* Receive response */ - - HTTP1_Init(htc, bo->ws, vc->fd, vc->vsl, - cache_param->http_resp_size, - cache_param->http_resp_hdr_len); - - VTCP_set_read_timeout(vc->fd, vc->first_byte_timeout); - - first = 1; - do { - hs = HTTP1_Rx(htc); - if (hs == HTTP1_OVERFLOW) { - VSLb(bo->vsl, SLT_FetchError, - "http %sread error: overflow", - first ? "first " : ""); - VDI_CloseFd(&bo->vbc); - /* XXX: other cleanup ? 
*/ - return (-1); - } - if (hs == HTTP1_ERROR_EOF) { - VSLb(bo->vsl, SLT_FetchError, "http %sread error: EOF", - first ? "first " : ""); - VDI_CloseFd(&bo->vbc); - /* XXX: other cleanup ? */ - return (retry); - } - if (first) { - retry = -1; - first = 0; - VTCP_set_read_timeout(vc->fd, - vc->between_bytes_timeout); - } - } while (hs != HTTP1_COMPLETE); - - hp = bo->beresp; - - if (HTTP1_DissectResponse(hp, htc)) { - VSLb(bo->vsl, SLT_FetchError, "http format error"); - VDI_CloseFd(&bo->vbc); - /* XXX: other cleanup ? */ - return (-1); - } - return (0); -} - -/*-------------------------------------------------------------------- - * This function is either called by the requesting thread OR by a - * dedicated body-fetch work-thread. - * - * We get passed the busyobj in the priv arg, and we inherit a - * refcount on it, which we must release, when done fetching. - */ - -static void -vbf_fetch_body(struct worker *wrk, struct busyobj *bo) -{ - int cls; - struct storage *st; - int mklen; - ssize_t cl; - struct http_conn *htc; - struct object *obj; - - CHECK_OBJ_NOTNULL(wrk, WORKER_MAGIC); - CHECK_OBJ_NOTNULL(bo, BUSYOBJ_MAGIC); - htc = &bo->htc; - CHECK_OBJ_ORNULL(bo->vbc, VBC_MAGIC); - obj = bo->fetch_obj; - CHECK_OBJ_NOTNULL(obj, OBJECT_MAGIC); - CHECK_OBJ_NOTNULL(obj->http, HTTP_MAGIC); - - assert(bo->state == BOS_INVALID); - - /* - * XXX: The busyobj needs a dstat, but it is not obvious which one - * XXX: it should be (own/borrowed). For now borrow the wrk's. - */ - AZ(bo->stats); - bo->stats = &wrk->stats; - - - if (bo->vfp == NULL) - bo->vfp = &vfp_nop; - - AN(bo->vfp); - AZ(bo->vgz_rx); - AZ(VTAILQ_FIRST(&obj->store)); - - bo->state = BOS_FETCHING; - - /* XXX: pick up estimate from objdr ? 
*/ - cl = 0; - cls = bo->should_close; - switch (htc->body_status) { - case BS_NONE: - mklen = 0; - break; - case BS_ZERO: - mklen = 1; - break; - case BS_LENGTH: - cl = vbf_fetch_number(bo->h_content_length, 10); - - bo->vfp->begin(bo, cl); - if (bo->state == BOS_FETCHING && cl > 0) - cls |= vbf_fetch_straight(bo, htc, cl); - mklen = 1; - if (bo->vfp->end(bo)) - assert(bo->state == BOS_FAILED); - break; - case BS_CHUNKED: - bo->vfp->begin(bo, cl > 0 ? cl : 0); - if (bo->state == BOS_FETCHING) - cls |= vbf_fetch_chunked(bo, htc); - mklen = 1; - if (bo->vfp->end(bo)) - assert(bo->state == BOS_FAILED); - break; - case BS_EOF: - bo->vfp->begin(bo, cl > 0 ? cl : 0); - if (bo->state == BOS_FETCHING) - vbf_fetch_eof(bo, htc); - mklen = 1; - cls = 1; - if (bo->vfp->end(bo)) - assert(bo->state == BOS_FAILED); - break; - case BS_ERROR: - cls |= VBF_Error(bo, "error incompatible Transfer-Encoding"); - mklen = 0; - break; - default: - mklen = 0; - INCOMPL(); - } - AZ(bo->vgz_rx); - - /* - * We always call vfp_nop_end() to ditch or trim the last storage - * segment, to avoid having to replicate that code in all vfp's. 
- */ - AZ(vfp_nop_end(bo)); - - bo->vfp = NULL; - - VSLb(bo->vsl, SLT_Fetch_Body, "%u(%s) cls %d mklen %d", - htc->body_status, body_status_2str(htc->body_status), - cls, mklen); - - http_Teardown(bo->bereq); - http_Teardown(bo->beresp); - - if (bo->vbc != NULL) { - if (cls) - VDI_CloseFd(&bo->vbc); - else - VDI_RecycleFd(&bo->vbc); - } - - if (bo->state == BOS_FAILED) { - wrk->stats.fetch_failed++; - obj->len = 0; - EXP_Clr(&obj->exp); - EXP_Rearm(obj); - } else { - assert(bo->state == BOS_FETCHING); - - VSLb(bo->vsl, SLT_Length, "%zd", obj->len); - - { - /* Sanity check fetch methods accounting */ - ssize_t uu; - - uu = 0; - VTAILQ_FOREACH(st, &obj->store, list) - uu += st->len; - if (bo->do_stream) - /* Streaming might have started freeing stuff */ - assert(uu <= obj->len); - - else - assert(uu == obj->len); - } - - if (mklen > 0) { - http_Unset(obj->http, H_Content_Length); - http_PrintfHeader(obj->http, - "Content-Length: %zd", obj->len); - } - - /* XXX: Atomic assignment, needs volatile/membar ? */ - bo->state = BOS_FINISHED; - } - if (obj->objcore->objhead != NULL) - HSH_Complete(obj->objcore); - bo->stats = NULL; -} - -/*-------------------------------------------------------------------- * Copy req->bereq and run it by VCL::vcl_backend_fetch{} */ @@ -683,7 +273,7 @@ vbf_stp_fetchhdr(struct worker *wrk, struct busyobj *bo, struct req **reqp) if (!bo->do_pass) *reqp = NULL; - i = vbf_fetch_hdr(wrk, bo, *reqp); + i = V1F_fetch_hdr(wrk, bo, *reqp); /* * If we recycle a backend connection, there is a finite chance * that the backend closed it before we get a request to it. 
@@ -691,7 +281,7 @@ vbf_stp_fetchhdr(struct worker *wrk, struct busyobj *bo, struct req **reqp) */ if (i == 1) { VSC_C_main->backend_retry++; - i = vbf_fetch_hdr(wrk, bo, *reqp); + i = V1F_fetch_hdr(wrk, bo, *reqp); } if (bo->do_pass) @@ -967,7 +557,10 @@ vbf_stp_fetch(struct worker *wrk, struct busyobj *bo) HSH_Unbusy(&wrk->stats, obj->objcore); } - vbf_fetch_body(wrk, bo); + if (bo->vfp == NULL) + bo->vfp = &vfp_nop; + + V1F_fetch_body(wrk, bo); assert(bo->refcount >= 1); @@ -1110,6 +703,7 @@ static struct cli_proto debug_cmds[] = { { NULL } }; + /*-------------------------------------------------------------------- * */ diff --git a/bin/varnishd/cache/cache_http1_fetch.c b/bin/varnishd/cache/cache_http1_fetch.c new file mode 100644 index 0000000..ed82094 --- /dev/null +++ b/bin/varnishd/cache/cache_http1_fetch.c @@ -0,0 +1,450 @@ +/*- + * Copyright (c) 2006 Verdens Gang AS + * Copyright (c) 2006-2011 Varnish Software AS + * All rights reserved. + * + * Author: Poul-Henning Kamp + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. 
IN NO EVENT SHALL AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +#include "config.h" + +#include +#include +#include +#include + +#include "cache.h" + +#include "hash/hash_slinger.h" + +#include "cache_backend.h" +#include "vcli_priv.h" +#include "vct.h" +#include "vtcp.h" + +/*-------------------------------------------------------------------- + * Convert a string to a size_t safely + */ + +static ssize_t +vbf_fetch_number(const char *nbr, int radix) +{ + uintmax_t cll; + ssize_t cl; + char *q; + + if (*nbr == '\0') + return (-1); + cll = strtoumax(nbr, &q, radix); + if (q == NULL || *q != '\0') + return (-1); + + cl = (ssize_t)cll; + if((uintmax_t)cl != cll) /* Protect against bogusly large values */ + return (-1); + return (cl); +} + +/*--------------------------------------------------------------------*/ + +static int +vbf_fetch_straight(struct busyobj *bo, struct http_conn *htc, ssize_t cl) +{ + int i; + + assert(htc->body_status == BS_LENGTH); + + if (cl < 0) { + return (VBF_Error(bo, "straight length field bogus")); + } else if (cl == 0) + return (0); + + i = bo->vfp->bytes(bo, htc, cl); + if (i <= 0) + return (VBF_Error(bo, "straight insufficient bytes")); + return (0); +} + +/*-------------------------------------------------------------------- + * Read a chunked HTTP object. + * + * XXX: Reading one byte at a time is pretty pessimal. 
+ */ + +static int +vbf_fetch_chunked(struct busyobj *bo, struct http_conn *htc) +{ + int i; + char buf[20]; /* XXX: 20 is arbitrary */ + unsigned u; + ssize_t cl; + + assert(htc->body_status == BS_CHUNKED); + do { + /* Skip leading whitespace */ + do { + if (HTTP1_Read(htc, buf, 1) <= 0) + return (VBF_Error(bo, "chunked read err")); + } while (vct_islws(buf[0])); + + if (!vct_ishex(buf[0])) + return (VBF_Error(bo, "chunked header non-hex")); + + /* Collect hex digits, skipping leading zeros */ + for (u = 1; u < sizeof buf; u++) { + do { + if (HTTP1_Read(htc, buf + u, 1) <= 0) + return (VBF_Error(bo, + "chunked read err")); + } while (u == 1 && buf[0] == '0' && buf[u] == '0'); + if (!vct_ishex(buf[u])) + break; + } + + if (u >= sizeof buf) + return (VBF_Error(bo,"chunked header too long")); + + /* Skip trailing white space */ + while(vct_islws(buf[u]) && buf[u] != '\n') + if (HTTP1_Read(htc, buf + u, 1) <= 0) + return (VBF_Error(bo, "chunked read err")); + + if (buf[u] != '\n') + return (VBF_Error(bo,"chunked header no NL")); + + buf[u] = '\0'; + cl = vbf_fetch_number(buf, 16); + if (cl < 0) + return (VBF_Error(bo,"chunked header number syntax")); + + if (cl > 0 && bo->vfp->bytes(bo, htc, cl) <= 0) + return (VBF_Error(bo, "chunked read err")); + + i = HTTP1_Read(htc, buf, 1); + if (i <= 0) + return (VBF_Error(bo, "chunked read err")); + if (buf[0] == '\r' && HTTP1_Read( htc, buf, 1) <= 0) + return (VBF_Error(bo, "chunked read err")); + if (buf[0] != '\n') + return (VBF_Error(bo,"chunked tail no NL")); + } while (cl > 0); + return (0); +} + +/*--------------------------------------------------------------------*/ + +static void +vbf_fetch_eof(struct busyobj *bo, struct http_conn *htc) +{ + + assert(htc->body_status == BS_EOF); + if (bo->vfp->bytes(bo, htc, SSIZE_MAX) < 0) + (void)VBF_Error(bo,"eof socket fail"); +} + +/*-------------------------------------------------------------------- + * Pass the request body to the backend + */ + +static int 
__match_proto__(req_body_iter_f) +vbf_iter_req_body(struct req *req, void *priv, void *ptr, size_t l) +{ + struct worker *wrk; + + CHECK_OBJ_NOTNULL(req, REQ_MAGIC); + CAST_OBJ_NOTNULL(wrk, priv, WORKER_MAGIC); + + if (l > 0) { + (void)WRW_Write(wrk, ptr, l); + if (WRW_Flush(wrk)) + return (-1); + } + return (0); +} + +/*-------------------------------------------------------------------- + * Send request, and receive the HTTP protocol response, but not the + * response body. + * + * Return value: + * -1 failure, not retryable + * 0 success + * 1 failure which can be retried. + */ + +int +V1F_fetch_hdr(struct worker *wrk, struct busyobj *bo, struct req *req) +{ + struct vbc *vc; + struct http *hp; + enum htc_status_e hs; + int retry = -1; + int i, first; + struct http_conn *htc; + + CHECK_OBJ_NOTNULL(wrk, WORKER_MAGIC); + CHECK_OBJ_ORNULL(req, REQ_MAGIC); + CHECK_OBJ_NOTNULL(bo, BUSYOBJ_MAGIC); + htc = &bo->htc; + + AN(bo->director); + + hp = bo->bereq; + + bo->vbc = VDI_GetFd(NULL, bo); + if (bo->vbc == NULL) { + VSLb(bo->vsl, SLT_FetchError, "no backend connection"); + return (-1); + } + vc = bo->vbc; + if (vc->recycled) + retry = 1; + + /* + * Now that we know our backend, we can set a default Host: + * header if one is necessary. This cannot be done in the VCL + * because the backend may be chosen by a director. + */ + if (!http_GetHdr(bo->bereq, H_Host, NULL)) + VDI_AddHostHeader(bo->bereq, vc); + + (void)VTCP_blocking(vc->fd); /* XXX: we should timeout instead */ + WRW_Reserve(wrk, &vc->fd, bo->vsl, bo->t_fetch); + (void)HTTP1_Write(wrk, hp, 0); /* XXX: stats ? 
*/ + + /* Deal with any message-body the request might (still) have */ + i = 0; + + if (req != NULL) { + i = HTTP1_IterateReqBody(req, vbf_iter_req_body, wrk); + if (req->req_body_status == REQ_BODY_DONE) + retry = -1; + } + + if (WRW_FlushRelease(wrk) || i != 0) { + VSLb(bo->vsl, SLT_FetchError, "backend write error: %d (%s)", + errno, strerror(errno)); + VDI_CloseFd(&bo->vbc); + /* XXX: other cleanup ? */ + return (retry); + } + + /* XXX is this the right place? */ + VSC_C_main->backend_req++; + + /* Receive response */ + + HTTP1_Init(htc, bo->ws, vc->fd, vc->vsl, + cache_param->http_resp_size, + cache_param->http_resp_hdr_len); + + VTCP_set_read_timeout(vc->fd, vc->first_byte_timeout); + + first = 1; + do { + hs = HTTP1_Rx(htc); + if (hs == HTTP1_OVERFLOW) { + VSLb(bo->vsl, SLT_FetchError, + "http %sread error: overflow", + first ? "first " : ""); + VDI_CloseFd(&bo->vbc); + /* XXX: other cleanup ? */ + return (-1); + } + if (hs == HTTP1_ERROR_EOF) { + VSLb(bo->vsl, SLT_FetchError, "http %sread error: EOF", + first ? "first " : ""); + VDI_CloseFd(&bo->vbc); + /* XXX: other cleanup ? */ + return (retry); + } + if (first) { + retry = -1; + first = 0; + VTCP_set_read_timeout(vc->fd, + vc->between_bytes_timeout); + } + } while (hs != HTTP1_COMPLETE); + + hp = bo->beresp; + + if (HTTP1_DissectResponse(hp, htc)) { + VSLb(bo->vsl, SLT_FetchError, "http format error"); + VDI_CloseFd(&bo->vbc); + /* XXX: other cleanup ? */ + return (-1); + } + return (0); +} + +/*-------------------------------------------------------------------- + * This function is either called by the requesting thread OR by a + * dedicated body-fetch work-thread. + * + * We get passed the busyobj in the priv arg, and we inherit a + * refcount on it, which we must release, when done fetching. 
+ */ + +void +V1F_fetch_body(struct worker *wrk, struct busyobj *bo) +{ + int cls; + struct storage *st; + int mklen; + ssize_t cl; + struct http_conn *htc; + struct object *obj; + + CHECK_OBJ_NOTNULL(wrk, WORKER_MAGIC); + CHECK_OBJ_NOTNULL(bo, BUSYOBJ_MAGIC); + htc = &bo->htc; + CHECK_OBJ_ORNULL(bo->vbc, VBC_MAGIC); + obj = bo->fetch_obj; + CHECK_OBJ_NOTNULL(obj, OBJECT_MAGIC); + CHECK_OBJ_NOTNULL(obj->http, HTTP_MAGIC); + + assert(bo->state == BOS_INVALID); + + /* + * XXX: The busyobj needs a dstat, but it is not obvious which one + * XXX: it should be (own/borrowed). For now borrow the wrk's. + */ + AZ(bo->stats); + bo->stats = &wrk->stats; + + AN(bo->vfp); + AZ(bo->vgz_rx); + AZ(VTAILQ_FIRST(&obj->store)); + + bo->state = BOS_FETCHING; + + /* XXX: pick up estimate from objdr ? */ + cl = 0; + cls = bo->should_close; + switch (htc->body_status) { + case BS_NONE: + mklen = 0; + break; + case BS_ZERO: + mklen = 1; + break; + case BS_LENGTH: + cl = vbf_fetch_number(bo->h_content_length, 10); + + bo->vfp->begin(bo, cl); + if (bo->state == BOS_FETCHING && cl > 0) + cls |= vbf_fetch_straight(bo, htc, cl); + mklen = 1; + if (bo->vfp->end(bo)) + assert(bo->state == BOS_FAILED); + break; + case BS_CHUNKED: + bo->vfp->begin(bo, cl > 0 ? cl : 0); + if (bo->state == BOS_FETCHING) + cls |= vbf_fetch_chunked(bo, htc); + mklen = 1; + if (bo->vfp->end(bo)) + assert(bo->state == BOS_FAILED); + break; + case BS_EOF: + bo->vfp->begin(bo, cl > 0 ? cl : 0); + if (bo->state == BOS_FETCHING) + vbf_fetch_eof(bo, htc); + mklen = 1; + cls = 1; + if (bo->vfp->end(bo)) + assert(bo->state == BOS_FAILED); + break; + case BS_ERROR: + cls |= VBF_Error(bo, "error incompatible Transfer-Encoding"); + mklen = 0; + break; + default: + mklen = 0; + INCOMPL(); + } + AZ(bo->vgz_rx); + +#if 0 + /* + * We always call vfp_nop_end() to ditch or trim the last storage + * segment, to avoid having to replicate that code in all vfp's. 
+ */ + AZ(vfp_nop_end(bo)); +#endif + + bo->vfp = NULL; + + VSLb(bo->vsl, SLT_Fetch_Body, "%u(%s) cls %d mklen %d", + htc->body_status, body_status_2str(htc->body_status), + cls, mklen); + + http_Teardown(bo->bereq); + http_Teardown(bo->beresp); + + if (bo->vbc != NULL) { + if (cls) + VDI_CloseFd(&bo->vbc); + else + VDI_RecycleFd(&bo->vbc); + } + + if (bo->state == BOS_FAILED) { + wrk->stats.fetch_failed++; + obj->len = 0; + EXP_Clr(&obj->exp); + EXP_Rearm(obj); + } else { + assert(bo->state == BOS_FETCHING); + + VSLb(bo->vsl, SLT_Length, "%zd", obj->len); + + { + /* Sanity check fetch methods accounting */ + ssize_t uu; + + uu = 0; + VTAILQ_FOREACH(st, &obj->store, list) + uu += st->len; + if (bo->do_stream) + /* Streaming might have started freeing stuff */ + assert(uu <= obj->len); + + else + assert(uu == obj->len); + } + + if (mklen > 0) { + http_Unset(obj->http, H_Content_Length); + http_PrintfHeader(obj->http, + "Content-Length: %zd", obj->len); + } + + /* XXX: Atomic assignment, needs volatile/membar ? */ + bo->state = BOS_FINISHED; + } + if (obj->objcore->objhead != NULL) + HSH_Complete(obj->objcore); + bo->stats = NULL; +} From phk at varnish-cache.org Mon Jun 17 10:17:45 2013 From: phk at varnish-cache.org (Poul-Henning Kamp) Date: Mon, 17 Jun 2013 12:17:45 +0200 Subject: [master] 5c00b0e Split fetch processor stuff (VFP) from the state machine and be more consistent about VFP_ names. Message-ID: commit 5c00b0edc5b7eeb5b6d7bee3dea77cef55eac07f Author: Poul-Henning Kamp Date: Mon Jun 17 10:17:02 2013 +0000 Split fetch processor stuff (VFP) from the state machine and be more consistent about VFP_ names. 
diff --git a/bin/varnishd/Makefile.am b/bin/varnishd/Makefile.am index 0c9fa5d..b1f54ed 100644 --- a/bin/varnishd/Makefile.am +++ b/bin/varnishd/Makefile.am @@ -24,6 +24,7 @@ varnishd_SOURCES = \ cache/cache_esi_parse.c \ cache/cache_expire.c \ cache/cache_fetch.c \ + cache/cache_fetch_proc.c \ cache/cache_gzip.c \ cache/cache_hash.c \ cache/cache_http.c \ diff --git a/bin/varnishd/cache/cache.h b/bin/varnishd/cache/cache.h index 644d15c..c0de28c 100644 --- a/bin/varnishd/cache/cache.h +++ b/bin/varnishd/cache/cache.h @@ -826,10 +826,13 @@ void EXP_NukeLRU(struct worker *wrk, struct vsl_log *vsl, struct lru *lru); /* cache_fetch.c */ void VBF_Fetch(struct worker *wrk, struct req *req); -struct storage *VBF_GetStorage(struct busyobj *, ssize_t sz); -int VBF_Error(struct busyobj *, const char *error); -int VBF_Error2(struct busyobj *, const char *error, const char *more); -void VBF_Init(void); + +/* cache_fetch_proc.c */ +struct storage *VFP_GetStorage(struct busyobj *, ssize_t sz); +int VFP_Error2(struct busyobj *, const char *error, const char *more); +int VFP_Error(struct busyobj *, const char *error); +void VFP_Init(void); +extern struct vfp VFP_nop; /* cache_gzip.c */ struct vgz; diff --git a/bin/varnishd/cache/cache_esi_fetch.c b/bin/varnishd/cache/cache_esi_fetch.c index 95439b0..7da3608 100644 --- a/bin/varnishd/cache/cache_esi_fetch.c +++ b/bin/varnishd/cache/cache_esi_fetch.c @@ -93,7 +93,7 @@ vfp_esi_bytes_uu(struct busyobj *bo, const struct vef_priv *vef, CHECK_OBJ_NOTNULL(vef, VEF_MAGIC); while (bytes > 0) { - st = VBF_GetStorage(bo, 0); + st = VFP_GetStorage(bo, 0); if (st == NULL) return (-1); wl = vef_read(htc, @@ -380,7 +380,7 @@ vfp_esi_end(void *priv) retval = -1; if (bo->vgz_rx != NULL && VGZ_Destroy(&bo->vgz_rx) != VGZ_END) - retval = VBF_Error(bo, "Gunzip+ESI Failed at the very end"); + retval = VFP_Error(bo, "Gunzip+ESI Failed at the very end"); vsb = VEP_Finish(bo); @@ -395,7 +395,7 @@ vfp_esi_end(void *priv) VSB_data(vsb), l); 
bo->fetch_obj->esidata->len = l; } else { - retval = VBF_Error(bo, + retval = VFP_Error(bo, "Could not allocate storage for esidata"); } } @@ -408,7 +408,7 @@ vfp_esi_end(void *priv) if (vef->vgz != NULL) { VGZ_UpdateObj(vef->vgz, bo->fetch_obj); if (VGZ_Destroy(&vef->vgz) != VGZ_END) - retval = VBF_Error(bo, + retval = VFP_Error(bo, "ESI+Gzip Failed at the very end"); } if (vef->ibuf != NULL) diff --git a/bin/varnishd/cache/cache_fetch.c b/bin/varnishd/cache/cache_fetch.c index 431406e..875ddb6 100644 --- a/bin/varnishd/cache/cache_fetch.c +++ b/bin/varnishd/cache/cache_fetch.c @@ -43,171 +43,6 @@ #include "vcl.h" #include "vtim.h" -static unsigned fetchfrag; - -/*-------------------------------------------------------------------- - * We want to issue the first error we encounter on fetching and - * supress the rest. This function does that. - * - * Other code is allowed to look at busyobj->fetch_failed to bail out - * - * For convenience, always return -1 - */ - -int -VBF_Error2(struct busyobj *bo, const char *error, const char *more) -{ - - CHECK_OBJ_NOTNULL(bo, BUSYOBJ_MAGIC); - if (bo->state == BOS_FETCHING) { - if (more == NULL) - VSLb(bo->vsl, SLT_FetchError, "%s", error); - else - VSLb(bo->vsl, SLT_FetchError, "%s: %s", error, more); - } - bo->state = BOS_FAILED; - return (-1); -} - -int -VBF_Error(struct busyobj *bo, const char *error) -{ - return(VBF_Error2(bo, error, NULL)); -} - -/*-------------------------------------------------------------------- - * VFP_NOP - * - * This fetch-processor does nothing but store the object. - * It also documents the API - */ - -/*-------------------------------------------------------------------- - * VFP_BEGIN - * - * Called to set up stuff. - * - * 'estimate' is the estimate of the number of bytes we expect to receive, - * as seen on the socket, or zero if unknown. 
- */ -static void __match_proto__(vfp_begin_f) -vfp_nop_begin(void *priv, size_t estimate) -{ - struct busyobj *bo; - - CAST_OBJ_NOTNULL(bo, priv, BUSYOBJ_MAGIC); - - if (estimate > 0) - (void)VBF_GetStorage(bo, estimate); -} - -/*-------------------------------------------------------------------- - * VFP_BYTES - * - * Process (up to) 'bytes' from the socket. - * - * Return -1 on error, issue VBF_Error() - * will not be called again, once error happens. - * Return 0 on EOF on socket even if bytes not reached. - * Return 1 when 'bytes' have been processed. - */ - -static int __match_proto__(vfp_bytes_f) -vfp_nop_bytes(void *priv, struct http_conn *htc, ssize_t bytes) -{ - ssize_t l, wl; - struct storage *st; - struct busyobj *bo; - - CAST_OBJ_NOTNULL(bo, priv, BUSYOBJ_MAGIC); - - while (bytes > 0) { - st = VBF_GetStorage(bo, 0); - if (st == NULL) - return(-1); - l = st->space - st->len; - if (l > bytes) - l = bytes; - wl = HTTP1_Read(htc, st->ptr + st->len, l); - if (wl <= 0) - return (wl); - st->len += wl; - VBO_extend(bo, wl); - bytes -= wl; - } - return (1); -} - -/*-------------------------------------------------------------------- - * VFP_END - * - * Finish & cleanup - * - * Return -1 for error - * Return 0 for OK - */ - -static int __match_proto__(vfp_end_f) -vfp_nop_end(void *priv) -{ - struct storage *st; - struct busyobj *bo; - - CAST_OBJ_NOTNULL(bo, priv, BUSYOBJ_MAGIC); - st = VTAILQ_LAST(&bo->fetch_obj->store, storagehead); - if (st == NULL) - return (0); - - if (st->len == 0) { - VTAILQ_REMOVE(&bo->fetch_obj->store, st, list); - STV_free(st); - return (0); - } - if (st->len < st->space) - STV_trim(st, st->len, 1); - return (0); -} - -static struct vfp vfp_nop = { - .begin = vfp_nop_begin, - .bytes = vfp_nop_bytes, - .end = vfp_nop_end, -}; - -/*-------------------------------------------------------------------- - * Fetch Storage to put object into. 
- * - */ - -struct storage * -VBF_GetStorage(struct busyobj *bo, ssize_t sz) -{ - ssize_t l; - struct storage *st; - struct object *obj; - - CHECK_OBJ_NOTNULL(bo, BUSYOBJ_MAGIC); - obj = bo->fetch_obj; - CHECK_OBJ_NOTNULL(obj, OBJECT_MAGIC); - st = VTAILQ_LAST(&obj->store, storagehead); - if (st != NULL && st->len < st->space) - return (st); - - l = fetchfrag; - if (l == 0) - l = sz; - if (l == 0) - l = cache_param->fetch_chunksize; - st = STV_alloc(bo, l); - if (st == NULL) { - (void)VBF_Error(bo, "Could not get storage"); - return (NULL); - } - AZ(st->len); - VTAILQ_INSERT_TAIL(&obj->store, st, list); - return (st); -} - /*-------------------------------------------------------------------- * Copy req->bereq and run it by VCL::vcl_backend_fetch{} */ @@ -558,7 +393,7 @@ vbf_stp_fetch(struct worker *wrk, struct busyobj *bo) } if (bo->vfp == NULL) - bo->vfp = &vfp_nop; + bo->vfp = &VFP_nop; V1F_fetch_body(wrk, bo); @@ -684,33 +519,3 @@ VBF_Fetch(struct worker *wrk, struct req *req) (void)usleep(100000); } } - -/*-------------------------------------------------------------------- - * Debugging aids - */ - -static void -debug_fragfetch(struct cli *cli, const char * const *av, void *priv) -{ - (void)priv; - (void)cli; - fetchfrag = strtoul(av[2], NULL, 0); -} - -static struct cli_proto debug_cmds[] = { - { "debug.fragfetch", "debug.fragfetch", - "\tEnable fetch fragmentation\n", 1, 1, "d", debug_fragfetch }, - { NULL } -}; - - -/*-------------------------------------------------------------------- - * - */ - -void -VBF_Init(void) -{ - - CLI_AddFuncs(debug_cmds); -} diff --git a/bin/varnishd/cache/cache_fetch_proc.c b/bin/varnishd/cache/cache_fetch_proc.c new file mode 100644 index 0000000..8fd37f9 --- /dev/null +++ b/bin/varnishd/cache/cache_fetch_proc.c @@ -0,0 +1,236 @@ +/*- + * Copyright (c) 2006 Verdens Gang AS + * Copyright (c) 2006-2011 Varnish Software AS + * All rights reserved. 
+ * + * Author: Poul-Henning Kamp + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +#include "config.h" + +#include +#include +#include +#include + +#include "cache.h" + +#include "hash/hash_slinger.h" + +#include "cache_backend.h" +#include "vcli_priv.h" + +static unsigned fetchfrag; + +/*-------------------------------------------------------------------- + * We want to issue the first error we encounter on fetching and + * supress the rest. This function does that. 
+ * + * Other code is allowed to look at busyobj->fetch_failed to bail out + * + * For convenience, always return -1 + */ + +int +VFP_Error2(struct busyobj *bo, const char *error, const char *more) +{ + + CHECK_OBJ_NOTNULL(bo, BUSYOBJ_MAGIC); + if (bo->state == BOS_FETCHING) { + if (more == NULL) + VSLb(bo->vsl, SLT_FetchError, "%s", error); + else + VSLb(bo->vsl, SLT_FetchError, "%s: %s", error, more); + } + bo->state = BOS_FAILED; + return (-1); +} + +int +VFP_Error(struct busyobj *bo, const char *error) +{ + return(VFP_Error2(bo, error, NULL)); +} + +/*-------------------------------------------------------------------- + * VFP_NOP + * + * This fetch-processor does nothing but store the object. + * It also documents the API + */ + +/*-------------------------------------------------------------------- + * VFP_BEGIN + * + * Called to set up stuff. + * + * 'estimate' is the estimate of the number of bytes we expect to receive, + * as seen on the socket, or zero if unknown. + */ +static void __match_proto__(vfp_begin_f) +vfp_nop_begin(void *priv, size_t estimate) +{ + struct busyobj *bo; + + CAST_OBJ_NOTNULL(bo, priv, BUSYOBJ_MAGIC); + + if (estimate > 0) + (void)VFP_GetStorage(bo, estimate); +} + +/*-------------------------------------------------------------------- + * VFP_BYTES + * + * Process (up to) 'bytes' from the socket. + * + * Return -1 on error, issue VFP_Error() + * will not be called again, once error happens. + * Return 0 on EOF on socket even if bytes not reached. + * Return 1 when 'bytes' have been processed. 
+ */ + +static int __match_proto__(vfp_bytes_f) +vfp_nop_bytes(void *priv, struct http_conn *htc, ssize_t bytes) +{ + ssize_t l, wl; + struct storage *st; + struct busyobj *bo; + + CAST_OBJ_NOTNULL(bo, priv, BUSYOBJ_MAGIC); + + while (bytes > 0) { + st = VFP_GetStorage(bo, 0); + if (st == NULL) + return(-1); + l = st->space - st->len; + if (l > bytes) + l = bytes; + wl = HTTP1_Read(htc, st->ptr + st->len, l); + if (wl <= 0) + return (wl); + st->len += wl; + VBO_extend(bo, wl); + bytes -= wl; + } + return (1); +} + +/*-------------------------------------------------------------------- + * VFP_END + * + * Finish & cleanup + * + * Return -1 for error + * Return 0 for OK + */ + +static int __match_proto__(vfp_end_f) +vfp_nop_end(void *priv) +{ + struct storage *st; + struct busyobj *bo; + + CAST_OBJ_NOTNULL(bo, priv, BUSYOBJ_MAGIC); + st = VTAILQ_LAST(&bo->fetch_obj->store, storagehead); + if (st == NULL) + return (0); + + if (st->len == 0) { + VTAILQ_REMOVE(&bo->fetch_obj->store, st, list); + STV_free(st); + return (0); + } + if (st->len < st->space) + STV_trim(st, st->len, 1); + return (0); +} + +struct vfp VFP_nop = { + .begin = vfp_nop_begin, + .bytes = vfp_nop_bytes, + .end = vfp_nop_end, +}; + +/*-------------------------------------------------------------------- + * Fetch Storage to put object into. 
+ * + */ + +struct storage * +VFP_GetStorage(struct busyobj *bo, ssize_t sz) +{ + ssize_t l; + struct storage *st; + struct object *obj; + + CHECK_OBJ_NOTNULL(bo, BUSYOBJ_MAGIC); + obj = bo->fetch_obj; + CHECK_OBJ_NOTNULL(obj, OBJECT_MAGIC); + st = VTAILQ_LAST(&obj->store, storagehead); + if (st != NULL && st->len < st->space) + return (st); + + l = fetchfrag; + if (l == 0) + l = sz; + if (l == 0) + l = cache_param->fetch_chunksize; + st = STV_alloc(bo, l); + if (st == NULL) { + (void)VFP_Error(bo, "Could not get storage"); + return (NULL); + } + AZ(st->len); + VTAILQ_INSERT_TAIL(&obj->store, st, list); + return (st); +} + +/*-------------------------------------------------------------------- + * Debugging aids + */ + +static void +debug_fragfetch(struct cli *cli, const char * const *av, void *priv) +{ + (void)priv; + (void)cli; + fetchfrag = strtoul(av[2], NULL, 0); +} + +static struct cli_proto debug_cmds[] = { + { "debug.fragfetch", "debug.fragfetch", + "\tEnable fetch fragmentation\n", 1, 1, "d", debug_fragfetch }, + { NULL } +}; + +/*-------------------------------------------------------------------- + * + */ + +void +VFP_Init(void) +{ + + CLI_AddFuncs(debug_cmds); +} diff --git a/bin/varnishd/cache/cache_gzip.c b/bin/varnishd/cache/cache_gzip.c index 3dceda2..f0374a7 100644 --- a/bin/varnishd/cache/cache_gzip.c +++ b/bin/varnishd/cache/cache_gzip.c @@ -206,7 +206,7 @@ VGZ_ObufStorage(struct busyobj *bo, struct vgz *vg) { struct storage *st; - st = VBF_GetStorage(bo, 0); + st = VFP_GetStorage(bo, 0); if (st == NULL) return (-1); @@ -482,7 +482,7 @@ vfp_gunzip_bytes(void *priv, struct http_conn *htc, ssize_t bytes) return(-1); i = VGZ_Gunzip(vg, &dp, &dl); if (i != VGZ_OK && i != VGZ_END) - return(VBF_Error(bo, "Gunzip data error")); + return(VFP_Error(bo, "Gunzip data error")); VBO_extend(bo, dl); } assert(i == Z_OK || i == Z_STREAM_END); @@ -504,7 +504,7 @@ vfp_gunzip_end(void *priv) return(0); } if (VGZ_Destroy(&vg) != VGZ_END) - return(VBF_Error(bo, 
"Gunzip error at the very end")); + return(VFP_Error(bo, "Gunzip error at the very end")); return (0); } @@ -592,7 +592,7 @@ vfp_gzip_end(void *priv) } while (i != Z_STREAM_END); VGZ_UpdateObj(vg, bo->fetch_obj); if (VGZ_Destroy(&vg) != VGZ_END) - return(VBF_Error(bo, "Gzip error at the very end")); + return(VFP_Error(bo, "Gzip error at the very end")); return (0); } @@ -637,7 +637,7 @@ vfp_testgzip_bytes(void *priv, struct http_conn *htc, ssize_t bytes) CHECK_OBJ_NOTNULL(vg, VGZ_MAGIC); AZ(vg->vz.avail_in); while (bytes > 0) { - st = VBF_GetStorage(bo, 0); + st = VFP_GetStorage(bo, 0); if (st == NULL) return(-1); l = st->space - st->len; @@ -655,9 +655,9 @@ vfp_testgzip_bytes(void *priv, struct http_conn *htc, ssize_t bytes) VGZ_Obuf(vg, vg->m_buf, vg->m_sz); i = VGZ_Gunzip(vg, &dp, &dl); if (i == VGZ_END && !VGZ_IbufEmpty(vg)) - return(VBF_Error(bo, "Junk after gzip data")); + return(VFP_Error(bo, "Junk after gzip data")); if (i != VGZ_OK && i != VGZ_END) - return(VBF_Error2(bo, + return(VFP_Error2(bo, "Invalid Gzip data", vg->vz.msg)); } } @@ -681,7 +681,7 @@ vfp_testgzip_end(void *priv) } VGZ_UpdateObj(vg, bo->fetch_obj); if (VGZ_Destroy(&vg) != VGZ_END) - return(VBF_Error(bo, "TestGunzip error at the very end")); + return(VFP_Error(bo, "TestGunzip error at the very end")); return (0); } diff --git a/bin/varnishd/cache/cache_http1_fetch.c b/bin/varnishd/cache/cache_http1_fetch.c index ed82094..0e808b3 100644 --- a/bin/varnishd/cache/cache_http1_fetch.c +++ b/bin/varnishd/cache/cache_http1_fetch.c @@ -76,13 +76,13 @@ vbf_fetch_straight(struct busyobj *bo, struct http_conn *htc, ssize_t cl) assert(htc->body_status == BS_LENGTH); if (cl < 0) { - return (VBF_Error(bo, "straight length field bogus")); + return (VFP_Error(bo, "straight length field bogus")); } else if (cl == 0) return (0); i = bo->vfp->bytes(bo, htc, cl); if (i <= 0) - return (VBF_Error(bo, "straight insufficient bytes")); + return (VFP_Error(bo, "straight insufficient bytes")); return (0); } @@ 
-105,17 +105,17 @@ vbf_fetch_chunked(struct busyobj *bo, struct http_conn *htc) /* Skip leading whitespace */ do { if (HTTP1_Read(htc, buf, 1) <= 0) - return (VBF_Error(bo, "chunked read err")); + return (VFP_Error(bo, "chunked read err")); } while (vct_islws(buf[0])); if (!vct_ishex(buf[0])) - return (VBF_Error(bo, "chunked header non-hex")); + return (VFP_Error(bo, "chunked header non-hex")); /* Collect hex digits, skipping leading zeros */ for (u = 1; u < sizeof buf; u++) { do { if (HTTP1_Read(htc, buf + u, 1) <= 0) - return (VBF_Error(bo, + return (VFP_Error(bo, "chunked read err")); } while (u == 1 && buf[0] == '0' && buf[u] == '0'); if (!vct_ishex(buf[u])) @@ -123,31 +123,31 @@ vbf_fetch_chunked(struct busyobj *bo, struct http_conn *htc) } if (u >= sizeof buf) - return (VBF_Error(bo,"chunked header too long")); + return (VFP_Error(bo,"chunked header too long")); /* Skip trailing white space */ while(vct_islws(buf[u]) && buf[u] != '\n') if (HTTP1_Read(htc, buf + u, 1) <= 0) - return (VBF_Error(bo, "chunked read err")); + return (VFP_Error(bo, "chunked read err")); if (buf[u] != '\n') - return (VBF_Error(bo,"chunked header no NL")); + return (VFP_Error(bo,"chunked header no NL")); buf[u] = '\0'; cl = vbf_fetch_number(buf, 16); if (cl < 0) - return (VBF_Error(bo,"chunked header number syntax")); + return (VFP_Error(bo,"chunked header number syntax")); if (cl > 0 && bo->vfp->bytes(bo, htc, cl) <= 0) - return (VBF_Error(bo, "chunked read err")); + return (VFP_Error(bo, "chunked read err")); i = HTTP1_Read(htc, buf, 1); if (i <= 0) - return (VBF_Error(bo, "chunked read err")); + return (VFP_Error(bo, "chunked read err")); if (buf[0] == '\r' && HTTP1_Read( htc, buf, 1) <= 0) - return (VBF_Error(bo, "chunked read err")); + return (VFP_Error(bo, "chunked read err")); if (buf[0] != '\n') - return (VBF_Error(bo,"chunked tail no NL")); + return (VFP_Error(bo,"chunked tail no NL")); } while (cl > 0); return (0); } @@ -160,7 +160,7 @@ vbf_fetch_eof(struct busyobj *bo, 
struct http_conn *htc) assert(htc->body_status == BS_EOF); if (bo->vfp->bytes(bo, htc, SSIZE_MAX) < 0) - (void)VBF_Error(bo,"eof socket fail"); + (void)VFP_Error(bo,"eof socket fail"); } /*-------------------------------------------------------------------- @@ -377,7 +377,7 @@ V1F_fetch_body(struct worker *wrk, struct busyobj *bo) assert(bo->state == BOS_FAILED); break; case BS_ERROR: - cls |= VBF_Error(bo, "error incompatible Transfer-Encoding"); + cls |= VFP_Error(bo, "error incompatible Transfer-Encoding"); mklen = 0; break; default: diff --git a/bin/varnishd/cache/cache_main.c b/bin/varnishd/cache/cache_main.c index bb17c0b..2749338 100644 --- a/bin/varnishd/cache/cache_main.c +++ b/bin/varnishd/cache/cache_main.c @@ -209,7 +209,7 @@ child_main(void) WAIT_Init(); PAN_Init(); CLI_Init(); - VBF_Init(); + VFP_Init(); VCL_Init(); From lkarsten at varnish-cache.org Mon Jun 17 13:56:10 2013 From: lkarsten at varnish-cache.org (Lasse Karstensen) Date: Mon, 17 Jun 2013 15:56:10 +0200 Subject: [master] acd6f64 Document changes from 3.0.3 to 3.0.4-rc1 Message-ID: commit acd6f645b0ab4648183f15100f079edbc50a4f41 Author: Tollef Fog Heen Date: Fri May 3 14:49:57 2013 +0200 Document changes from 3.0.3 to 3.0.4-rc1 diff --git a/doc/changes.rst b/doc/changes.rst index 87a7648..2342cd0 100644 --- a/doc/changes.rst +++ b/doc/changes.rst @@ -1,3 +1,61 @@ +================================ +Changes from 3.0.3 to 3.0.4 rc 1 +================================ + +varnishd +-------- + +- Fix error handling when uncompressing fetched objects for ESI + processing. `Bug #1184` +- Be clearer about which timeout was reached in logs. +- Correctly decrement n_waitinglist counter. `Bug #1261` +- Turn off Nagle/set TCP_NODELAY. +- Avoid panic on malformed Vary headers. `Bug #1275` +- Increase the maximum length of backend names. `Bug #1224` +- Add support for banning on http.status. `Bug #1076` +- Make hit-for-pass correctly prefer the transient storage. + +.. 
_bug #1076: http://varnish-cache.org/trac/ticket/1076 +.. _bug #1184: http://varnish-cache.org/trac/ticket/1184 +.. _bug #1224: http://varnish-cache.org/trac/ticket/1224 +.. _bug #1261: http://varnish-cache.org/trac/ticket/1261 +.. _bug #1275: http://varnish-cache.org/trac/ticket/1275 + + +varnishlog +---------- + +- If -m, but neither -b or -c is given, assume both. This filters out + a lot of noise when -m is used to filter. `Bug #1071` + +.. _bug #1071: http://varnish-cache.org/trac/ticket/1071 + +varnishadm +---------- + +- Improve tab completion and require libedit/readline to build. + +varnishtop +---------- + +- Reopen log file if Varnish is restarted. + +varnishncsa +----------- + +- Handle file descriptors above 64k (by ignoring them). Prevents a + crash in some cases with corrupted shared memory logs. +- Add %D and %T support for more timing information. + +Other +----- + +- Documentation updates. +- Fixes for OSX +- Disable PCRE JIT-er, since it's broken in some PCRE versions, at + least on i386. +- Make libvarnish prefer exact hits when looking for VSL tags. + =========================== Changes from 3.0.2 to 3.0.3 =========================== From lkarsten at varnish-cache.org Mon Jun 17 13:56:10 2013 From: lkarsten at varnish-cache.org (Lasse Karstensen) Date: Mon, 17 Jun 2013 15:56:10 +0200 Subject: [master] 51ea9e6 Document changes Message-ID: commit 51ea9e6a03afda446cb9639ac4ed334dd468f97b Author: Tollef Fog Heen Date: Thu Jun 13 10:51:18 2013 +0200 Document changes diff --git a/doc/changes.rst b/doc/changes.rst index 2342cd0..b16afc6 100644 --- a/doc/changes.rst +++ b/doc/changes.rst @@ -1,4 +1,20 @@ ================================ +Changes from 3.0.4 rc 1 to 3.0.4 +================================ + +varnishd +-------- + +- Set the waiter pipe as non-blocking and record overflows. `Bug + #1285` +- Fix up a bug in the ACL compile code that could lead to false + negatives. CVE-2013-4090. 
`Bug #1312` +- Return an error if the client sends multiple Host headers. + +.. _bug #1285: http://varnish-cache.org/trac/ticket/1285 +.. _bug #1312: http://varnish-cache.org/trac/ticket/1312 + +================================ Changes from 3.0.3 to 3.0.4 rc 1 ================================ From phk at varnish-cache.org Wed Jun 19 10:10:47 2013 From: phk at varnish-cache.org (Poul-Henning Kamp) Date: Wed, 19 Jun 2013 12:10:47 +0200 Subject: [master] bed0195 Make vcl_backend_fetch{ return(abandon); } work. Message-ID: commit bed0195729eed00b6ed964978c7a3317257d1db8 Author: Poul-Henning Kamp Date: Wed Jun 19 10:10:22 2013 +0000 Make vcl_backend_fetch{ return(abandon); } work. diff --git a/bin/varnishd/cache/cache_fetch.c b/bin/varnishd/cache/cache_fetch.c index 875ddb6..b51cabb 100644 --- a/bin/varnishd/cache/cache_fetch.c +++ b/bin/varnishd/cache/cache_fetch.c @@ -38,12 +38,22 @@ #include "hash/hash_slinger.h" -#include "cache_backend.h" -#include "vcli_priv.h" #include "vcl.h" #include "vtim.h" /*-------------------------------------------------------------------- + */ + +static void +vbf_release_req(struct req ***reqpp) +{ + if (*reqpp != NULL) { + **reqpp = NULL; + *reqpp = NULL; + } +} + +/*-------------------------------------------------------------------- * Copy req->bereq and run it by VCL::vcl_backend_fetch{} */ @@ -86,19 +96,22 @@ vbf_stp_mkbereq(struct worker *wrk, struct busyobj *bo, const struct req *req) http_PrintfHeader(bo->bereq, "X-Varnish: %u", bo->vsl->wid & VSL_IDENTMASK); /* XXX: Missing ABANDON */ + if (wrk->handling == VCL_RET_ABANDON) { + return (F_STP_ABANDON); + } return (F_STP_FETCHHDR); } + /*-------------------------------------------------------------------- */ static enum fetch_step -vbf_stp_fetchhdr(struct worker *wrk, struct busyobj *bo, struct req **reqp) +vbf_stp_fetchhdr(struct worker *wrk, struct busyobj *bo, struct req ***reqpp) { int i; CHECK_OBJ_NOTNULL(wrk, WORKER_MAGIC); - AN(reqp); - CHECK_OBJ_NOTNULL((*reqp), 
REQ_MAGIC); + AN(reqpp); CHECK_OBJ_NOTNULL(bo, BUSYOBJ_MAGIC); xxxassert (wrk->handling == VCL_RET_FETCH); @@ -106,9 +119,9 @@ vbf_stp_fetchhdr(struct worker *wrk, struct busyobj *bo, struct req **reqp) HTTP_Setup(bo->beresp, bo->ws, bo->vsl, HTTP_Beresp); if (!bo->do_pass) - *reqp = NULL; + vbf_release_req(reqpp); - i = V1F_fetch_hdr(wrk, bo, *reqp); + i = V1F_fetch_hdr(wrk, bo, *reqpp ? **reqpp : NULL); /* * If we recycle a backend connection, there is a finite chance * that the backend closed it before we get a request to it. @@ -116,11 +129,11 @@ vbf_stp_fetchhdr(struct worker *wrk, struct busyobj *bo, struct req **reqp) */ if (i == 1) { VSC_C_main->backend_retry++; - i = V1F_fetch_hdr(wrk, bo, *reqp); + i = V1F_fetch_hdr(wrk, bo, *reqpp ? **reqpp : NULL); } if (bo->do_pass) - *reqp = NULL; + vbf_release_req(reqpp); if (i) { AZ(bo->vbc); @@ -251,6 +264,7 @@ vbf_stp_fetch(struct worker *wrk, struct busyobj *bo) CHECK_OBJ_NOTNULL(bo, BUSYOBJ_MAGIC); assert(wrk->handling == VCL_RET_DELIVER); + #if 0 if (wrk->handling != VCL_RET_DELIVER) VDI_CloseFd(&bo->vbc); @@ -416,13 +430,15 @@ vbf_stp_fetch(struct worker *wrk, struct busyobj *bo) */ static enum fetch_step -vbf_stp_abandon(struct worker *wrk, struct busyobj *bo) +vbf_stp_abandon(struct worker *wrk, struct busyobj *bo, struct req ***reqp) { CHECK_OBJ_NOTNULL(wrk, WORKER_MAGIC); CHECK_OBJ_NOTNULL(bo, BUSYOBJ_MAGIC); + AN(reqp); bo->state = BOS_FAILED; VBO_DerefBusyObj(wrk, &bo); // XXX ? 
+ vbf_release_req(reqp); return (F_STP_DONE); } @@ -435,6 +451,9 @@ vbf_stp_notyet(void) WRONG("Patience, grashopper, patience..."); } +/*-------------------------------------------------------------------- + */ + static enum fetch_step vbf_stp_done(void) { @@ -451,6 +470,21 @@ struct vbf_secret_handshake { struct req **reqp; }; +static const char * +vbf_step_name(enum fetch_step stp) +{ + switch (stp) { +#define FETCH_STEP(l, U, arg) \ + case F_STP_##U: \ + return (#U); +#include "tbl/steps.h" +#undef FETCH_STEP + default: + return ("F-step ?"); + } +} + + static void vbf_fetch_thread(struct worker *wrk, void *priv) { @@ -473,6 +507,8 @@ vbf_fetch_thread(struct worker *wrk, void *priv) #define FETCH_STEP(l, U, arg) \ case F_STP_##U: \ bo->step = vbf_stp_##l arg; \ + VSLb(bo->vsl, SLT_Debug, \ + "%s -> %s", #l, vbf_step_name(bo->step)); \ break; #include "tbl/steps.h" #undef FETCH_STEP diff --git a/bin/varnishd/cache/cache_http1_fetch.c b/bin/varnishd/cache/cache_http1_fetch.c index 0e808b3..a6ed005 100644 --- a/bin/varnishd/cache/cache_http1_fetch.c +++ b/bin/varnishd/cache/cache_http1_fetch.c @@ -409,6 +409,7 @@ V1F_fetch_body(struct worker *wrk, struct busyobj *bo) else VDI_RecycleFd(&bo->vbc); } + AZ(bo->vbc); if (bo->state == BOS_FAILED) { wrk->stats.fetch_failed++; diff --git a/bin/varnishtest/tests/b00038.vtc b/bin/varnishtest/tests/b00038.vtc new file mode 100644 index 0000000..963dc48 --- /dev/null +++ b/bin/varnishtest/tests/b00038.vtc @@ -0,0 +1,18 @@ +varnishtest "vcl_backend_fetch abandon" + +server s1 { + rxreq + txresp +} -start + +varnish v1 -vcl+backend { + sub vcl_backend_fetch { + return (abandon); + } +} -start + +client c1 { + txreq + rxresp + expect resp.status == 503 +} -run diff --git a/include/tbl/steps.h b/include/tbl/steps.h index 5f9b3b2..b0589ff 100644 --- a/include/tbl/steps.h +++ b/include/tbl/steps.h @@ -51,9 +51,9 @@ REQ_STEP(error, ERROR, (wrk, req)) #ifdef FETCH_STEP FETCH_STEP(mkbereq, MKBEREQ, (wrk, bo, *reqp)) 
-FETCH_STEP(fetchhdr, FETCHHDR, (wrk, bo, reqp)) +FETCH_STEP(fetchhdr, FETCHHDR, (wrk, bo, &reqp)) FETCH_STEP(fetch, FETCH, (wrk, bo)) -FETCH_STEP(abandon, ABANDON, (wrk, bo)) +FETCH_STEP(abandon, ABANDON, (wrk, bo, &reqp)) FETCH_STEP(notyet, NOTYET, ()) FETCH_STEP(done, DONE, ()) #endif diff --git a/lib/libvcl/generate.py b/lib/libvcl/generate.py index c4a8a49..fd2573d 100755 --- a/lib/libvcl/generate.py +++ b/lib/libvcl/generate.py @@ -85,7 +85,7 @@ returns =( ('purge', "C", ('error', 'fetch',)), ('miss', "C", ('error', 'restart', 'pass', 'fetch',)), ('lookup', "C", ('error', 'restart', 'pass', 'deliver',)), - ('backend_fetch', "B", ('fetch', 'error')), + ('backend_fetch', "B", ('fetch', 'abandon')), ('backend_response', "B", ('deliver', 'restart', 'error')), ('deliver', "C", ('restart', 'deliver',)), ('error', "C", ('restart', 'deliver',)), From phk at varnish-cache.org Wed Jun 19 12:57:05 2013 From: phk at varnish-cache.org (Poul-Henning Kamp) Date: Wed, 19 Jun 2013 14:57:05 +0200 Subject: [master] 5b6c26c Silence a GCC warning Message-ID: commit 5b6c26c9f928b2acc1a42bd246ac8b16b91df70a Author: Poul-Henning Kamp Date: Wed Jun 19 12:56:51 2013 +0000 Silence a GCC warning diff --git a/bin/varnishd/cache/cache_fetch.c b/bin/varnishd/cache/cache_fetch.c index b51cabb..6bba23e 100644 --- a/bin/varnishd/cache/cache_fetch.c +++ b/bin/varnishd/cache/cache_fetch.c @@ -449,6 +449,7 @@ static enum fetch_step vbf_stp_notyet(void) { WRONG("Patience, grashopper, patience..."); + NEEDLESS_RETURN(0); } /*-------------------------------------------------------------------- @@ -458,6 +459,7 @@ static enum fetch_step vbf_stp_done(void) { WRONG("Just plain wrong"); + NEEDLESS_RETURN(0); } /*-------------------------------------------------------------------- From phk at varnish-cache.org Tue Jun 25 08:45:56 2013 From: phk at varnish-cache.org (Poul-Henning Kamp) Date: Tue, 25 Jun 2013 10:45:56 +0200 Subject: [master] cd5857e Polish Message-ID: commit 
cd5857e26a6146a8558905763e3f203f2d37785a Author: Poul-Henning Kamp Date: Tue Jun 25 08:45:43 2013 +0000 Polish diff --git a/bin/varnishd/cache/cache_fetch.c b/bin/varnishd/cache/cache_fetch.c index 6bba23e..a557c23 100644 --- a/bin/varnishd/cache/cache_fetch.c +++ b/bin/varnishd/cache/cache_fetch.c @@ -95,10 +95,9 @@ vbf_stp_mkbereq(struct worker *wrk, struct busyobj *bo, const struct req *req) http_PrintfHeader(bo->bereq, "X-Varnish: %u", bo->vsl->wid & VSL_IDENTMASK); - /* XXX: Missing ABANDON */ - if (wrk->handling == VCL_RET_ABANDON) { + if (wrk->handling == VCL_RET_ABANDON) return (F_STP_ABANDON); - } + assert (wrk->handling == VCL_RET_FETCH); return (F_STP_FETCHHDR); } @@ -182,6 +181,31 @@ vbf_stp_fetchhdr(struct worker *wrk, struct busyobj *bo, struct req ***reqpp) VCL_backend_response_method(bo->vcl, wrk, NULL, bo, bo->beresp->ws); bo->do_pass |= i; + if (wrk->handling == VCL_RET_DELIVER) + return (F_STP_FETCH); + + return (F_STP_NOTYET); +} + +/*-------------------------------------------------------------------- + */ + +static enum fetch_step +vbf_stp_fetch(struct worker *wrk, struct busyobj *bo) +{ + struct http *hp, *hp2; + char *b; + uint16_t nhttp; + unsigned l; + struct vsb *vary = NULL; + int varyl = 0; + struct object *obj; + + + CHECK_OBJ_NOTNULL(wrk, WORKER_MAGIC); + CHECK_OBJ_NOTNULL(bo, BUSYOBJ_MAGIC); + + assert(wrk->handling == VCL_RET_DELIVER); if (bo->do_pass) bo->fetch_objcore->flags |= OC_F_PASS; @@ -239,31 +263,6 @@ vbf_stp_fetchhdr(struct worker *wrk, struct busyobj *bo, struct req ***reqpp) else if (bo->is_gzip) bo->vfp = &vfp_testgzip; - if (wrk->handling == VCL_RET_DELIVER) - return (F_STP_FETCH); - - return (F_STP_NOTYET); -} - -/*-------------------------------------------------------------------- - */ - -static enum fetch_step -vbf_stp_fetch(struct worker *wrk, struct busyobj *bo) -{ - struct http *hp, *hp2; - char *b; - uint16_t nhttp; - unsigned l; - struct vsb *vary = NULL; - int varyl = 0; - struct object *obj; - - - 
CHECK_OBJ_NOTNULL(wrk, WORKER_MAGIC); - CHECK_OBJ_NOTNULL(bo, BUSYOBJ_MAGIC); - - assert(wrk->handling == VCL_RET_DELIVER); #if 0 if (wrk->handling != VCL_RET_DELIVER) @@ -449,7 +448,7 @@ static enum fetch_step vbf_stp_notyet(void) { WRONG("Patience, grashopper, patience..."); - NEEDLESS_RETURN(0); + NEEDLESS_RETURN(F_STP_NOTYET); } /*-------------------------------------------------------------------- @@ -459,7 +458,7 @@ static enum fetch_step vbf_stp_done(void) { WRONG("Just plain wrong"); - NEEDLESS_RETURN(0); + NEEDLESS_RETURN(F_STP_NOTYET); } /*-------------------------------------------------------------------- From martin at varnish-cache.org Tue Jun 25 10:59:00 2013 From: martin at varnish-cache.org (Martin Blix Grydeland) Date: Tue, 25 Jun 2013 12:59:00 +0200 Subject: [master] 1f308d3 Format Batch VSL records in varnishtest VSL output, to make it clear what type of log records they are. Message-ID: commit 1f308d3b128acfd77b90fa60481e2266010c44c9 Author: Martin Blix Grydeland Date: Tue Jun 25 12:58:17 2013 +0200 Format Batch VSL records in varnishtest VSL output, to make it clear what type of log records they are. diff --git a/bin/varnishtest/vtc_varnish.c b/bin/varnishtest/vtc_varnish.c index 55d2747..d303943 100644 --- a/bin/varnishtest/vtc_varnish.c +++ b/bin/varnishtest/vtc_varnish.c @@ -186,7 +186,7 @@ varnishlog_thread(void *priv) enum VSL_tag_e tag; uint32_t vxid; unsigned len; - const char *data; + const char *tagname, *data; int type, i; CAST_OBJ_NOTNULL(v, priv, VARNISH_MAGIC); @@ -229,13 +229,19 @@ varnishlog_thread(void *priv) tag = VSL_TAG(c->rec.ptr); vxid = VSL_ID(c->rec.ptr); - len = VSL_LEN(c->rec.ptr); + if (tag == SLT__Batch) { + tagname = "Batch"; + len = 0; + } else { + tagname = VSL_tags[tag]; + len = VSL_LEN(c->rec.ptr); + } type = VSL_CLIENT(c->rec.ptr) ? 'c' : VSL_BACKEND(c->rec.ptr) ? 
'b' : '-'; data = VSL_CDATA(c->rec.ptr); v->vsl_tag_count[tag]++; - vtc_log(v->vl, 4, "vsl| %10u %-15s %c %.*s", vxid, - VSL_tags[tag], type, (int)len, data); + vtc_log(v->vl, 4, "vsl| %10u %-15s %c %.*s", vxid, tagname, + type, (int)len, data); } if (c) From phk at varnish-cache.org Tue Jun 25 12:09:51 2013 From: phk at varnish-cache.org (Poul-Henning Kamp) Date: Tue, 25 Jun 2013 14:09:51 +0200 Subject: [master] 08fad22 Add bereq.retries variable Message-ID: commit 08fad2250aad122ead575f0fecbf5fd60c453a92 Author: Poul-Henning Kamp Date: Tue Jun 25 12:09:41 2013 +0000 Add bereq.retries variable diff --git a/bin/varnishd/cache/cache.h b/bin/varnishd/cache/cache.h index c0de28c..fb67dee 100644 --- a/bin/varnishd/cache/cache.h +++ b/bin/varnishd/cache/cache.h @@ -507,6 +507,7 @@ struct busyobj { * All fields from refcount and down are zeroed when the busyobj * is recycled. */ + int retries; unsigned refcount; double t_fetch; uint16_t err_code; diff --git a/bin/varnishd/cache/cache_vrt_var.c b/bin/varnishd/cache/cache_vrt_var.c index 7e9f133..0491b30 100644 --- a/bin/varnishd/cache/cache_vrt_var.c +++ b/bin/varnishd/cache/cache_vrt_var.c @@ -383,6 +383,15 @@ VRT_r_req_restarts(const struct vrt_ctx *ctx) return (ctx->req->restarts); } +long +VRT_r_bereq_retries(const struct vrt_ctx *ctx) +{ + + CHECK_OBJ_NOTNULL(ctx, VRT_CTX_MAGIC); + CHECK_OBJ_NOTNULL(ctx->bo, BUSYOBJ_MAGIC); + return (ctx->bo->retries); +} + /*-------------------------------------------------------------------- * NB: TTL is relative to when object was created, whereas grace and * keep are relative to ttl. 
diff --git a/lib/libvcl/generate.py b/lib/libvcl/generate.py index fd2573d..d358351 100755 --- a/lib/libvcl/generate.py +++ b/lib/libvcl/generate.py @@ -217,6 +217,11 @@ sp_variables = ( ( 'recv',), ( 'recv',), ), + ('bereq.retries', + 'INT', + ( 'backend',), + ( ), + ), ('bereq.backend', 'BACKEND', ( 'backend', ), From phk at varnish-cache.org Tue Jun 25 13:09:23 2013 From: phk at varnish-cache.org (Poul-Henning Kamp) Date: Tue, 25 Jun 2013 15:09:23 +0200 Subject: [master] df2bced Add the pristine copy of the bereq, which we will need for retrying fetches. Message-ID: commit df2bced191f7aa6f4434299dcaac10a3271bfeed Author: Poul-Henning Kamp Date: Tue Jun 25 13:08:47 2013 +0000 Add the pristine copy of the bereq, which we will need for retrying fetches. diff --git a/bin/varnishd/cache/cache.h b/bin/varnishd/cache/cache.h index fb67dee..f08feeb 100644 --- a/bin/varnishd/cache/cache.h +++ b/bin/varnishd/cache/cache.h @@ -523,6 +523,7 @@ struct busyobj { struct ws ws[1]; struct vbc *vbc; + struct http *bereq0; struct http *bereq; struct http *beresp; struct objcore *fetch_objcore; diff --git a/bin/varnishd/cache/cache_busyobj.c b/bin/varnishd/cache/cache_busyobj.c index 47e61b3..a23c791 100644 --- a/bin/varnishd/cache/cache_busyobj.c +++ b/bin/varnishd/cache/cache_busyobj.c @@ -116,6 +116,11 @@ VBO_GetBusyObj(struct worker *wrk, struct req *req) nhttp = (uint16_t)cache_param->http_max_hdr; sz = HTTP_estimate(nhttp); + bo->bereq0 = HTTP_create(p, nhttp); + p += sz; + p = (void*)PRNDUP(p); + assert(p < bo->end); + bo->bereq = HTTP_create(p, nhttp); p += sz; p = (void*)PRNDUP(p); diff --git a/bin/varnishd/cache/cache_fetch.c b/bin/varnishd/cache/cache_fetch.c index a557c23..5df6096 100644 --- a/bin/varnishd/cache/cache_fetch.c +++ b/bin/varnishd/cache/cache_fetch.c @@ -60,7 +60,6 @@ vbf_release_req(struct req ***reqpp) static enum fetch_step vbf_stp_mkbereq(struct worker *wrk, struct busyobj *bo, const struct req *req) { - int i; CHECK_OBJ_NOTNULL(wrk, WORKER_MAGIC); 
CHECK_OBJ_NOTNULL(req, REQ_MAGIC); @@ -71,23 +70,46 @@ vbf_stp_mkbereq(struct worker *wrk, struct busyobj *bo, const struct req *req) AZ(bo->should_close); AZ(bo->storage_hint); - HTTP_Setup(bo->bereq, bo->ws, bo->vsl, HTTP_Bereq); - http_FilterReq(bo->bereq, req->http, + HTTP_Setup(bo->bereq0, bo->ws, bo->vsl, HTTP_Bereq); + http_FilterReq(bo->bereq0, req->http, bo->do_pass ? HTTPH_R_PASS : HTTPH_R_FETCH); if (!bo->do_pass) { // XXX: Forcing GET should happen in vcl_miss{} ? - http_ForceGet(bo->bereq); + http_ForceGet(bo->bereq0); if (cache_param->http_gzip_support) { /* * We always ask the backend for gzip, even if the * client doesn't grok it. We will uncompress for * the minority of clients which don't. */ - http_Unset(bo->bereq, H_Accept_Encoding); - http_SetHeader(bo->bereq, "Accept-Encoding: gzip"); + http_Unset(bo->bereq0, H_Accept_Encoding); + http_SetHeader(bo->bereq0, "Accept-Encoding: gzip"); } } + return (F_STP_STARTFETCH); +} + +/*-------------------------------------------------------------------- + * Copy req->bereq and run it by VCL::vcl_backend_fetch{} + */ + +static enum fetch_step +vbf_stp_startfetch(struct worker *wrk, struct busyobj *bo) +{ + int i; + + CHECK_OBJ_NOTNULL(wrk, WORKER_MAGIC); + CHECK_OBJ_NOTNULL(bo, BUSYOBJ_MAGIC); + + AN(bo->director); + AZ(bo->vbc); + AZ(bo->should_close); + AZ(bo->storage_hint); + + HTTP_Setup(bo->bereq, bo->ws, bo->vsl, HTTP_Bereq); + HTTP_Copy(bo->bereq, bo->bereq0); + // Don't let VCL reset do_pass i = bo->do_pass; VCL_backend_fetch_method(bo->vcl, wrk, NULL, bo, bo->bereq->ws); diff --git a/bin/varnishtest/tests/r01038.vtc b/bin/varnishtest/tests/r01038.vtc index ea04142..bb57072 100644 --- a/bin/varnishtest/tests/r01038.vtc +++ b/bin/varnishtest/tests/r01038.vtc @@ -45,7 +45,7 @@ server s1 { txresp -body "foo8" } -start -varnish v1 -arg "-p workspace_backend=8k" -vcl+backend { +varnish v1 -arg "-p workspace_backend=9k" -vcl+backend { sub vcl_backend_response { set beresp.do_esi = true; } diff --git 
a/include/tbl/steps.h b/include/tbl/steps.h index b0589ff..9767191 100644 --- a/include/tbl/steps.h +++ b/include/tbl/steps.h @@ -51,6 +51,7 @@ REQ_STEP(error, ERROR, (wrk, req)) #ifdef FETCH_STEP FETCH_STEP(mkbereq, MKBEREQ, (wrk, bo, *reqp)) +FETCH_STEP(startfetch, STARTFETCH, (wrk, bo)) FETCH_STEP(fetchhdr, FETCHHDR, (wrk, bo, &reqp)) FETCH_STEP(fetch, FETCH, (wrk, bo)) FETCH_STEP(abandon, ABANDON, (wrk, bo, &reqp)) From martin at varnish-cache.org Wed Jun 26 12:45:26 2013 From: martin at varnish-cache.org (Martin Blix Grydeland) Date: Wed, 26 Jun 2013 14:45:26 +0200 Subject: [master] 76ee005 Emit VSL warnings as synth log records while scanning the shmlog Message-ID: commit 76ee00515bb6b31c3f648aef03785914fe001475 Author: Martin Blix Grydeland Date: Wed Jun 26 14:42:23 2013 +0200 Emit VSL warnings as synth log records while scanning the shmlog diff --git a/include/tbl/vsl_tags.h b/include/tbl/vsl_tags.h index 25ba677..827a231 100644 --- a/include/tbl/vsl_tags.h +++ b/include/tbl/vsl_tags.h @@ -184,3 +184,8 @@ SLTM(Begin, "Marks the start of a VXID", SLTM(End, "Marks the end of a VXID", "The last record of a VXID transaction.\n" ) + +SLTM(VSL, "Internally generated VSL API warnings and error message", + "Warnings and error messages genererated by the VSL API while reading the" + " shared memory log" +) diff --git a/lib/libvarnishapi/vsl_dispatch.c b/lib/libvarnishapi/vsl_dispatch.c index 79fb02a..b74c1d45 100644 --- a/lib/libvarnishapi/vsl_dispatch.c +++ b/lib/libvarnishapi/vsl_dispatch.c @@ -135,6 +135,10 @@ struct VSLQ { unsigned n_cache; }; +static int vtx_diag(struct VSLQ *vslq, struct vtx *vtx, const char *fmt, ...); +static int vtx_diag_tag(struct VSLQ *vslq, struct vtx *vtx, const uint32_t *ptr, + const char *reason); + static inline int vtx_keycmp(const struct vtx_key *a, const struct vtx_key *b) { @@ -149,29 +153,6 @@ VRB_PROTOTYPE(vtx_tree, vtx_key, entry, vtx_keycmp); VRB_GENERATE(vtx_tree, vtx_key, entry, vtx_keycmp); static int -vtx_diag(struct vtx 
*vtx, const char *fmt, ...) -{ - va_list ap; - - /* XXX: Prepend diagnostic message on vtx as a synthetic log - record. For now print to stderr */ - fprintf(stderr, "vtx_diag <%u>: ", vtx->key.vxid); - va_start(ap, fmt); - vfprintf(stderr, fmt, ap); - va_end(ap); - fprintf(stderr, "\n"); - - return (-1); -} - -static int -vtx_diag_tag(struct vtx *vtx, const uint32_t *ptr, const char *reason) -{ - return (vtx_diag(vtx, "%s (%s: %.*s)", reason, VSL_tags[VSL_TAG(ptr)], - (int)VSL_LEN(ptr), VSL_CDATA(ptr))); -} - -static int vslc_raw_next(void *cursor) { struct vslc_raw *c; @@ -401,6 +382,7 @@ vtx_set_bufsize(struct vtx *vtx, ssize_t len) while (vtx->bufsize < len) vtx->bufsize *= 2; vtx->buf = realloc(vtx->buf, sizeof (uint32_t) * vtx->bufsize); + AN(vtx->buf); } static void @@ -471,7 +453,7 @@ vtx_check_ready(struct VSLQ *vslq, struct vtx *vtx) AZ(vtx->flags & VTX_F_READY); if (vtx->type == VSL_t_unknown) - vtx_diag(vtx, "vtx of unknown type marked complete"); + vtx_diag(vslq, vtx, "vtx of unknown type marked complete"); ready = vtx; while (1) { @@ -555,16 +537,16 @@ vtx_scan_begintag(struct VSLQ *vslq, struct vtx *vtx, const uint32_t *ptr) assert(VSL_TAG(ptr) == SLT_Begin); if (vtx->flags & VTX_F_READY) - return (vtx_diag_tag(vtx, ptr, "link too late")); + return (vtx_diag_tag(vslq, vtx, ptr, "link too late")); i = vtx_parsetag_bl(VSL_CDATA(ptr), VSL_LEN(ptr), &type, &p_vxid); if (i < 1) - return (vtx_diag_tag(vtx, ptr, "parse error")); + return (vtx_diag_tag(vslq, vtx, ptr, "parse error")); /* Check/set vtx type */ assert(type != VSL_t_unknown); if (vtx->type != VSL_t_unknown && vtx->type != type) - return (vtx_diag_tag(vtx, ptr, "type mismatch")); + return (vtx_diag_tag(vslq, vtx, ptr, "type mismatch")); vtx->type = type; if (i == 1 || p_vxid == 0) @@ -583,9 +565,9 @@ vtx_scan_begintag(struct VSLQ *vslq, struct vtx *vtx, const uint32_t *ptr) return (0); if (vtx->parent != NULL) - return (vtx_diag_tag(vtx, ptr, "duplicate link")); + return (vtx_diag_tag(vslq, vtx, 
ptr, "duplicate link")); if (p_vtx->flags & VTX_F_READY) - return (vtx_diag_tag(vtx, ptr, "link too late")); + return (vtx_diag_tag(vslq, vtx, ptr, "link too late")); vtx_set_parent(p_vtx, vtx); @@ -603,11 +585,11 @@ vtx_scan_linktag(struct VSLQ *vslq, struct vtx *vtx, const uint32_t *ptr) assert(VSL_TAG(ptr) == SLT_Link); if (vtx->flags & VTX_F_READY) - return (vtx_diag_tag(vtx, ptr, "link too late")); + return (vtx_diag_tag(vslq, vtx, ptr, "link too late")); i = vtx_parsetag_bl(VSL_CDATA(ptr), VSL_LEN(ptr), &c_type, &c_vxid); if (i < 2) - return (vtx_diag_tag(vtx, ptr, "parse error")); + return (vtx_diag_tag(vslq, vtx, ptr, "parse error")); assert(i == 2); if (vslq->grouping == VSL_g_vxid) @@ -622,11 +604,11 @@ vtx_scan_linktag(struct VSLQ *vslq, struct vtx *vtx, const uint32_t *ptr) /* Link already exists */ return (0); if (c_vtx->parent != NULL) - return (vtx_diag_tag(vtx, ptr, "duplicate link")); + return (vtx_diag_tag(vslq, vtx, ptr, "duplicate link")); if (c_vtx->flags & VTX_F_READY) - return (vtx_diag_tag(vtx, ptr, "link too late")); + return (vtx_diag_tag(vslq, vtx, ptr, "link too late")); if (c_vtx->type != VSL_t_unknown && c_vtx->type != c_type) - return (vtx_diag_tag(vtx, ptr, "type mismatch")); + return (vtx_diag_tag(vslq, vtx, ptr, "type mismatch")); c_vtx->type = c_type; vtx_set_parent(vtx, c_vtx); @@ -648,13 +630,17 @@ vtx_scan(struct VSLQ *vslq, struct vtx *vtx) if (tag == SLT__Batch) continue; + if (tag == SLT_VSL) + /* Don't process these to avoid looping */ + continue; + if (vtx->flags & VTX_F_COMPLETE) { - vtx_diag_tag(vtx, ptr, "late log rec"); + vtx_diag_tag(vslq, vtx, ptr, "late log rec"); continue; } if (vtx->type == VSL_t_unknown && tag != SLT_Begin) - vtx_diag_tag(vtx, ptr, "early log rec"); + vtx_diag_tag(vslq, vtx, ptr, "early log rec"); switch (tag) { case SLT_Begin: @@ -688,7 +674,7 @@ vtx_force(struct VSLQ *vslq, struct vtx *vtx, const char *reason) { AZ(vtx->flags & VTX_F_COMPLETE); AZ(vtx->flags & VTX_F_READY); - vtx_diag(vtx, 
reason); + vtx_diag(vslq, vtx, reason); VTAILQ_REMOVE(&vslq->incomplete, vtx, list_incomplete); vtx->flags |= VTX_F_COMPLETE; @@ -756,6 +742,38 @@ vslq_callback(struct VSLQ *vslq, struct vtx *vtx, VSLQ_dispatch_f *func, return ((func)(vslq->vsl, ptrans, priv)); } +static int +vtx_diag(struct VSLQ *vslq, struct vtx *vtx, const char *fmt, ...) +{ + va_list ap; + uint32_t buf[256]; + int i, len; + struct VSLC_ptr rec; + + len = sizeof buf - 2 * sizeof (uint32_t); + va_start(ap, fmt); + i = vsnprintf((char *)&buf[2], len, fmt, ap); + assert(i >= 0); + va_end(ap); + if (i < len) + len = i; + buf[1] = vtx->key.vxid; + buf[0] = ((((unsigned)SLT_VSL & 0xff) << 24) | len); + rec.ptr = buf; + rec.priv = 0; + vtx_append(vslq, vtx, &rec, VSL_NEXT(rec.ptr) - rec.ptr, 1); + + return (-1); +} + +static int +vtx_diag_tag(struct VSLQ *vslq, struct vtx *vtx, const uint32_t *ptr, + const char *reason) +{ + return (vtx_diag(vslq, vtx, "%s (%s: %.*s)", reason, + VSL_tags[VSL_TAG(ptr)], (int)VSL_LEN(ptr), VSL_CDATA(ptr))); +} + struct VSLQ * VSLQ_New(struct VSL_data *vsl, struct VSL_cursor **cp, enum VSL_grouping_e grouping, const char *querystring) @@ -904,7 +922,7 @@ VSLQ_Dispatch(struct VSLQ *vslq, VSLQ_dispatch_f *func, void *priv) i = VSL_Next(c); if (i != 1) - return (i); + break; tag = VSL_TAG(c->rec.ptr); if (tag == SLT__Batch) { vxid = VSL_BATCHID(c->rec.ptr); @@ -928,7 +946,7 @@ VSLQ_Dispatch(struct VSLQ *vslq, VSLQ_dispatch_f *func, void *priv) vtx_retire(vslq, &vtx); AZ(vtx); if (i) - return (i); + break; } } if (i) From phk at varnish-cache.org Thu Jun 27 10:15:30 2013 From: phk at varnish-cache.org (Poul-Henning Kamp) Date: Thu, 27 Jun 2013 12:15:30 +0200 Subject: [master] a28ad1e Ignore "Batch" VSL records, they clutter the output when using "debug +syncvsl" Message-ID: commit a28ad1e04889f26c913736821ef464056ea8813a Author: Poul-Henning Kamp Date: Thu Jun 27 10:14:44 2013 +0000 Ignore "Batch" VSL records, they clutter the output when using "debug +syncvsl" diff --git 
a/bin/varnishtest/vtc_varnish.c b/bin/varnishtest/vtc_varnish.c index d303943..21bd04b 100644 --- a/bin/varnishtest/vtc_varnish.c +++ b/bin/varnishtest/vtc_varnish.c @@ -232,6 +232,7 @@ varnishlog_thread(void *priv) if (tag == SLT__Batch) { tagname = "Batch"; len = 0; + continue; } else { tagname = VSL_tags[tag]; len = VSL_LEN(c->rec.ptr); From phk at varnish-cache.org Thu Jun 27 10:16:10 2013 From: phk at varnish-cache.org (Poul-Henning Kamp) Date: Thu, 27 Jun 2013 12:16:10 +0200 Subject: [master] 15ee0c7 Implement backend_fetch_response( return (retry); ) Message-ID: commit 15ee0c7cf73365cc675f0fa332a1acd3450153d8 Author: Poul-Henning Kamp Date: Thu Jun 27 10:15:51 2013 +0000 Implement backend_fetch_response( return (retry); ) diff --git a/bin/varnishd/cache/cache_fetch.c b/bin/varnishd/cache/cache_fetch.c index 5df6096..fe12452 100644 --- a/bin/varnishd/cache/cache_fetch.c +++ b/bin/varnishd/cache/cache_fetch.c @@ -58,7 +58,8 @@ vbf_release_req(struct req ***reqpp) */ static enum fetch_step -vbf_stp_mkbereq(struct worker *wrk, struct busyobj *bo, const struct req *req) +vbf_stp_mkbereq(const struct worker *wrk, struct busyobj *bo, + const struct req *req) { CHECK_OBJ_NOTNULL(wrk, WORKER_MAGIC); @@ -205,6 +206,14 @@ vbf_stp_fetchhdr(struct worker *wrk, struct busyobj *bo, struct req ***reqpp) if (wrk->handling == VCL_RET_DELIVER) return (F_STP_FETCH); + if (wrk->handling == VCL_RET_RETRY) { + bo->retries++; + if (bo->retries <= cache_param->max_retries) { + VDI_CloseFd(&bo->vbc); + return (F_STP_STARTFETCH); + } + // XXX: wrk->handling = VCL_RET_SYNTH; + } return (F_STP_NOTYET); } diff --git a/bin/varnishd/common/params.h b/bin/varnishd/common/params.h index ebeff0f..c00d427 100644 --- a/bin/varnishd/common/params.h +++ b/bin/varnishd/common/params.h @@ -143,6 +143,9 @@ struct params { /* Maximum restarts allowed */ unsigned max_restarts; + /* Maximum backend retriesallowed */ + unsigned max_retries; + /* Maximum esi:include depth allowed */ unsigned 
max_esi_depth; diff --git a/bin/varnishd/mgt/mgt_param_tbl.c b/bin/varnishd/mgt/mgt_param_tbl.c index 01534dc..82ed98f 100644 --- a/bin/varnishd/mgt/mgt_param_tbl.c +++ b/bin/varnishd/mgt/mgt_param_tbl.c @@ -319,6 +319,10 @@ const struct parspec mgt_parspec[] = { "the backend, so don't increase thoughtlessly.\n", 0, "4", "restarts" }, + { "max_retries", tweak_uint, &mgt_param.max_retries, 0, UINT_MAX, + "Upper limit on how many times a backend fetch can retry.\n", + 0, + "4", "retries" }, { "esi_syntax", tweak_uint, &mgt_param.esi_syntax, 0, UINT_MAX, "Bitmap controlling ESI parsing code:\n" diff --git a/bin/varnishtest/tests/c00056.vtc b/bin/varnishtest/tests/c00056.vtc new file mode 100644 index 0000000..694e6c8 --- /dev/null +++ b/bin/varnishtest/tests/c00056.vtc @@ -0,0 +1,50 @@ +varnishtest "vcl_backend_response{} retry" + +server s1 { + rxreq + txresp -hdr "foo: 1" + accept + rxreq + txresp -hdr "foo: 2" +} -start + +varnish v1 -vcl+backend { + sub vcl_recv { return (pass); } + sub vcl_backend_response { + set beresp.http.bar = bereq.retries; + if (beresp.http.foo != bereq.http.stop) { + return (retry); + } + } +} -start + +varnish v1 -cliok "param.set debug +syncvsl" + +client c1 { + txreq -hdr "stop: 2" + rxresp + expect resp.http.foo == 2 +} -run + +delay .1 + +server s1 { + rxreq + txresp -hdr "foo: 1" + accept + rxreq + txresp -hdr "foo: 2" + accept + rxreq + txresp -hdr "foo: 3" +} -start + +varnish v1 -cliok "param.set max_retries 2" + +client c1 { + txreq -hdr "stop: 3" + rxresp + expect resp.http.foo == 3 +} -run + +# XXX: Add a test which exceeds max_retries and gets 503 back diff --git a/bin/varnishtest/tests/v00017.vtc b/bin/varnishtest/tests/v00017.vtc index b8553e6..ba0c8e4 100644 --- a/bin/varnishtest/tests/v00017.vtc +++ b/bin/varnishtest/tests/v00017.vtc @@ -67,7 +67,7 @@ varnish v1 -vcl { varnish v1 -vcl { backend b { .host = "127.0.0.1"; } sub vcl_recv { if (client.ip == "127.0.0.1") { return(pass); } } - sub vcl_backend_response { if 
(client.ip != "127.0.0.1") { return(restart); } } + sub vcl_backend_response { if (client.ip != "127.0.0.1") { return(retry); } } } varnish v1 -errvcl {Operator > not possible on IP} { diff --git a/lib/libvcl/generate.py b/lib/libvcl/generate.py index d358351..ef3f109 100755 --- a/lib/libvcl/generate.py +++ b/lib/libvcl/generate.py @@ -86,7 +86,7 @@ returns =( ('miss', "C", ('error', 'restart', 'pass', 'fetch',)), ('lookup', "C", ('error', 'restart', 'pass', 'deliver',)), ('backend_fetch', "B", ('fetch', 'abandon')), - ('backend_response', "B", ('deliver', 'restart', 'error')), + ('backend_response', "B", ('deliver', 'retry', 'abandon')), ('deliver', "C", ('restart', 'deliver',)), ('error', "C", ('restart', 'deliver',)), ('init', "", ('ok',)), From phk at varnish-cache.org Sat Jun 29 11:12:10 2013 From: phk at varnish-cache.org (Poul-Henning Kamp) Date: Sat, 29 Jun 2013 13:12:10 +0200 Subject: [master] 6b717fb Make __match_proto__() macro globally visible Message-ID: commit 6b717fb150b3343df96b8ca5a67e7912b3db0d26 Author: Poul-Henning Kamp Date: Sat Jun 29 11:09:37 2013 +0000 Make __match_proto__() macro globally visible diff --git a/bin/varnishd/common/common.h b/bin/varnishd/common/common.h index 978b54d..43af1c3 100644 --- a/bin/varnishd/common/common.h +++ b/bin/varnishd/common/common.h @@ -57,13 +57,6 @@ struct cli; */ /* - * In OO-light situations, functions have to match their prototype - * even if that means not const'ing a const'able argument. - * The typedef should be specified as argument to the macro. - */ -#define __match_proto__(xxx) /*lint -e{818} */ - -/* * State variables may change value before we use the last value we * set them to. * Pass no argument. 
diff --git a/include/vdef.h b/include/vdef.h index 9beedb0..89694b2 100644 --- a/include/vdef.h +++ b/include/vdef.h @@ -67,4 +67,16 @@ # endif #endif +/********************************************************************** + * FlexeLint and compiler shutuppery + */ + +/* + * In OO-light situations, functions have to match their prototype + * even if that means not const'ing a const'able argument. + * The typedef should be specified as argument to the macro. + */ +#define __match_proto__(xxx) /*lint -e{818} */ + + #endif /* VDEF_H_INCLUDED */ From phk at varnish-cache.org Sat Jun 29 11:12:10 2013 From: phk at varnish-cache.org (Poul-Henning Kamp) Date: Sat, 29 Jun 2013 13:12:10 +0200 Subject: [master] 7dda86f Add a NULL pointer check Message-ID: commit 7dda86f17380e3ae88adc4b5c2f6f3d56c8ed018 Author: Poul-Henning Kamp Date: Sat Jun 29 11:08:55 2013 +0000 Add a NULL pointer check diff --git a/lib/libvarnishapi/vsl_query.c b/lib/libvarnishapi/vsl_query.c index 20efff4..9685486 100644 --- a/lib/libvarnishapi/vsl_query.c +++ b/lib/libvarnishapi/vsl_query.c @@ -66,7 +66,8 @@ vslq_newquery(struct VSL_data *vsl, enum VSL_grouping_e grouping, } ALLOC_OBJ(query, VSLQ_QUERY_MAGIC); - query->regex = regex; + if (query != NULL) + query->regex = regex; return (query); } From phk at varnish-cache.org Sat Jun 29 11:12:10 2013 From: phk at varnish-cache.org (Poul-Henning Kamp) Date: Sat, 29 Jun 2013 13:12:10 +0200 Subject: [master] f233678 Add an assert for the benefit of static analysers Message-ID: commit f2336782f6f8a51ff21d61e02e9827b212fc0108 Author: Poul-Henning Kamp Date: Sat Jun 29 11:10:09 2013 +0000 Add an assert for the benefit of static analysers diff --git a/include/vtree.h b/include/vtree.h index 7692d12..84af261 100644 --- a/include/vtree.h +++ b/include/vtree.h @@ -584,8 +584,10 @@ name##_VRB_REMOVE(struct name *head, struct type *elm) \ } else \ VRB_ROOT(head) = child; \ color: \ - if (color == VRB_BLACK) \ + if (color == VRB_BLACK) { \ + AN(parent); \ 
name##_VRB_REMOVE_COLOR(head, parent, child); \ + } \ return (old); \ } \ \ From phk at varnish-cache.org Sat Jun 29 11:12:10 2013 From: phk at varnish-cache.org (Poul-Henning Kamp) Date: Sat, 29 Jun 2013 13:12:10 +0200 Subject: [master] c6b4d18 Make the records static in the VSLQ_dispatch_f() api Message-ID: commit c6b4d18128aa6b79b1e6c3cba9df7c4c127a3d90 Author: Poul-Henning Kamp Date: Sat Jun 29 11:11:31 2013 +0000 Make the records static in the VSLQ_dispatch_f() api Use the and mark __match_proto__() the typedef where applicable. diff --git a/bin/varnishtest/vtc_logexp.c b/bin/varnishtest/vtc_logexp.c index fc58cae..d90fa1e 100644 --- a/bin/varnishtest/vtc_logexp.c +++ b/bin/varnishtest/vtc_logexp.c @@ -57,7 +57,6 @@ #include "vapi/vsl.h" #include "vtim.h" #include "vqueue.h" -#include "miniobj.h" #include "vas.h" #include "vre.h" @@ -176,8 +175,9 @@ logexp_next(struct logexp *le) vtc_log(le->vl, 3, "tst| %s", VSB_data(le->test->str)); } -static int -logexp_dispatch(struct VSL_data *vsl, struct VSL_transaction *pt[], void *priv) +static int __match_proto__(VSLQ_dispatch_f) +logexp_dispatch(struct VSL_data *vsl, struct VSL_transaction * const pt[], + void *priv) { struct logexp *le; struct VSL_transaction *t; @@ -275,7 +275,7 @@ logexp_thread(void *priv) AZ(le->test); logexp_next(le); while (le->test) { - i = VSLQ_Dispatch(le->vslq, &logexp_dispatch, le); + i = VSLQ_Dispatch(le->vslq, logexp_dispatch, le); if (i < 0) vtc_log(le->vl, 0, "dispatch: %d", i); if (i == 0 && le->test) diff --git a/include/vapi/vsl.h b/include/vapi/vsl.h index 30507c0..138e920 100644 --- a/include/vapi/vsl.h +++ b/include/vapi/vsl.h @@ -84,6 +84,21 @@ enum VSL_grouping_e { VSL_g_session, }; +typedef int VSLQ_dispatch_f(struct VSL_data *vsl, + struct VSL_transaction * const trans[], void *priv); + /* + * The callback function type for use with VSLQ_Dispatch. + * + * Arguments: + * vsl: The VSL_data context + * trans[]: A NULL terminated array of pointers to VSL_transaction. 
+ * priv: The priv argument from VSL_Dispatch + * + * Return value: + * 0: OK - continue + * !=0: Makes VSLQ_Dispatch return with this return value immediatly + */ + extern const char *VSL_tags[256]; /* * Tag to string array. Contains NULL for invalid tags. @@ -260,8 +275,7 @@ int VSL_PrintAll(struct VSL_data *vsl, struct VSL_cursor *c, void *fo); * !=0: Return value from either VSL_Next or VSL_Print */ -int VSL_PrintTransactions(struct VSL_data *vsl, - struct VSL_transaction *ptrans[], void *fo); +VSLQ_dispatch_f VSL_PrintTransactions; /* * Prints out each transaction in the array ptrans. For * transactions of level > 0 it will print a header before the log @@ -316,8 +330,7 @@ int VSL_WriteAll(struct VSL_data *vsl, struct VSL_cursor *c, void *fo); * !=0: Return value from either VSL_Next or VSL_Write */ -int VSL_WriteTransactions(struct VSL_data *vsl, - struct VSL_transaction *ptrans[], void *fo); +VSLQ_dispatch_f VSL_WriteTransactions; /* * Write all transactions in ptrans using VSL_WriteAll * Return values: @@ -347,21 +360,6 @@ void VSLQ_Delete(struct VSLQ **pvslq); * Delete the query pointed to by pvslq, freeing up the resources */ -typedef int VSLQ_dispatch_f(struct VSL_data *vsl, - struct VSL_transaction *trans[], void *priv); - /* - * The callback function type for use with VSLQ_Dispatch. - * - * Arguments: - * vsl: The VSL_data context - * trans[]: A NULL terminated array of pointers to VSL_transaction. 
- * priv: The priv argument from VSL_Dispatch - * - * Return value: - * 0: OK - continue - * !=0: Makes VSLQ_Dispatch return with this return value immediatly - */ - int VSLQ_Dispatch(struct VSLQ *vslq, VSLQ_dispatch_f *func, void *priv); /* * Process log and call func for each set matching the specified diff --git a/lib/libvarnishapi/vsl.c b/lib/libvarnishapi/vsl.c index f263862..501e19a 100644 --- a/lib/libvarnishapi/vsl.c +++ b/lib/libvarnishapi/vsl.c @@ -307,8 +307,8 @@ VSL_PrintAll(struct VSL_data *vsl, struct VSL_cursor *c, void *fo) } } -int -VSL_PrintTransactions(struct VSL_data *vsl, struct VSL_transaction *pt[], +int __match_proto__(VSLQ_dispatch_f) +VSL_PrintTransactions(struct VSL_data *vsl, struct VSL_transaction * const pt[], void *fo) { struct VSL_transaction *t; @@ -425,8 +425,8 @@ VSL_WriteAll(struct VSL_data *vsl, struct VSL_cursor *c, void *fo) } } -int -VSL_WriteTransactions(struct VSL_data *vsl, struct VSL_transaction *pt[], +int __match_proto__(VSLQ_dispatch_f) +VSL_WriteTransactions(struct VSL_data *vsl, struct VSL_transaction * const pt[], void *fo) { struct VSL_transaction *t; diff --git a/lib/libvarnishapi/vsl_cursor.c b/lib/libvarnishapi/vsl_cursor.c index 443a176..ca1bce7 100644 --- a/lib/libvarnishapi/vsl_cursor.c +++ b/lib/libvarnishapi/vsl_cursor.c @@ -270,6 +270,7 @@ VSL_CursorVSM(struct VSL_data *vsl, struct VSM_data *vsm, int tail) } else AZ(vslc_vsm_reset(&c->c)); + /* XXX: How does 'c' ever get freed ? 
*/ return (&c->c.c); } From phk at varnish-cache.org Sat Jun 29 12:53:02 2013 From: phk at varnish-cache.org (Poul-Henning Kamp) Date: Sat, 29 Jun 2013 14:53:02 +0200 Subject: [master] 184c26b white space Message-ID: commit 184c26baa3b57d6e0a1e8647d2996ede93b5ab53 Author: Poul-Henning Kamp Date: Sat Jun 29 12:47:16 2013 +0000 white space diff --git a/include/vtree.h b/include/vtree.h index 84af261..da9d579 100644 --- a/include/vtree.h +++ b/include/vtree.h @@ -62,7 +62,7 @@ struct name { \ struct type *sph_root; /* root of the tree */ \ } -#define VSPLAY_INITIALIZER(root) \ +#define VSPLAY_INITIALIZER(root) \ { NULL } #define VSPLAY_INIT(root) do { \ @@ -82,13 +82,13 @@ struct { \ /* VSPLAY_ROTATE_{LEFT,RIGHT} expect that tmp hold VSPLAY_{RIGHT,LEFT} */ #define VSPLAY_ROTATE_RIGHT(head, tmp, field) do { \ - VSPLAY_LEFT((head)->sph_root, field) = VSPLAY_RIGHT(tmp, field); \ + VSPLAY_LEFT((head)->sph_root, field) = VSPLAY_RIGHT(tmp, field);\ VSPLAY_RIGHT(tmp, field) = (head)->sph_root; \ (head)->sph_root = tmp; \ } while (/*CONSTCOND*/ 0) #define VSPLAY_ROTATE_LEFT(head, tmp, field) do { \ - VSPLAY_RIGHT((head)->sph_root, field) = VSPLAY_LEFT(tmp, field); \ + VSPLAY_RIGHT((head)->sph_root, field) = VSPLAY_LEFT(tmp, field);\ VSPLAY_LEFT(tmp, field) = (head)->sph_root; \ (head)->sph_root = tmp; \ } while (/*CONSTCOND*/ 0) @@ -96,7 +96,7 @@ struct { \ #define VSPLAY_LINKLEFT(head, tmp, field) do { \ VSPLAY_LEFT(tmp, field) = (head)->sph_root; \ tmp = (head)->sph_root; \ - (head)->sph_root = VSPLAY_LEFT((head)->sph_root, field); \ + (head)->sph_root = VSPLAY_LEFT((head)->sph_root, field); \ } while (/*CONSTCOND*/ 0) #define VSPLAY_LINKRIGHT(head, tmp, field) do { \ @@ -106,19 +106,19 @@ struct { \ } while (/*CONSTCOND*/ 0) #define VSPLAY_ASSEMBLE(head, node, left, right, field) do { \ - VSPLAY_RIGHT(left, field) = VSPLAY_LEFT((head)->sph_root, field); \ + VSPLAY_RIGHT(left, field) = VSPLAY_LEFT((head)->sph_root, field);\ VSPLAY_LEFT(right, field) = 
VSPLAY_RIGHT((head)->sph_root, field);\ - VSPLAY_LEFT((head)->sph_root, field) = VSPLAY_RIGHT(node, field); \ - VSPLAY_RIGHT((head)->sph_root, field) = VSPLAY_LEFT(node, field); \ + VSPLAY_LEFT((head)->sph_root, field) = VSPLAY_RIGHT(node, field);\ + VSPLAY_RIGHT((head)->sph_root, field) = VSPLAY_LEFT(node, field);\ } while (/*CONSTCOND*/ 0) /* Generates prototypes and inline functions */ -#define VSPLAY_PROTOTYPE(name, type, field, cmp) \ +#define VSPLAY_PROTOTYPE(name, type, field, cmp) \ void name##_VSPLAY(struct name *, struct type *); \ void name##_VSPLAY_MINMAX(struct name *, int); \ -struct type *name##_VSPLAY_INSERT(struct name *, struct type *); \ -struct type *name##_VSPLAY_REMOVE(struct name *, struct type *); \ +struct type *name##_VSPLAY_INSERT(struct name *, struct type *); \ +struct type *name##_VSPLAY_REMOVE(struct name *, struct type *); \ \ /* Finds the node with the same key as elm */ \ static __inline struct type * \ @@ -149,7 +149,7 @@ name##_VSPLAY_NEXT(struct name *head, struct type *elm) \ static __inline struct type * \ name##_VSPLAY_MIN_MAX(struct name *head, int val) \ { \ - name##_VSPLAY_MINMAX(head, val); \ + name##_VSPLAY_MINMAX(head, val); \ return (VSPLAY_ROOT(head)); \ } @@ -168,8 +168,8 @@ name##_VSPLAY_INSERT(struct name *head, struct type *elm) \ __comp = (cmp)(elm, (head)->sph_root); \ if(__comp < 0) { \ VSPLAY_LEFT(elm, field) = VSPLAY_LEFT((head)->sph_root, field);\ - VSPLAY_RIGHT(elm, field) = (head)->sph_root; \ - VSPLAY_LEFT((head)->sph_root, field) = NULL; \ + VSPLAY_RIGHT(elm, field) = (head)->sph_root; \ + VSPLAY_LEFT((head)->sph_root, field) = NULL; \ } else if (__comp > 0) { \ VSPLAY_RIGHT(elm, field) = VSPLAY_RIGHT((head)->sph_root, field);\ VSPLAY_LEFT(elm, field) = (head)->sph_root; \ @@ -217,7 +217,7 @@ name##_VSPLAY(struct name *head, struct type *elm) \ if (__tmp == NULL) \ break; \ if ((cmp)(elm, __tmp) < 0){ \ - VSPLAY_ROTATE_RIGHT(head, __tmp, field); \ + VSPLAY_ROTATE_RIGHT(head, __tmp, field);\ if 
(VSPLAY_LEFT((head)->sph_root, field) == NULL)\ break; \ } \ @@ -253,7 +253,7 @@ void name##_VSPLAY_MINMAX(struct name *head, int __comp) \ if (__tmp == NULL) \ break; \ if (__comp < 0){ \ - VSPLAY_ROTATE_RIGHT(head, __tmp, field); \ + VSPLAY_ROTATE_RIGHT(head, __tmp, field);\ if (VSPLAY_LEFT((head)->sph_root, field) == NULL)\ break; \ } \ @@ -320,15 +320,15 @@ struct { \ #define VRB_ROOT(head) (head)->rbh_root #define VRB_EMPTY(head) (VRB_ROOT(head) == NULL) -#define VRB_SET(elm, parent, field) do { \ - VRB_PARENT(elm, field) = parent; \ +#define VRB_SET(elm, parent, field) do { \ + VRB_PARENT(elm, field) = parent; \ VRB_LEFT(elm, field) = VRB_RIGHT(elm, field) = NULL; \ - VRB_COLOR(elm, field) = VRB_RED; \ + VRB_COLOR(elm, field) = VRB_RED; \ } while (/*CONSTCOND*/ 0) -#define VRB_SET_BLACKRED(black, red, field) do { \ +#define VRB_SET_BLACKRED(black, red, field) do { \ VRB_COLOR(black, field) = VRB_BLACK; \ - VRB_COLOR(red, field) = VRB_RED; \ + VRB_COLOR(red, field) = VRB_RED; \ } while (/*CONSTCOND*/ 0) #ifndef VRB_AUGMENT @@ -338,14 +338,14 @@ struct { \ #define VRB_ROTATE_LEFT(head, elm, tmp, field) do { \ (tmp) = VRB_RIGHT(elm, field); \ if ((VRB_RIGHT(elm, field) = VRB_LEFT(tmp, field)) != NULL) { \ - VRB_PARENT(VRB_LEFT(tmp, field), field) = (elm); \ + VRB_PARENT(VRB_LEFT(tmp, field), field) = (elm); \ } \ VRB_AUGMENT(elm); \ - if ((VRB_PARENT(tmp, field) = VRB_PARENT(elm, field)) != NULL) { \ + if ((VRB_PARENT(tmp, field) = VRB_PARENT(elm, field)) != NULL) {\ if ((elm) == VRB_LEFT(VRB_PARENT(elm, field), field)) \ - VRB_LEFT(VRB_PARENT(elm, field), field) = (tmp); \ + VRB_LEFT(VRB_PARENT(elm, field), field) = (tmp);\ else \ - VRB_RIGHT(VRB_PARENT(elm, field), field) = (tmp); \ + VRB_RIGHT(VRB_PARENT(elm, field), field) = (tmp);\ } else \ (head)->rbh_root = (tmp); \ VRB_LEFT(tmp, field) = (elm); \ @@ -358,14 +358,14 @@ struct { \ #define VRB_ROTATE_RIGHT(head, elm, tmp, field) do { \ (tmp) = VRB_LEFT(elm, field); \ if ((VRB_LEFT(elm, field) = 
VRB_RIGHT(tmp, field)) != NULL) { \ - VRB_PARENT(VRB_RIGHT(tmp, field), field) = (elm); \ + VRB_PARENT(VRB_RIGHT(tmp, field), field) = (elm); \ } \ VRB_AUGMENT(elm); \ - if ((VRB_PARENT(tmp, field) = VRB_PARENT(elm, field)) != NULL) { \ + if ((VRB_PARENT(tmp, field) = VRB_PARENT(elm, field)) != NULL) {\ if ((elm) == VRB_LEFT(VRB_PARENT(elm, field), field)) \ - VRB_LEFT(VRB_PARENT(elm, field), field) = (tmp); \ + VRB_LEFT(VRB_PARENT(elm, field), field) = (tmp);\ else \ - VRB_RIGHT(VRB_PARENT(elm, field), field) = (tmp); \ + VRB_RIGHT(VRB_PARENT(elm, field), field) = (tmp);\ } else \ (head)->rbh_root = (tmp); \ VRB_RIGHT(tmp, field) = (elm); \ @@ -381,15 +381,15 @@ struct { \ #define VRB_PROTOTYPE_STATIC(name, type, field, cmp) \ VRB_PROTOTYPE_INTERNAL(name, type, field, cmp, __unused static) #define VRB_PROTOTYPE_INTERNAL(name, type, field, cmp, attr) \ -attr void name##_VRB_INSERT_COLOR(struct name *, struct type *); \ +attr void name##_VRB_INSERT_COLOR(struct name *, struct type *); \ attr void name##_VRB_REMOVE_COLOR(struct name *, struct type *, struct type *);\ attr struct type *name##_VRB_REMOVE(struct name *, struct type *); \ attr struct type *name##_VRB_INSERT(struct name *, struct type *); \ -attr struct type *name##_VRB_FIND(struct name *, struct type *); \ +attr struct type *name##_VRB_FIND(struct name *, struct type *); \ attr struct type *name##_VRB_NFIND(struct name *, struct type *); \ attr struct type *name##_VRB_NEXT(struct type *); \ attr struct type *name##_VRB_PREV(struct type *); \ -attr struct type *name##_VRB_MINMAX(struct name *, int); \ +attr struct type *name##_VRB_MINMAX(struct name *, int); \ \ /* Main rb operation. 
@@ -408,7 +408,7 @@ name##_VRB_INSERT_COLOR(struct name *head, struct type *elm) \ VRB_COLOR(parent, field) == VRB_RED) { \ gparent = VRB_PARENT(parent, field); \ if (parent == VRB_LEFT(gparent, field)) { \ - tmp = VRB_RIGHT(gparent, field); \ + tmp = VRB_RIGHT(gparent, field); \ if (tmp && VRB_COLOR(tmp, field) == VRB_RED) { \ VRB_COLOR(tmp, field) = VRB_BLACK; \ VRB_SET_BLACKRED(parent, gparent, field);\ @@ -461,9 +461,9 @@ name##_VRB_REMOVE_COLOR(struct name *head, struct type *parent, struct type *elm VRB_COLOR(VRB_LEFT(tmp, field), field) == VRB_BLACK) &&\ (VRB_RIGHT(tmp, field) == NULL || \ VRB_COLOR(VRB_RIGHT(tmp, field), field) == VRB_BLACK)) {\ - VRB_COLOR(tmp, field) = VRB_RED; \ + VRB_COLOR(tmp, field) = VRB_RED; \ elm = parent; \ - parent = VRB_PARENT(elm, field); \ + parent = VRB_PARENT(elm, field); \ } else { \ if (VRB_RIGHT(tmp, field) == NULL || \ VRB_COLOR(VRB_RIGHT(tmp, field), field) == VRB_BLACK) {\ @@ -471,7 +471,7 @@ name##_VRB_REMOVE_COLOR(struct name *head, struct type *parent, struct type *elm if ((oleft = VRB_LEFT(tmp, field)) \ != NULL) \ VRB_COLOR(oleft, field) = VRB_BLACK;\ - VRB_COLOR(tmp, field) = VRB_RED; \ + VRB_COLOR(tmp, field) = VRB_RED;\ VRB_ROTATE_RIGHT(head, tmp, oleft, field);\ tmp = VRB_RIGHT(parent, field); \ } \ @@ -494,9 +494,9 @@ name##_VRB_REMOVE_COLOR(struct name *head, struct type *parent, struct type *elm VRB_COLOR(VRB_LEFT(tmp, field), field) == VRB_BLACK) &&\ (VRB_RIGHT(tmp, field) == NULL || \ VRB_COLOR(VRB_RIGHT(tmp, field), field) == VRB_BLACK)) {\ - VRB_COLOR(tmp, field) = VRB_RED; \ + VRB_COLOR(tmp, field) = VRB_RED; \ elm = parent; \ - parent = VRB_PARENT(elm, field); \ + parent = VRB_PARENT(elm, field); \ } else { \ if (VRB_LEFT(tmp, field) == NULL || \ VRB_COLOR(VRB_LEFT(tmp, field), field) == VRB_BLACK) {\ @@ -504,7 +504,7 @@ name##_VRB_REMOVE_COLOR(struct name *head, struct type *parent, struct type *elm if ((oright = VRB_RIGHT(tmp, field)) \ != NULL) \ VRB_COLOR(oright, field) = VRB_BLACK;\ - 
VRB_COLOR(tmp, field) = VRB_RED; \ + VRB_COLOR(tmp, field) = VRB_RED;\ VRB_ROTATE_LEFT(head, tmp, oright, field);\ tmp = VRB_LEFT(parent, field); \ } \ @@ -537,13 +537,13 @@ name##_VRB_REMOVE(struct name *head, struct type *elm) \ while ((left = VRB_LEFT(elm, field)) != NULL) \ elm = left; \ child = VRB_RIGHT(elm, field); \ - parent = VRB_PARENT(elm, field); \ + parent = VRB_PARENT(elm, field); \ color = VRB_COLOR(elm, field); \ if (child) \ VRB_PARENT(child, field) = parent; \ if (parent) { \ if (VRB_LEFT(parent, field) == elm) \ - VRB_LEFT(parent, field) = child; \ + VRB_LEFT(parent, field) = child; \ else \ VRB_RIGHT(parent, field) = child; \ VRB_AUGMENT(parent); \ @@ -571,13 +571,13 @@ name##_VRB_REMOVE(struct name *head, struct type *elm) \ } \ goto color; \ } \ - parent = VRB_PARENT(elm, field); \ + parent = VRB_PARENT(elm, field); \ color = VRB_COLOR(elm, field); \ if (child) \ VRB_PARENT(child, field) = parent; \ if (parent) { \ if (VRB_LEFT(parent, field) == elm) \ - VRB_LEFT(parent, field) = child; \ + VRB_LEFT(parent, field) = child; \ else \ VRB_RIGHT(parent, field) = child; \ VRB_AUGMENT(parent); \ @@ -674,7 +674,7 @@ name##_VRB_NEXT(struct type *elm) \ (elm == VRB_LEFT(VRB_PARENT(elm, field), field))) \ elm = VRB_PARENT(elm, field); \ else { \ - while (VRB_PARENT(elm, field) && \ + while (VRB_PARENT(elm, field) && \ (elm == VRB_RIGHT(VRB_PARENT(elm, field), field)))\ elm = VRB_PARENT(elm, field); \ elm = VRB_PARENT(elm, field); \ @@ -696,7 +696,7 @@ name##_VRB_PREV(struct type *elm) \ (elm == VRB_RIGHT(VRB_PARENT(elm, field), field))) \ elm = VRB_PARENT(elm, field); \ else { \ - while (VRB_PARENT(elm, field) && \ + while (VRB_PARENT(elm, field) && \ (elm == VRB_LEFT(VRB_PARENT(elm, field), field)))\ elm = VRB_PARENT(elm, field); \ elm = VRB_PARENT(elm, field); \ From phk at varnish-cache.org Sat Jun 29 12:53:02 2013 From: phk at varnish-cache.org (Poul-Henning Kamp) Date: Sat, 29 Jun 2013 14:53:02 +0200 Subject: [master] c65d4c5 Use the static version 
of VRB Message-ID: commit c65d4c5ca0f708ebdc1a2dd2893f899832151607 Author: Poul-Henning Kamp Date: Sat Jun 29 12:51:16 2013 +0000 Use the static version of VRB diff --git a/lib/libvarnishapi/vsl_dispatch.c b/lib/libvarnishapi/vsl_dispatch.c index b74c1d45..51b91af 100644 --- a/lib/libvarnishapi/vsl_dispatch.c +++ b/lib/libvarnishapi/vsl_dispatch.c @@ -149,8 +149,8 @@ vtx_keycmp(const struct vtx_key *a, const struct vtx_key *b) return (0); } -VRB_PROTOTYPE(vtx_tree, vtx_key, entry, vtx_keycmp); -VRB_GENERATE(vtx_tree, vtx_key, entry, vtx_keycmp); +VRB_PROTOTYPE_STATIC(vtx_tree, vtx_key, entry, vtx_keycmp); +VRB_GENERATE_STATIC(vtx_tree, vtx_key, entry, vtx_keycmp); static int vslc_raw_next(void *cursor) From phk at varnish-cache.org Sat Jun 29 12:53:02 2013 From: phk at varnish-cache.org (Poul-Henning Kamp) Date: Sat, 29 Jun 2013 14:53:02 +0200 Subject: [master] e7f6d8f Quieten FlexeLint a bit Message-ID: commit e7f6d8fbe6df6922a477b5e7823a20bdec2d09d1 Author: Poul-Henning Kamp Date: Sat Jun 29 12:52:43 2013 +0000 Quieten FlexeLint a bit diff --git a/bin/varnishtest/flint.lnt b/bin/varnishtest/flint.lnt index 9ff3c68..3b39af7 100644 --- a/bin/varnishtest/flint.lnt +++ b/bin/varnishtest/flint.lnt @@ -26,4 +26,5 @@ -e574 // Signed-unsigned mix with relational -e835 // A zero has been given as ___ argument to operator '___' (<<) -e786 // String concatenation within initializer +-e788 // enum value not used in defaulted switch diff --git a/include/vtree.h b/include/vtree.h index da9d579..3d72da9 100644 --- a/include/vtree.h +++ b/include/vtree.h @@ -381,6 +381,7 @@ struct { \ #define VRB_PROTOTYPE_STATIC(name, type, field, cmp) \ VRB_PROTOTYPE_INTERNAL(name, type, field, cmp, __unused static) #define VRB_PROTOTYPE_INTERNAL(name, type, field, cmp, attr) \ +/*lint -esym(528, name##_VRB_*) */ \ attr void name##_VRB_INSERT_COLOR(struct name *, struct type *); \ attr void name##_VRB_REMOVE_COLOR(struct name *, struct type *, struct type *);\ attr struct type 
*name##_VRB_REMOVE(struct name *, struct type *); \ From phk at varnish-cache.org Sat Jun 29 12:59:05 2013 From: phk at varnish-cache.org (Poul-Henning Kamp) Date: Sat, 29 Jun 2013 14:59:05 +0200 Subject: [master] f811077 Remove unused #define Message-ID: commit f8110779bfd2acb9555418e595ce52be57259fad Author: Poul-Henning Kamp Date: Sat Jun 29 12:58:57 2013 +0000 Remove unused #define diff --git a/bin/varnishtest/vtc_logexp.c b/bin/varnishtest/vtc_logexp.c index d90fa1e..2d6698d 100644 --- a/bin/varnishtest/vtc_logexp.c +++ b/bin/varnishtest/vtc_logexp.c @@ -103,8 +103,6 @@ struct logexp { pthread_t tp; }; -#define VSL_SLEEP_USEC (50*1000) - static VTAILQ_HEAD(, logexp) logexps = VTAILQ_HEAD_INITIALIZER(logexps); diff --git a/bin/varnishtest/vtc_varnish.c b/bin/varnishtest/vtc_varnish.c index 21bd04b..2c81be1 100644 --- a/bin/varnishtest/vtc_varnish.c +++ b/bin/varnishtest/vtc_varnish.c @@ -78,8 +78,6 @@ struct varnish { #define NONSENSE "%XJEIFLH|)Xspa8P" -#define VSL_SLEEP_USEC (50*1000) - static VTAILQ_HEAD(, varnish) varnishes = VTAILQ_HEAD_INITIALIZER(varnishes); From phk at varnish-cache.org Sat Jun 29 13:38:24 2013 From: phk at varnish-cache.org (Poul-Henning Kamp) Date: Sat, 29 Jun 2013 15:38:24 +0200 Subject: [master] 9850cbf Add lint comments around tables Message-ID: commit 9850cbf457d192269657514e992788c6c2262e9f Author: Poul-Henning Kamp Date: Sat Jun 29 13:36:14 2013 +0000 Add lint comments around tables diff --git a/include/tbl/vsc_levels.h b/include/tbl/vsc_levels.h index 9471a9a..7e235ca 100644 --- a/include/tbl/vsc_levels.h +++ b/include/tbl/vsc_levels.h @@ -35,9 +35,11 @@ * d - Description: Long description of this counter type */ +/*lint -save -e525 -e539 */ VSC_LEVEL_F(info, "INFO", "Informational counters", "Counters giving runtime information") VSC_LEVEL_F(diag, "DIAG", "Diagnostic counters", "Counters giving diagnostic information") VSC_LEVEL_F(debug, "DEBUG", "Debug counters", "Counters giving Varnish internals debug information") +/*lint 
-restore */ diff --git a/include/tbl/vsc_types.h b/include/tbl/vsc_types.h index 57c631a..9c9d215 100644 --- a/include/tbl/vsc_types.h +++ b/include/tbl/vsc_types.h @@ -40,7 +40,7 @@ * display order in varnishstat. */ - +/*lint -save -e525 -e539 */ VSC_TYPE_F(main, "MAIN", "", "Child", "Child process main counters" ) @@ -62,3 +62,4 @@ VSC_TYPE_F(vbe, "VBE", "VBE", "Backend", VSC_TYPE_F(lck, "LCK", "LCK", "Lock", "Mutex lock counters" ) +/*lint -restore */ From phk at varnish-cache.org Sat Jun 29 13:38:24 2013 From: phk at varnish-cache.org (Poul-Henning Kamp) Date: Sat, 29 Jun 2013 15:38:24 +0200 Subject: [master] 7d17351 Remove unused #includes, const- and statize things. Message-ID: commit 7d17351a4fc50721df0716037305a5f8ee7de394 Author: Poul-Henning Kamp Date: Sat Jun 29 13:38:05 2013 +0000 Remove unused #includes, const- and statize things. diff --git a/lib/libvarnishapi/vsl_arg.c b/lib/libvarnishapi/vsl_arg.c index 22af6a4..c38d63d 100644 --- a/lib/libvarnishapi/vsl_arg.c +++ b/lib/libvarnishapi/vsl_arg.c @@ -43,7 +43,6 @@ #include "miniobj.h" #include "vas.h" -#include "vdef.h" #include "vapi/vsl.h" #include "vapi/vsm.h" @@ -83,7 +82,7 @@ VSL_Name2Tag(const char *name, int l) return (n); } -static const char *vsl_grouping[] = { +static const char * const vsl_grouping[] = { [VSL_g_raw] = "raw", [VSL_g_vxid] = "vxid", [VSL_g_request] = "request", diff --git a/lib/libvarnishapi/vsl_cursor.c b/lib/libvarnishapi/vsl_cursor.c index ca1bce7..7e72e07 100644 --- a/lib/libvarnishapi/vsl_cursor.c +++ b/lib/libvarnishapi/vsl_cursor.c @@ -40,7 +40,6 @@ #include #include "vas.h" -#include "vdef.h" #include "miniobj.h" #include "vapi/vsm.h" #include "vsm_api.h" @@ -212,7 +211,7 @@ vslc_vsm_skip(void *cursor, ssize_t words) return (0); } -static struct vslc_tbl vslc_vsm_tbl = { +static const struct vslc_tbl vslc_vsm_tbl = { .delete = vslc_vsm_delete, .next = vslc_vsm_next, .reset = vslc_vsm_reset, @@ -361,7 +360,7 @@ vslc_file_reset(void *cursor) return (-1); } -static 
struct vslc_tbl vslc_file_tbl = { +static const struct vslc_tbl vslc_file_tbl = { .delete = vslc_file_delete, .next = vslc_file_next, .reset = vslc_file_reset, diff --git a/lib/libvarnishapi/vsm.c b/lib/libvarnishapi/vsm.c index d7252fc..608443c 100644 --- a/lib/libvarnishapi/vsm.c +++ b/lib/libvarnishapi/vsm.c @@ -44,7 +44,6 @@ #include "miniobj.h" #include "vas.h" -#include "vdef.h" #include "vapi/vsm.h" #include "vapi/vsm_int.h" From phk at varnish-cache.org Sat Jun 29 13:43:59 2013 From: phk at varnish-cache.org (Poul-Henning Kamp) Date: Sat, 29 Jun 2013 15:43:59 +0200 Subject: [master] 4dfedb0 Remove unused detrius. Constify Message-ID: commit 4dfedb014b98b31fb6bde0ccfe310c33ed5b04a7 Author: Poul-Henning Kamp Date: Sat Jun 29 13:40:53 2013 +0000 Remove unused detrius. Constify diff --git a/lib/libvarnishapi/vsl_api.h b/lib/libvarnishapi/vsl_api.h index 989effd..e5a7506 100644 --- a/lib/libvarnishapi/vsl_api.h +++ b/lib/libvarnishapi/vsl_api.h @@ -36,11 +36,7 @@ #define VSL_FILE_ID "VSL" -struct vslc_shmptr { - uint32_t *ptr; - unsigned priv; -}; - +/*lint -esym(534, vsl_diag) */ int vsl_diag(struct VSL_data *vsl, const char *fmt, ...) 
__printflike(2, 3); int vsl_skip(struct VSL_cursor *c, ssize_t words); @@ -49,7 +45,6 @@ typedef void vslc_delete_f(void *); typedef int vslc_next_f(void *); typedef int vslc_reset_f(void *); typedef int vslc_skip_f(void *, ssize_t words); -typedef int vslc_ref_f(void *, struct vslc_shmptr *ptr); typedef int vslc_check_f(const void *, const struct VSLC_ptr *ptr); struct vslc_tbl { @@ -105,4 +100,4 @@ struct vslq_query; struct vslq_query *vslq_newquery(struct VSL_data *vsl, enum VSL_grouping_e grouping, const char *query); void vslq_deletequery(struct vslq_query **pquery); -int vslq_runquery(struct vslq_query *query, struct VSL_transaction *ptrans[]); +int vslq_runquery(const struct vslq_query *query, struct VSL_transaction * const ptrans[]); diff --git a/lib/libvarnishapi/vsl_query.c b/lib/libvarnishapi/vsl_query.c index 9685486..436d7c9 100644 --- a/lib/libvarnishapi/vsl_query.c +++ b/lib/libvarnishapi/vsl_query.c @@ -89,7 +89,7 @@ vslq_deletequery(struct vslq_query **pquery) } int -vslq_runquery(struct vslq_query *query, struct VSL_transaction *ptrans[]) +vslq_runquery(const struct vslq_query *query, struct VSL_transaction * const ptrans[]) { struct VSL_transaction *t; struct VSL_cursor *c; From phk at varnish-cache.org Sat Jun 29 13:43:59 2013 From: phk at varnish-cache.org (Poul-Henning Kamp) Date: Sat, 29 Jun 2013 15:43:59 +0200 Subject: [master] 7803f56 Remove unused field Message-ID: commit 7803f568a988f58dcf06cc4dd151b79029785a7f Author: Poul-Henning Kamp Date: Sat Jun 29 13:41:18 2013 +0000 Remove unused field diff --git a/bin/varnishtest/vtc_logexp.c b/bin/varnishtest/vtc_logexp.c index 2d6698d..a5cbf86 100644 --- a/bin/varnishtest/vtc_logexp.c +++ b/bin/varnishtest/vtc_logexp.c @@ -85,7 +85,6 @@ struct logexp { char *name; struct vtclog *vl; char run; - char *spec; VTAILQ_HEAD(,logexp_test) tests; struct logexp_test *test; From phk at varnish-cache.org Sat Jun 29 13:43:59 2013 From: phk at varnish-cache.org (Poul-Henning Kamp) Date: Sat, 29 Jun 2013 
15:43:59 +0200 Subject: [master] 3c01e85 Accept noncheck of strncpy() return value Message-ID: commit 3c01e858b5250b04cd31a6daa7a1165b1bbaf47c Author: Poul-Henning Kamp Date: Sat Jun 29 13:41:30 2013 +0000 Accept noncheck of strncpy() return value diff --git a/bin/flint.lnt b/bin/flint.lnt index 1f99a4b..53f3238 100644 --- a/bin/flint.lnt +++ b/bin/flint.lnt @@ -91,6 +91,7 @@ -esym(534, strcat) -esym(534, strcpy) -esym(534, strlcpy) +-esym(534, strncpy) +typename(844) -etype(844, struct pthread *) From phk at varnish-cache.org Sat Jun 29 13:43:59 2013 From: phk at varnish-cache.org (Poul-Henning Kamp) Date: Sat, 29 Jun 2013 15:43:59 +0200 Subject: [master] 82e529c General polishing + constification Message-ID: commit 82e529c41c67066507754bb749f7d7db97faf0f7 Author: Poul-Henning Kamp Date: Sat Jun 29 13:42:43 2013 +0000 General polishing + constification diff --git a/lib/libvarnishapi/vsl_dispatch.c b/lib/libvarnishapi/vsl_dispatch.c index 51b91af..592e4b8 100644 --- a/lib/libvarnishapi/vsl_dispatch.c +++ b/lib/libvarnishapi/vsl_dispatch.c @@ -135,7 +135,9 @@ struct VSLQ { unsigned n_cache; }; +/*lint -esym(534, vtx_diag) */ static int vtx_diag(struct VSLQ *vslq, struct vtx *vtx, const char *fmt, ...); +/*lint -esym(534, vtx_diag_tag) */ static int vtx_diag_tag(struct VSLQ *vslq, struct vtx *vtx, const uint32_t *ptr, const char *reason); @@ -149,8 +151,8 @@ vtx_keycmp(const struct vtx_key *a, const struct vtx_key *b) return (0); } -VRB_PROTOTYPE_STATIC(vtx_tree, vtx_key, entry, vtx_keycmp); -VRB_GENERATE_STATIC(vtx_tree, vtx_key, entry, vtx_keycmp); +VRB_PROTOTYPE_STATIC(vtx_tree, vtx_key, entry, vtx_keycmp) +VRB_GENERATE_STATIC(vtx_tree, vtx_key, entry, vtx_keycmp) static int vslc_raw_next(void *cursor) @@ -184,7 +186,7 @@ vslc_raw_reset(void *cursor) return (0); } -static struct vslc_tbl vslc_raw_tbl = { +static const struct vslc_tbl vslc_raw_tbl = { .delete = NULL, .next = vslc_raw_next, .reset = vslc_raw_reset, @@ -242,7 +244,7 @@ vslc_vtx_reset(void *cursor) 
return (0); } -static struct vslc_tbl vslc_vtx_tbl = { +static const struct vslc_tbl vslc_vtx_tbl = { .delete = NULL, .next = vslc_vtx_next, .reset = vslc_vtx_reset, @@ -281,7 +283,7 @@ vtx_new(struct VSLQ *vslq) vtx->n_descend = 0; (void)vslc_vtx_reset(&vtx->c); vtx->len = 0; - memset(&vtx->chunk, 0, sizeof vtx->chunk); + memset(vtx->chunk, 0, sizeof vtx->chunk); vtx->n_chunk = 0; VTAILQ_INSERT_TAIL(&vslq->incomplete, vtx, list_incomplete); @@ -401,7 +403,7 @@ vtx_buffer(struct VSLQ *vslq, struct vtx *vtx) memcpy(vtx->buf + vtx->chunk[i].offset, vtx->chunk[i].start.ptr, sizeof (uint32_t) * vtx->chunk[i].len); - memset(&vtx->chunk, 0, sizeof vtx->chunk); + memset(vtx->chunk, 0, sizeof vtx->chunk); VTAILQ_REMOVE(&vslq->shmlist, vtx, list_shm); vtx->n_chunk = 0; } @@ -685,7 +687,7 @@ vtx_force(struct VSLQ *vslq, struct vtx *vtx, const char *reason) } static int -vslq_callback(struct VSLQ *vslq, struct vtx *vtx, VSLQ_dispatch_f *func, +vslq_callback(const struct VSLQ *vslq, struct vtx *vtx, VSLQ_dispatch_f *func, void *priv) { unsigned n = vtx->n_descend + 1; @@ -707,7 +709,7 @@ vslq_callback(struct VSLQ *vslq, struct vtx *vtx, VSLQ_dispatch_f *func, return (0); /* Build transaction array */ - vslc_vtx_reset(&vtx->c); + (void)vslc_vtx_reset(&vtx->c); trans[0].level = 1; trans[0].vxid = vtx->key.vxid; trans[0].type = vtx->type; @@ -718,7 +720,7 @@ vslq_callback(struct VSLQ *vslq, struct vtx *vtx, VSLQ_dispatch_f *func, CAST_OBJ_NOTNULL(c, (void *)trans[j].c, VSLC_VTX_MAGIC); VTAILQ_FOREACH(vtx, &c->vtx->child, list_child) { assert(i < n); - vslc_vtx_reset(&vtx->c); + (void)vslc_vtx_reset(&vtx->c); trans[i].level = trans[j].level + 1; trans[i].vxid = vtx->key.vxid; trans[i].type = vtx->type; @@ -842,7 +844,7 @@ VSLQ_Delete(struct VSLQ **pvslq) } static int -vslq_raw(struct VSLQ *vslq, VSLQ_dispatch_f *func, void *priv) +vslq_raw(const struct VSLQ *vslq, VSLQ_dispatch_f *func, void *priv) { struct vslc_raw rawc; struct VSL_transaction trans; From phk at varnish-cache.org 
Sat Jun 29 13:43:59 2013 From: phk at varnish-cache.org (Poul-Henning Kamp) Date: Sat, 29 Jun 2013 15:43:59 +0200 Subject: [master] d8c4345 Remove unused field Message-ID: commit d8c4345fab47422bd01f0c04e5e5810bdef70a8b Author: Poul-Henning Kamp Date: Sat Jun 29 13:43:06 2013 +0000 Remove unused field diff --git a/lib/libvarnishapi/vsm_api.h b/lib/libvarnishapi/vsm_api.h index 0c2dbd4..afe4c83 100644 --- a/lib/libvarnishapi/vsm_api.h +++ b/lib/libvarnishapi/vsm_api.h @@ -52,7 +52,6 @@ struct VSM_data { double t_ok; struct vsc *vsc; - struct vsl *vsl; }; int vsm_diag(struct VSM_data *vd, const char *fmt, ...) From phk at varnish-cache.org Sat Jun 29 13:43:59 2013 From: phk at varnish-cache.org (Poul-Henning Kamp) Date: Sat, 29 Jun 2013 15:43:59 +0200 Subject: [master] 3d0438f Fix a potential use of an unintialized stack variable. Message-ID: commit 3d0438f7a091b3677a5f614bab5c5f53dab48626 Author: Poul-Henning Kamp Date: Sat Jun 29 13:43:21 2013 +0000 Fix a potential use of an unintialized stack variable. Constification Constification. diff --git a/include/vapi/vsc.h b/include/vapi/vsc.h index ad5ee26..8fa7660 100644 --- a/include/vapi/vsc.h +++ b/include/vapi/vsc.h @@ -59,8 +59,8 @@ int VSC_Arg(struct VSM_data *vd, int arg, const char *opt); * 1 Handled. */ -struct VSC_C_mgt *VSC_Mgt(struct VSM_data *vd, struct VSM_fantom *fantom); -struct VSC_C_main *VSC_Main(struct VSM_data *vd, struct VSM_fantom *fantom); +struct VSC_C_mgt *VSC_Mgt(const struct VSM_data *vd, struct VSM_fantom *fantom); +struct VSC_C_main *VSC_Main(const struct VSM_data *vd, struct VSM_fantom *fantom); /* * Looks up and returns the management stats and the child main * stats structure. 
If fantom is non-NULL, it can later be used @@ -80,7 +80,7 @@ struct VSC_C_main *VSC_Main(struct VSM_data *vd, struct VSM_fantom *fantom); * non-NULL: Success */ -void *VSC_Get(struct VSM_data *vd, struct VSM_fantom *fantom, const char *type, +void *VSC_Get(const struct VSM_data *vd, struct VSM_fantom *fantom, const char *type, const char *ident); /* * Looks up the given VSC type and identifier. If fantom is diff --git a/include/vapi/vsl.h b/include/vapi/vsl.h index 138e920..daf5366 100644 --- a/include/vapi/vsl.h +++ b/include/vapi/vsl.h @@ -224,7 +224,7 @@ int VSL_Match(struct VSL_data *vsl, const struct VSL_cursor *c); * 0: No match */ -int VSL_Print(struct VSL_data *vsl, const struct VSL_cursor *c, void *fo); +int VSL_Print(const struct VSL_data *vsl, const struct VSL_cursor *c, void *fo); /* * Print the log record pointed to by cursor to stream. * @@ -242,7 +242,7 @@ int VSL_Print(struct VSL_data *vsl, const struct VSL_cursor *c, void *fo); * -5: I/O write error - see errno */ -int VSL_PrintTerse(struct VSL_data *vsl, const struct VSL_cursor *c, void *fo); +int VSL_PrintTerse(const struct VSL_data *vsl, const struct VSL_cursor *c, void *fo); /* * Print the log record pointed to by cursor to stream. 
* @@ -311,7 +311,7 @@ FILE *VSL_WriteOpen(struct VSL_data *vsl, const char *name, int append, */ -int VSL_Write(struct VSL_data *vsl, const struct VSL_cursor *c, void *fo); +int VSL_Write(const struct VSL_data *vsl, const struct VSL_cursor *c, void *fo); /* * Write the currect record pointed to be c to the FILE* fo * diff --git a/include/vapi/vsm.h b/include/vapi/vsm.h index caddcb9..d4277a4 100644 --- a/include/vapi/vsm.h +++ b/include/vapi/vsm.h @@ -54,6 +54,8 @@ struct VSM_fantom { char ident[VSM_IDENT_LEN]; }; +#define VSM_FANTOM_NULL { 0, 0, 0, 0, {0}, {0}, {0} } + /*--------------------------------------------------------------------- * VSM level access functions */ diff --git a/lib/libvarnishapi/vsc.c b/lib/libvarnishapi/vsc.c index ec4b385..c573fb8 100644 --- a/lib/libvarnishapi/vsc.c +++ b/lib/libvarnishapi/vsc.c @@ -39,7 +39,6 @@ #include "miniobj.h" #include "vas.h" -#include "vdef.h" #include "vapi/vsc.h" #include "vapi/vsm.h" @@ -262,7 +261,7 @@ VSC_Arg(struct VSM_data *vd, int arg, const char *opt) /*--------------------------------------------------------------------*/ struct VSC_C_mgt * -VSC_Mgt(struct VSM_data *vd, struct VSM_fantom *fantom) +VSC_Mgt(const struct VSM_data *vd, struct VSM_fantom *fantom) { return (VSC_Get(vd, fantom, VSC_type_mgt, "")); @@ -271,7 +270,7 @@ VSC_Mgt(struct VSM_data *vd, struct VSM_fantom *fantom) /*--------------------------------------------------------------------*/ struct VSC_C_main * -VSC_Main(struct VSM_data *vd, struct VSM_fantom *fantom) +VSC_Main(const struct VSM_data *vd, struct VSM_fantom *fantom) { return (VSC_Get(vd, fantom, VSC_type_main, "")); @@ -281,10 +280,10 @@ VSC_Main(struct VSM_data *vd, struct VSM_fantom *fantom) */ void * -VSC_Get(struct VSM_data *vd, struct VSM_fantom *fantom, const char *type, +VSC_Get(const struct VSM_data *vd, struct VSM_fantom *fantom, const char *type, const char *ident) { - struct VSM_fantom f2; + struct VSM_fantom f2 = VSM_FANTOM_NULL; if (fantom == NULL) fantom = &f2; 
@@ -321,9 +320,10 @@ vsc_add_vf(struct vsc *vsc, const struct VSM_fantom *fantom,
 	VTAILQ_INSERT_TAIL(&vsc->vf_list, vf, list);
 }

+/*lint -esym(528, vsc_add_pt) */
 static void
 vsc_add_pt(struct vsc *vsc, const volatile void *ptr,
-    const struct VSC_desc *desc, struct vsc_vf *vf)
+    const struct VSC_desc *desc, const struct vsc_vf *vf)
 {
 	struct vsc_pt *pt;

@@ -347,7 +347,7 @@ vsc_add_pt(struct vsc *vsc, const volatile void *ptr,
 	CHECK_OBJ_NOTNULL(vsc, VSC_MAGIC);			\
 	st = vf->fantom.b;

-#define VSC_F(nn,tt,ll,ff,vv,dd,ee)				\
+#define VSC_F(nn,tt,ll,ff,vv,dd,ee)			\
 	vsc_add_pt(vsc, &st->nn, descs++, vf);

 #define VSC_DONE(U,l,t)					\

diff --git a/lib/libvarnishapi/vsl.c b/lib/libvarnishapi/vsl.c
index 501e19a..d7a5fa8 100644
--- a/lib/libvarnishapi/vsl.c
+++ b/lib/libvarnishapi/vsl.c
@@ -48,7 +48,6 @@
 #include "vapi/vsm.h"
 #include "vapi/vsl.h"
 #include "vapi/vsm_int.h"
-#include "vin.h"
 #include "vbm.h"
 #include "vmb.h"
 #include "vre.h"
@@ -154,7 +153,7 @@ VSL_ResetError(struct VSL_data *vsl)
 }

 static int
-vsl_match_IX(struct VSL_data *vsl, vslf_list *list, const struct VSL_cursor *c)
+vsl_match_IX(struct VSL_data *vsl, const vslf_list *list, const struct VSL_cursor *c)
 {
 	enum VSL_tag_e tag;
 	const char *cdata;
@@ -202,7 +201,7 @@ VSL_Match(struct VSL_data *vsl, const struct VSL_cursor *c)
 	return (1);
 }

-const char *VSL_transactions[256] = {
+static const char * const VSL_transactions[256] = {
 	/*                 12345678901234 */
 	[VSL_t_unknown] = "<< Unknown >>",
 	[VSL_t_sess]    = "<< Session >>",
@@ -219,7 +218,7 @@ const char *VSL_transactions[256] = {
 	} while (0)

 int
-VSL_Print(struct VSL_data *vsl, const struct VSL_cursor *c, void *fo)
+VSL_Print(const struct VSL_data *vsl, const struct VSL_cursor *c, void *fo)
 {
 	enum VSL_tag_e tag;
 	uint32_t vxid;
@@ -257,7 +256,7 @@ VSL_Print(struct VSL_data *vsl, const struct VSL_cursor *c, void *fo)
 }

 int
-VSL_PrintTerse(struct VSL_data *vsl, const struct VSL_cursor *c, void *fo)
+VSL_PrintTerse(const struct VSL_data *vsl, const struct VSL_cursor *c, void *fo)
 {
 	enum VSL_tag_e tag;
 	unsigned len;
@@ -390,7 +389,7 @@ VSL_WriteOpen(struct VSL_data *vsl, const char *name, int append, int unbuf)
 }

 int
-VSL_Write(struct VSL_data *vsl, const struct VSL_cursor *c, void *fo)
+VSL_Write(const struct VSL_data *vsl, const struct VSL_cursor *c, void *fo)
 {
 	size_t r;

From phk at varnish-cache.org  Sat Jun 29 13:39:21 2013
From: phk at varnish-cache.org (Poul-Henning Kamp)
Date: Sat, 29 Jun 2013 15:39:21 +0200
Subject: [master] 94a8671 Constify, put AN() the right place.
Message-ID: 

commit 94a8671e9124922af2e26ad24401f4c478f204de
Author: Poul-Henning Kamp
Date: Sat Jun 29 13:39:05 2013 +0000

    Constify, put AN() the right place.

diff --git a/include/vtree.h b/include/vtree.h
index 3d72da9..cf4e061 100644
--- a/include/vtree.h
+++ b/include/vtree.h
@@ -386,11 +386,11 @@
 attr void name##_VRB_INSERT_COLOR(struct name *, struct type *);	\
 attr void name##_VRB_REMOVE_COLOR(struct name *, struct type *, struct type *);\
 attr struct type *name##_VRB_REMOVE(struct name *, struct type *);	\
 attr struct type *name##_VRB_INSERT(struct name *, struct type *);	\
-attr struct type *name##_VRB_FIND(struct name *, struct type *);	\
-attr struct type *name##_VRB_NFIND(struct name *, struct type *);	\
+attr struct type *name##_VRB_FIND(const struct name *, const struct type *); \
+attr struct type *name##_VRB_NFIND(const struct name *, const struct type *); \
 attr struct type *name##_VRB_NEXT(struct type *);			\
 attr struct type *name##_VRB_PREV(struct type *);			\
-attr struct type *name##_VRB_MINMAX(struct name *, int);		\
+attr struct type *name##_VRB_MINMAX(const struct name *, int);		\
									\
 /* Main rb operation.
@@ -451,6 +451,7 @@ name##_VRB_REMOVE_COLOR(struct name *head, struct type *parent, struct type *elm
 	struct type *tmp;						\
 	while ((elm == NULL || VRB_COLOR(elm, field) == VRB_BLACK) &&	\
 	    elm != VRB_ROOT(head)) {					\
+		AN(parent);						\
 		if (VRB_LEFT(parent, field) == elm) {			\
 			tmp = VRB_RIGHT(parent, field);			\
 			if (VRB_COLOR(tmp, field) == VRB_RED) {		\
@@ -586,7 +587,6 @@ name##_VRB_REMOVE(struct name *head, struct type *elm)			\
 		VRB_ROOT(head) = child;					\
 color:									\
 	if (color == VRB_BLACK) {					\
-		AN(parent);						\
 		name##_VRB_REMOVE_COLOR(head, parent, child);		\
 	}								\
 	return (old);							\
@@ -625,7 +625,7 @@ name##_VRB_INSERT(struct name *head, struct type *elm)			\
									\
 /* Finds the node with the same key as elm */				\
 attr struct type *							\
-name##_VRB_FIND(struct name *head, struct type *elm)			\
+name##_VRB_FIND(const struct name *head, const struct type *elm)	\
 {									\
 	struct type *tmp = VRB_ROOT(head);				\
 	int comp;							\
@@ -643,7 +643,7 @@ name##_VRB_FIND(struct name *head, struct type *elm)			\
									\
 /* Finds the first node greater than or equal to the search key */	\
 attr struct type *							\
-name##_VRB_NFIND(struct name *head, struct type *elm)			\
+name##_VRB_NFIND(const struct name *head, const struct type *elm)	\
 {									\
 	struct type *tmp = VRB_ROOT(head);				\
 	struct type *res = NULL;					\
@@ -707,7 +707,7 @@ name##_VRB_PREV(struct type *elm)					\
 }									\
									\
 attr struct type *							\
-name##_VRB_MINMAX(struct name *head, int val)				\
+name##_VRB_MINMAX(const struct name *head, int val)			\
 {									\
 	struct type *tmp = VRB_ROOT(head);				\
 	struct type *parent = NULL;					\