From slink at schokola.de Tue Apr 11 13:47:13 2017 From: slink at schokola.de (Nils Goroll) Date: Tue, 11 Apr 2017 15:47:13 +0200 Subject: sub probe_resp - VIP RFC Message-ID: <33e3855e-711f-87ba-648e-2a4f86bb663b@schokola.de> Hi, https://github.com/varnishcache/varnish-cache/wiki/Varnish-Improvement-Proposals : "The VIP procedure starts with a discussions on varnish-dev" I had skipped this bit for previous suggestions, but this time I want to get it right. I'd like to propose the following: # Synopsis Add support for calling vcl subs on the response of backend probes. # Why? Add a way to manipulate backends not just on the basis of a binary health check result, but optionally also on the basis of other information returned by the backend with the probe response. One example would be to dynamically change the weight of backends based on a load metric returned with the probe response, which will also require vmod_directors to support changing the weight dynamically. # How? * Add VCC support to register VCL subs with probes * Add a default vcl_probe_resp sub to the builtin.vcl which implements the current behavior. Probes without an explicit response sub definition will use vcl_probe_resp * in probe_resp context, make the following objects available - analogous to beresp - prresp.backend - prresp.http.* - prresp.proto - prresp.status - prresp.reason - later? 
      - prresp.body

    By design, all access should be read-only, but we might want to have all
    but .backend writable for practical reasons (writes having no effect other
    than being visible in the sub probe_resp)

  - probe.* attributes of the probe (read-only)

    - probe.name
    - probe.expected_response
    - probe.timeout
    - probe.interval
    - probe.initial
    - probe.window
    - probe.threshold

* a probe_resp sub may return with the following

    healthy
    sick

* The default vcl_probe_resp:

    sub vcl_probe_resp {
        if (prresp.proto ~ "^HTTP/\d+\.\d+$" &&
            prresp.status == probe.expected_response) {
            return (healthy);
        }
        return (sick);
    }

* Toy example for the use case mentioned above (needs more changes)

    sub probe_weight {
        if (prresp.http.X-Load ~ "^\d+\.\d+$") {
            my_rr.set_weight(prresp.backend,
                1 / std.real(prresp.http.X-Load, 1.0));
        }
    }

From dridi at varni.sh Tue Apr 11 14:09:09 2017
From: dridi at varni.sh (Dridi Boukelmoune)
Date: Tue, 11 Apr 2017 16:09:09 +0200
Subject: sub probe_resp - VIP RFC
In-Reply-To: <33e3855e-711f-87ba-648e-2a4f86bb663b@schokola.de>
References: <33e3855e-711f-87ba-648e-2a4f86bb663b@schokola.de>
Message-ID:

> # Why?

No opinion on the why, I get the point and I see the benefits but no
opinion. It also leaves out the non-VBE backends, although as of today
I'm only aware of Martin's fsbackend implementation of an out-of-tree
backend.

> * Add a default vcl_probe_resp sub to the builtin.vcl which implements the
>   current behavior. Probes without an explicit response sub definition will
>   use vcl_probe_resp

I'd rather have vcl_probe_response, akin to existing v_b_r.

> * in probe_resp context, make the following objects available
>
>   - analogous to beresp
>
>     - prresp.backend
>     - prresp.http.*
>     - prresp.proto
>     - prresp.status
>     - prresp.reason

Why not beresp? It's a response from the backend, I don't think
reusing the name would be confusing.

> * a probe_resp sub may return with the following
>
>     healthy
>     sick

Or we could reuse "ok" and the universal "fail" like in vcl_init.
> * The default vcl_probe_resp: > > sub vcl_probe_resp { > if (prresp.proto ~ "^HTTP/\d+\.\d+$" && ^HTTP/1\.[01]$ Dridi From slink at schokola.de Tue Apr 11 14:31:15 2017 From: slink at schokola.de (Nils Goroll) Date: Tue, 11 Apr 2017 16:31:15 +0200 Subject: sub probe_resp - VIP RFC In-Reply-To: References: <33e3855e-711f-87ba-648e-2a4f86bb663b@schokola.de> Message-ID: <1d2750f7-ac47-9ec5-f2e7-17d16581f9c1@schokola.de> Dridi, thank you for the quick feedback. I'm fine with your suggestions, except: >> * The default vcl_probe_resp: >> >> sub vcl_probe_resp { >> if (prresp.proto ~ "^HTTP/\d+\.\d+$" && > > ^HTTP/1\.[01]$ This was intended to be equivalent to the current code: i = sscanf(vt->resp_buf, "HTTP/%*f %u ", &resp); From dridi at varni.sh Tue Apr 11 14:47:02 2017 From: dridi at varni.sh (Dridi Boukelmoune) Date: Tue, 11 Apr 2017 16:47:02 +0200 Subject: sub probe_resp - VIP RFC In-Reply-To: <1d2750f7-ac47-9ec5-f2e7-17d16581f9c1@schokola.de> References: <33e3855e-711f-87ba-648e-2a4f86bb663b@schokola.de> <1d2750f7-ac47-9ec5-f2e7-17d16581f9c1@schokola.de> Message-ID: On Tue, Apr 11, 2017 at 4:31 PM, Nils Goroll wrote: > Dridi, > > thank you for the quick feedback. I'm fine with your suggestions, except: > >>> * The default vcl_probe_resp: >>> >>> sub vcl_probe_resp { >>> if (prresp.proto ~ "^HTTP/\d+\.\d+$" && >> >> ^HTTP/1\.[01]$ > > This was intended to be equivalent to the current code: > > i = sscanf(vt->resp_buf, "HTTP/%*f %u ", &resp); Considering that we only support 1.0 and 1.1 today, I believe we would get a more accurate result with this regex instead of a clunky scanf format. If we go for vcl_probe_response we might as well improve the parsing of the response, why not? 
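To make the difference between the two checks concrete, here is a minimal C sketch, for illustration only. status_lenient() reuses the sscanf() format quoted above; status_strict() is a hand-written stand-in for the ^HTTP/1\.[01]$ suggestion. Both function names, and the choice to require exactly three status digits, are assumptions of this sketch and are not taken from the varnishd source.

```c
#include <stdio.h>
#include <string.h>

/*
 * Lenient check, using the sscanf() format quoted in the thread.
 * Returns the status code, or 0 when the line does not match.
 */
unsigned
status_lenient(const char *line)
{
	unsigned status = 0;

	if (sscanf(line, "HTTP/%*f %u ", &status) != 1)
		return (0);
	return (status);
}

/*
 * Strict check: the protocol token must be exactly "HTTP/1.0" or
 * "HTTP/1.1", followed by one space and three digits.
 * (Requiring exactly three digits is an assumption of this sketch.)
 */
unsigned
status_strict(const char *line)
{
	const char *p;
	unsigned status = 0;
	int i;

	if (strncmp(line, "HTTP/1.0 ", 9) != 0 &&
	    strncmp(line, "HTTP/1.1 ", 9) != 0)
		return (0);
	p = line + 9;
	for (i = 0; i < 3; i++) {
		if (p[i] < '0' || p[i] > '9')
			return (0);
		status = status * 10 + (unsigned)(p[i] - '0');
	}
	/* the code must be followed by a space or the end of the line */
	if (p[3] != ' ' && p[3] != '\r' && p[3] != '\n' && p[3] != '\0')
		return (0);
	return (status);
}
```

With these definitions, status lines such as "HTTP/2.0 200 OK" or "HTTP/9 503 x" pass the lenient check but are rejected by the strict one, which is the accuracy difference under discussion.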
Cheers, Dridi From slink at schokola.de Wed Apr 12 05:26:52 2017 From: slink at schokola.de (Nils Goroll) Date: Wed, 12 Apr 2017 07:26:52 +0200 Subject: sub probe_resp - VIP RFC In-Reply-To: References: <33e3855e-711f-87ba-648e-2a4f86bb663b@schokola.de> <1d2750f7-ac47-9ec5-f2e7-17d16581f9c1@schokola.de> Message-ID: <2702efa9-d592-707a-dc45-36c53bb3342b@schokola.de> On 11/04/17 16:47, Dridi Boukelmoune wrote: > If we go for vcl_probe_response we might as well improve the parsing > of the response, why not? Agreed, this is an easy additional step once we got the change in place. From slink at schokola.de Wed Apr 12 05:35:18 2017 From: slink at schokola.de (Nils Goroll) Date: Wed, 12 Apr 2017 07:35:18 +0200 Subject: update1: sub probe_resp - VIP RFC In-Reply-To: <33e3855e-711f-87ba-648e-2a4f86bb663b@schokola.de> References: <33e3855e-711f-87ba-648e-2a4f86bb663b@schokola.de> Message-ID: <89511585-58c2-bc0e-b8f6-100d281bcfbd@schokola.de> first update based on feedback from Dridi Hi, https://github.com/varnishcache/varnish-cache/wiki/Varnish-Improvement-Proposals : "The VIP procedure starts with a discussions on varnish-dev" I had skipped this bit for previous suggestions, but this time I want to get it right. I'd like to propose the following: # Synopsis Add support for calling vcl subs on the response of backend probes. # Why? Add a way to manipulate backends not just on the basis of a binary health check result, but optionally also on the basis of other information returned by the backend with the probe response. One example would be to dynamically change the weight of backends based on a load metric returned with the probe response, which will also require vmod_directors to support changing the weight dynamically. # How? * Add VCC support to register VCL subs with probes * Add a default vcl_probe_response sub to the builtin.vcl which implements the current behavior. 
  Probes without an explicit response sub definition will use
  vcl_probe_response

* in probe response vcl context, make the following objects available

  - analogous to beresp

    - beresp.backend
    - beresp.http.*
    - beresp.proto
    - beresp.status
    - beresp.reason
    - later?

      - beresp.body

    By design, all access should be read-only, but we might want to have all
    but .backend writable for practical reasons (writes having no effect other
    than being visible in the sub probe_resp)

  - probe.* attributes of the probe (read-only)

    - probe.name
    - probe.expected_response
    - probe.timeout
    - probe.interval
    - probe.initial
    - probe.window
    - probe.threshold

* a probe response vcl sub may return with the following

    ok
    fail

* The default vcl_probe_response:

    sub vcl_probe_response {
        if (beresp.proto ~ "^HTTP/\d+\.\d+$" &&
            beresp.status == probe.expected_response) {
            return (ok);
        }
        return (fail);
    }

  this matches the existing implementation, we might want to change the
  first condition to beresp.proto ~ "^HTTP/1\.[01]$" once this is in place

* Toy example for the use case mentioned above (needs more changes)

    sub probe_weight {
        if (beresp.http.X-Load ~ "^\d+\.\d+$") {
            my_rr.set_weight(beresp.backend,
                1 / std.real(beresp.http.X-Load, 1.0));
        }
    }

From phk at phk.freebsd.dk Sun Apr 16 09:06:52 2017
From: phk at phk.freebsd.dk (Poul-Henning Kamp)
Date: Sun, 16 Apr 2017 09:06:52 +0000
Subject: update1: sub probe_resp - VIP RFC
In-Reply-To: <89511585-58c2-bc0e-b8f6-100d281bcfbd@schokola.de>
References: <33e3855e-711f-87ba-648e-2a4f86bb663b@schokola.de>
 <89511585-58c2-bc0e-b8f6-100d281bcfbd@schokola.de>
Message-ID: <50346.1492333612@critter.freebsd.dk>

--------
In message <89511585-58c2-bc0e-b8f6-100d281bcfbd at schokola.de>, Nils Goroll writes:

I notice you do not make anything available about the backend,
only about the probe and the response ?

Is that deliberate or accidental ?
-- 
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk at FreeBSD.ORG      | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.

From phk at phk.freebsd.dk Sun Apr 16 17:17:15 2017
From: phk at phk.freebsd.dk (Poul-Henning Kamp)
Date: Sun, 16 Apr 2017 17:17:15 +0000
Subject: update1: sub probe_resp - VIP RFC
In-Reply-To:
References: <33e3855e-711f-87ba-648e-2a4f86bb663b@schokola.de>
 <89511585-58c2-bc0e-b8f6-100d281bcfbd@schokola.de>
 <50346.1492333612@critter.freebsd.dk>
Message-ID: <1428.1492363035@critter.freebsd.dk>

--------
In message , Dridi Boukelmoune writes:

>I notice you do not make anything available about the backend,
>only about the probe and the response ?
>
>Is that deliberate or accidental ?
>
>
>Did you miss the beresp.backend maybe ?

No, I did not.

I was wondering if more intimate exposure was considered or ignored ?

-- 
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk at FreeBSD.ORG      | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.

From slink at schokola.de Mon Apr 17 15:20:58 2017
From: slink at schokola.de (Nils Goroll)
Date: Mon, 17 Apr 2017 17:20:58 +0200
Subject: update1: sub probe_resp - VIP RFC
In-Reply-To: <1428.1492363035@critter.freebsd.dk>
References: <33e3855e-711f-87ba-648e-2a4f86bb663b@schokola.de>
 <89511585-58c2-bc0e-b8f6-100d281bcfbd@schokola.de>
 <50346.1492333612@critter.freebsd.dk>
 <1428.1492363035@critter.freebsd.dk>
Message-ID: <434280ee-a081-87f6-278c-5156cae6d7e3@schokola.de>

On 16/04/17 19:17, Poul-Henning Kamp wrote:
>> Did you miss the beresp.backend maybe ?
> No, I did not.
>
> I was wondering if more intimate exposure was considered or ignored ?

The whole point of the VIP RFC is to modify the backend, and I intended to add
vmod functions later to (indirectly) manipulate the BACKEND object available in
the sub.
I had given this example:

On 12/04/17 07:35, Nils Goroll wrote:
> * Toy example for the use case mentioned above (needs more changes)
>
>     sub probe_weight {
>         if (beresp.http.X-Load ~ "^\d+\.\d+$") {
>             my_rr.set_weight(beresp.backend,
>                 1 / std.real(beresp.http.X-Load, 1.0));
>         }
>     }

the vmod code would check if the BACKEND object passed is contained in my_rr.

Nils

From phk at phk.freebsd.dk Mon Apr 17 18:27:07 2017
From: phk at phk.freebsd.dk (Poul-Henning Kamp)
Date: Mon, 17 Apr 2017 18:27:07 +0000
Subject: update1: sub probe_resp - VIP RFC
In-Reply-To: <434280ee-a081-87f6-278c-5156cae6d7e3@schokola.de>
References: <33e3855e-711f-87ba-648e-2a4f86bb663b@schokola.de>
 <89511585-58c2-bc0e-b8f6-100d281bcfbd@schokola.de>
 <50346.1492333612@critter.freebsd.dk>
 <1428.1492363035@critter.freebsd.dk>
 <434280ee-a081-87f6-278c-5156cae6d7e3@schokola.de>
Message-ID: <5995.1492453627@critter.freebsd.dk>

--------
In message <434280ee-a081-87f6-278c-5156cae6d7e3 at schokola.de>, Nils Goroll writes:
>On 16/04/17 19:17, Poul-Henning Kamp wrote:
>>> Did you miss the beresp.backend maybe ?
>> No, I did not.
>>
>> I was wondering if more intimate exposure was considered or ignored ?
>
>The whole point of the VIP RFC is to modify the backend, and I intended to add
>vmod functions later to (indirectly) manipulate the BACKEND object available in
>the sub.

So this is where I think all the dragons will be found: Doesn't the
backend implementation get any say in this?

Summary: I'm OK with the idea, but we need to find out how this works
with dynamic backends, in particular dynamic backends speaking FOOPROTO
rather than HTTP.

-- 
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk at FreeBSD.ORG      | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.
From dridi at varni.sh Tue Apr 18 08:25:56 2017 From: dridi at varni.sh (Dridi Boukelmoune) Date: Tue, 18 Apr 2017 10:25:56 +0200 Subject: update1: sub probe_resp - VIP RFC In-Reply-To: <5995.1492453627@critter.freebsd.dk> References: <33e3855e-711f-87ba-648e-2a4f86bb663b@schokola.de> <89511585-58c2-bc0e-b8f6-100d281bcfbd@schokola.de> <50346.1492333612@critter.freebsd.dk> <1428.1492363035@critter.freebsd.dk> <434280ee-a081-87f6-278c-5156cae6d7e3@schokola.de> <5995.1492453627@critter.freebsd.dk> Message-ID: > So this is where I think all the dragons will be found: Doesn't the > backend implementation get any say in this? I touched on that in the first thread. As of today probes only exist for VBE backends. > Summary: I'm OK with the idea, but we need to find out how this works > with dynamic backends, in particular dynamic backends speaking FOOPROTO > rather than HTTP. There is no probing mechanism for FOO backends/director, unless we abstract the probes to make them usable with arbitrary implementations. But I don't reckon the current behavior would be a one-size-fits-all for other types of backends. Dridi From geoff at uplex.de Mon Apr 24 11:53:11 2017 From: geoff at uplex.de (Geoff Simmons) Date: Mon, 24 Apr 2017 13:53:11 +0200 Subject: RFC for VIP17: unix domain sockets for listen and backend addresses Message-ID: <427ae252-6daa-d5ba-c7a5-f0b513ef1f2f@uplex.de> By request at bugwash, this is to open a thread for commentary about VIP17: https://github.com/varnishcache/varnish-cache/wiki/VIP-17%3A-Enable-Unix-domain-sockets-for-listen-and-backend-addresses The phrase "comments welcome" appears in a number of places in the text where I thought that there's a need for contemplation and consensus. Comments are welcome about any part of it at all, of course. 
Thanks,
Geoff
-- 
** * * UPLEX - Nils Goroll Systemoptimierung

Scheffelstraße 32
22301 Hamburg

Tel +49 40 2880 5731
Mob +49 176 636 90917
Fax +49 40 42949753

http://uplex.de

From dridi at varni.sh Tue Apr 25 13:35:37 2017
From: dridi at varni.sh (Dridi Boukelmoune)
Date: Tue, 25 Apr 2017 15:35:37 +0200
Subject: [RFC] IP-less ACLs (technically named listen addresses)
Message-ID:

Hello everyone,

This is an idea that has been rejected once, I think before we had the
VIP process in place. PHK agreed to revisit this feature request in
exchange for a thorough specification, which is the object of this
thread.

It also relates to VIP 17 (unix domain sockets support) and I will
mention why too. I will however also comment on the VIP 17 thread and
don't wish to discuss VIP 17 here.

# Current status

We already have mechanisms available in VCL to restrict or harden part
of a cache policy. For example, one can (and is strongly encouraged
to) prevent just anyone from performing cache invalidation.

## ban via the varnish-cli

I'm mentioning this because I find it interesting how a combination of
factors enables access control. To use the ban outside of VCL, you
need to connect to an administration socket. However with our default
packaging this socket is bound to the loopback and you are greeted
with a challenge that requires knowing a shared secret. This shared
secret is stored in a file accessible only to root.

We virtually have an ACL in this case: you can only perform such a ban
if you have local privileges on the system (by default). This is of
course true for any operation via the varnish-cli.

## client.ip + acl

Probably the most common construct among varnish users. Illustrated in
the vcl(7) manual.
## server.ip + acl If varnishd is listening to different network interfaces (multiple -a options) you can instead let the "network hardening" happen outside of Varnish and allow operations when performed on an authorized/relevant address. Example for varnishd -a 1.2.3.4 -a 5.6.7.8 [...non -a opts...]: acl admin { "6.0.0.0"/8; } sub vcl_recv { if (req.method == "PURGE") { if (server.ip ~ admin) { return (purge); } else { return (synth(405)); } } } ## server.ip + std.port It's worth mentioning at this point that the VCL IP type (VCL_IP in C) contains both an IP address and a port number. To my knowledge ACLs only match the IP address. This is a variant of the previous use case, where the port is discriminant instead of the IP address. Once again leaving the hardening as someone else's problem. Let's say typically a team of network folks (firewall rules and whatnot). Example for varnishd -a :80 -a :9080 [...non -a opts...]: sub vcl_recv { if (req.method == "PURGE") { if (std.port(server.ip) == 8080) { return (purge); } else { return (synth(405)); } } } ## req.* You may rely on a cookie-based, token-based or anything coming from the HTTP request to restrict an operation. VCL provides no specific construct for that besides access to the headers and pseudo headers of the request. sub vcl_recv { if (req.method == "PURGE") { if (req.http.some-header == "some psk") { return (purge); } else { return (synth(405)); } } } # A note on PROXY protocol You may prefer the remote&local variables over client&server depending on your use case and/or for security reasons. # IP-less ACLs Or more accurately transport-independent ACLs, decoupled from IP addresses or port numbers in VCL. In the case of VIP 17, it would also be applicable if varnishd were to listen to a unix domain socket. ## How? Give a name to listen addresses that can be used in VCL to make decisions. 
The name is passed in the -a optarg, ideally with the same syntax as
storage backends:

    varnishd -a public=1.2.3.4 -a admin=5.6.7.8
    varnishd -a public=:80 -a admin=:9080

You may change how varnishd is deployed without having to change the
VCL, provided that the names remain the same across varnishd instances
(much like storage backends).

An alternate syntax in case '=' is problematic:

    varnishd -a :80,HTTP/1,public -a :9080,HTTP/1,admin

The latter forces you to pick a protocol if you wish to name a listen
address. Addresses not explicitly named are called "a0", "a1" and so
on (like "s0", "s1" etc for storage backends).

A new VCL_LISTEN_ADDRESS [1] type is introduced. Each listen address
has a symbol of this type, mapped to its name from the command line.

A new `local.listen_address` [2] variable is introduced and can be used
for access control.

Example for varnishd -a public=:80 -a admin=:9080 [...non -a opts...]:

    sub vcl_recv {
        if (req.method == "PURGE") {
            if (local.listen_address == admin) {
                return (purge);
            }
            else {
                return (synth(405));
            }
        }
    }

## Why

The name is abstract and makes the VCL more portable in environments
where varnishd instances are scheduled dynamically with unpredictable
IP addresses or port numbers (without having to pre-process VCL).

Like `storage_hint` vs `storage`, having a type-checked symbol in VCL
removes the risk of not matching the right thing. Did you notice that
both my acl check and port check in the previous example had typos in
them that would _not_ prevent VCL from compiling?

This abstraction is also future proof, as new types of listen addresses
like unix domain sockets could also be named and used as such.

## Testing

Add new macros in varnishtest to make use of named listen addresses.

Example for varnish v1 [...]
-start:

    ${v1_addr} for the first address, same as today
    ${v1_addr_a0} for the first address [3]
    ${v1_addr_*} for additional addresses

Possibly a `varnish v1 -addr ` used at launch time to easily bind
additional random ports (like -arg or -jail that can only be used
before the manager is started).

# A note on security

This does not improve nor worsen security, AFAICT. I would argue that a
type-checked `local.listen_address` slightly improves things in this
area. Unlike an ACL, it has to hard-match something from the command
line and typos are less likely to occur (you could always confuse two
names but you could also mix up two ACLs).

Besides this tiny point, transport-independent ACLs only move the
problem outside of Varnish. For example, a compromised host may perform
restricted operations if its IP matches an ACL or if the compromised
host has access to a privileged local.ip, and relying on type-checked
names doesn't solve that.

# Closing words

This is in essence very similar to the varnish-cli pseudo ACL, sort of
restricting access to the root user with our packages' default
configuration. You could even remove the authorization challenge if
the connection happened on a unix domain socket and ugo permissions
were enough.
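As an illustration of the -a name=address syntax proposed above, including the "a0", "a1" default names, here is a minimal C sketch. split_listen_arg() is a hypothetical helper, not an existing varnishd function, and it assumes that the first '=' in the argument always separates the name from the address.

```c
#include <stdio.h>
#include <string.h>

/*
 * Split a hypothetical "-a [name=]address" argument.
 * idx is the position of the -a option on the command line and is
 * used to build the default name ("a0", "a1", ...).
 * Returns 0 on success, -1 if the name is empty or a buffer is too small.
 */
int
split_listen_arg(const char *arg, unsigned idx,
    char *name, size_t namelen, char *addr, size_t addrlen)
{
	const char *eq = strchr(arg, '=');
	int n;

	if (eq == NULL) {
		/* unnamed address: fall back to "a<idx>" */
		n = snprintf(name, namelen, "a%u", idx);
		if (n < 0 || (size_t)n >= namelen)
			return (-1);
	} else {
		size_t l = (size_t)(eq - arg);

		if (l == 0 || l >= namelen)
			return (-1);
		memcpy(name, arg, l);
		name[l] = '\0';
		arg = eq + 1;
	}
	n = snprintf(addr, addrlen, "%s", arg);
	if (n < 0 || (size_t)n >= addrlen)
		return (-1);
	return (0);
}
```

For example, "admin=:9080" would yield the name "admin" and the address ":9080", while a bare ":80" given as the first -a option would be named "a0".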
Cheers,
Dridi

[1] akin to VCL_STEVEDORE
[2] all names for types or variables are only suggestions
[3] because the -a arg is hard-coded anyway

From dridi at varni.sh Tue Apr 25 13:46:26 2017
From: dridi at varni.sh (Dridi Boukelmoune)
Date: Tue, 25 Apr 2017 15:46:26 +0200
Subject: RFC for VIP17: unix domain sockets for listen and backend addresses
In-Reply-To: <427ae252-6daa-d5ba-c7a5-f0b513ef1f2f@uplex.de>
References: <427ae252-6daa-d5ba-c7a5-f0b513ef1f2f@uplex.de>
Message-ID:

On Mon, Apr 24, 2017 at 1:53 PM, Geoff Simmons wrote:
> By request at bugwash, this is to open a thread for commentary about
> VIP17:
>
> https://github.com/varnishcache/varnish-cache/wiki/VIP-17%3A-Enable-Unix-domain-sockets-for-listen-and-backend-addresses

Thanks for the thread, I will allow myself to copy the VIP in its
current state here before I comment because I have noticed at least 2
updates in the wiki before this thread started.

# Synopsis

Allow unix domain sockets as a listen address for Varnish (``-a``
option) and as addresses for backends, and for use in ACLs. Obtain
credentials of the peer process connected on a UDS, such as uid and
gid, for use in VCL.

# Why?

* Eliminate the overhead of TCP/loopback for connections with peers
  that are colocated on a host with Varnish.
* Restrict who can send requests to Varnish by setting permissions on
  the UDS path of the listen address.
  * (But see the discussion below about getting this right portably.)
* Make it possible for a backend peer to require restricted
  credentials for the Varnish process by setting permissions on the
  UDS path on which it listens.
* Peer credentials make it possible to:
  * Make information about the peer available in VCL and the log.
  * Extend ACLs to make it possible to place further restrictions on
    peers connecting to the listen address.

An obvious application is the use of SSL offloaders connecting to the
listen address, and SSL "onloaders" as backends.
UDS would eliminate the TCP overhead, and the ability to restrict the
credentials of peers mitigates the risks of man-in-the-middle. Both
haproxy and nginx/ProxyPass, among others, support UDS addresses in
"both directions", so they are candidates for this purpose. A notable
exception is hitch, which currently only supports TCP connections. I
would be happy to help the hitch project support UDS (shouldn't be
hard at all).

I would like to make this contribution for the September 2017
release. With the VIP I'd like to clarify:

* Are there any changes planned for VTCP and VSA in the September
  release that would make adding UDS to those interfaces less trivial
  than it is now?
* Every platform has a way to get peer credentials from a UDS, but
  there's no standard and it's highly platform-dependent. So how do we
  want to handle that?
* Additions/changes to VCL and other changes in naming, such as the
  ``-a`` option and backend definitions.
* If someone knows a reason why we shouldn't do this at all, this is
  the place to say so.

# How?

## Address notation

I suggest that we require a prefix such as ``unix:`` to identify UDS
addresses (nginx uses ``unix:``, haproxy uses ``unix@``):

```
varnishd -a unix:/path/to/uds

backend uds { .host = "unix:/path/to/uds"; }
```

That makes the interpretation unambiguous. We could simply interpret
paths as UDS addresses when they appear in those places, but then we
would need logic like: if the argument cannot be resolved as a host or
parsed as an IP address, then assume it's a path for UDS, but if the
path does not exist or cannot be accessed, then fail. So better to
just make it unambiguous.

Parsing UDS addresses would be an extension of ``VSS_Resolver``.

The name ``.host`` in a backend definition becomes a bit peculiar if
its value can also be a UDS (we will see a number of examples like
this).
We could:

* stay with the name ``.host``, and document the fact that it might
  not identify a host in some cases
* replace ``.host`` with a name like ``.peer``, sacrificing backward
  compatibility
* introduce ``.peer``, retain ``.host`` as a deprecated alias, and
  remove ``.host`` in a future release

I suggest the last option, comments welcome.

``.port`` in a backend definition is already optional, and is
unnecessary for a UDS. Should it be an error to specify a port when a
UDS is specified, or should it be ignored? Comments welcome.

## Access permissions on the listen address

For the ``-a`` address, I suggest an optional means of specifying who
can access the UDS:

```
varnishd -a unix:/path/to/uds:uid=foo,gid=bar
```

There's an issue here in that the separator (``:`` in the example)
could not appear in any UDS path. We might just have to forbid a
certain character in UDS paths. Fortunately we don't have such a
problem with backend addresses (which are generated by another server,
so we have less freedom to impose restrictions on the path names).

``uid`` and ``gid`` can be specified as numeric or with names. Either,
both or none of uid and gid would be permitted.

Enforcing access permissions would be tricky to get right portably and
reliably (and might just not work). From what I surmise at the moment
(and I might be quite wrong):

* Ownership would have to be set on the directory containing the UDS --
  ``/path/to/`` in the example.
* BSD-derived systems do not restrict connects to the UDS itself due
  to its permissions (or so I've read). But you can make a UDS
  inaccessible to a process that can't read its directory.
* Then chmod the directory to 0700 or 0770, depending on whether
  access is set for user and/or group.
* This should be done before bind, creating the directory if
  necessary.
* On Linux, peers connecting to the UDS must have read/write
  permission, so we would also set uid/gid ownership on the UDS and
  set permissions to 0600 or 0660, as the case may be.
  Might as well do that on every platform.
* Must be done after bind and before listen.
* ``mgt_acceptor.c`` would do all of this. Typically the management
  process runs as root and is able to change permissions and
  ownership; if the management process owner can't do these things,
  then varnishd fails to start.

So the sequence for the management process would be (again, unless I'm
getting this all wrong):

* create the directory if necessary
* if access restrictions were requested then set uid/gid and
  permissions on the directory accordingly
* bind (note that ``VTCP_bind`` will have to unlink the path before
  bind for a UDS, if the path already exists)
* set permissions on the UDS, at least read/write in all cases, and
  set ownership if requested

Then the socket can be handed off to the child process for listen.

If no access restrictions were requested, then don't manipulate
ownership, let bind create the UDS, and set its permissions to 0666.

Comments and corrections on this section are very much welcome.

## VSA and VTCP

Extending these interfaces, in their current form, to accommodate UDS
is a piece of cake. VSA can just as easily encapsulate ``sockaddr_un``
as it currently does for the ip4 and ip6 types.

For the most part, VTCP just works with sockets, so it doesn't matter
whether they are TCP or UDS sockets. There would have to be some
changes about naming (``VTCP_name``, ``_myname`` and ``_hisname``),
but I'd like to set that aside for a moment, and get to the subject of
naming further down.

Some other changes would involve:

* Unlink the UDS path before bind in ``VTCP_bind``
* Some new kinds of errors may result from ``VTCP_connect``, such as
  EPERM or ENOENT, but we may not have to change anything for that --
  ``VTCP_connect`` currently just fails on error and lets the caller
  decide what to do with the errno.
* We'll have to investigate which of the socket options are compatible
  with UDS.
  From a quick look I suspect that these are at least irrelevant to
  UDS and may be errors:
  * httpready
  * ``TCP_DEFER_ACCEPT``
  * ``TCP_FASTOPEN``
  * disabling Nagle (``TCP_NODELAY``)

My main question about all this is: are there plans to significantly
revise VSA and VTCP for the September release? Or can I expect that
they will remain fairly easy to extend for UDS?

A minor issue is that the name ``VTCP`` (all of the ``VTCP_*``
functions, the source name ``vtcp.c``, etc.) becomes a misnomer if it
also covers UDS. We could just live with that. OTOH a single git
commit could change it all at once, although we might have to bikeshed
over a new name (``VSOCK``?).

## Peer credentials

The good news is that all of the platforms listed as level A and B in
"Picking Platforms" (the phk rant) have the means to obtain
credentials of the peer on a connected UDS. The bad news is that
there's no standard, they're all different, and they encompass
different information.

* FreeBSD
  * ``getpeereid`` returns the EUID and EGID. OpenBSD appears to have
    ``getpeereid`` as well.
  * ``getsockopt(LOCAL_PEERCRED)`` returns credentials in the
    ``xucred`` struct defined in ````, which includes EUID and all of
    the groups to which the peer belongs.
* Linux
  * ``getsockopt(SO_PEERCRED)`` returns the ``ucred`` struct defined
    in ```` which includes pid, uid and gid. It's not clear to me from
    the manuals whether it's EUID/EGID or RUID/RGID. (Googled-up
    examples seem to assume EUID/EGID.)
  * For ``getpeereid`` we'd have to link to libbsd.
* Solaris
  * Appears to have nothing like any of the other platforms, but it
    does have ``getpeerucred``, which fills in a ``ucred_t`` defined
    in ````. This is an opaque structure with a
    [family of accessor functions](https://docs.oracle.com/cd/E53394_01/html/E54766/ucred-get-3c.html)
    ``ucred_get*``, which tell you almost anything you can think of.
* MacOS/Darwin * Appears to be just like FreeBSD: ``getpeereid`` and ``getsockopt(LOCAL_PEERCRED)`` All of these obtain the credentials that were true when the peer called ``connect`` or ``listen``, and according to the docs they can't be faked (unless there's a kernel bug). Most or all of these platforms have ways to receive peer credentials in ancillary messages, which may contain more information, but that may require that the peer co-operates, and we can't rely on that. So it appears that the least common denominator is EUID and EGID (assuming that's what you get in Linux). I suggest that we just go with that, to be used as described below. Because of all of the platform dependencies, there will have to be something like ``cred_compat.h`` full of ``#ifdef``s, and probably some ``configure.ac`` logic to figure it all out. We'll also have to decide what to do when Varnish is built on a platform where we find none of the above. ## Address naming Getting back to ``VTCP_name``, ``_hisname`` and ``_myname``: these are currently hard-wired in their signatures for an address and a port, and they're spread out all over the place in Varnish. IMO the least obtrusive way to adapt this for UDS would be to generate the UDS path in the address position, and generate a string ``":"`` where the port is currently generated. Or we could bite the bullet by changing these three functions to something less hard-wired, then go find all of the places where they are called and figure out what to do. I suggest the less obtrusive option, at least in an initial implementation, although admittedly the more difficult option may be the right thing in the long run. Comments are welcome. Assuming we go for ``":"`` in the "port" position -- we could generate that string always using the numeric IDs. Or should we call getpwnam/getgrnam, and generate the names if we can get them? Comments welcome. 
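To illustrate the two options for the string in the "port" position (numeric IDs only, or names where they can be resolved), here is a minimal C sketch. cred_string() is a hypothetical helper and not part of VTCP. It assumes that the lookup starts from the numeric IDs, for which POSIX provides getpwuid() and getgrgid(), and it falls back to the number when no name is found.

```c
#include <grp.h>
#include <pwd.h>
#include <stdio.h>
#include <sys/types.h>

/*
 * Write "uid:gid" into buf. With want_names != 0, user and group names
 * are used where the lookup succeeds, and the numeric id otherwise.
 * Returns the snprintf()-style length, so a value >= len means the
 * result was truncated.
 * Note: getpwuid()/getgrgid() are not thread-safe; real code would
 * more likely use getpwuid_r()/getgrgid_r().
 */
int
cred_string(char *buf, size_t len, uid_t uid, gid_t gid, int want_names)
{
	const struct passwd *pw;
	const struct group *gr;
	char u[64], g[64];

	pw = want_names ? getpwuid(uid) : NULL;
	if (pw != NULL)
		snprintf(u, sizeof u, "%s", pw->pw_name);
	else
		snprintf(u, sizeof u, "%lu", (unsigned long)uid);

	gr = want_names ? getgrgid(gid) : NULL;
	if (gr != NULL)
		snprintf(g, sizeof g, "%s", gr->gr_name);
	else
		snprintf(g, sizeof g, "%lu", (unsigned long)gid);

	return (snprintf(buf, len, "%s:%s", u, g));
}
```

Called with want_names set to 0, uid 4711 and gid 100 produce "4711:100". With want_names set to 1 the result depends on the local user database.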
We'd have to decide what to do on a platform where we don't have a way (or haven't figured out how) to get the peer credentials. Generate ``":"`` or ``"?:?"``? Comments welcome again.

## VCL/VRT

Additions and changes to VCL and VRT involve:

* VCL variables ``*.ip``: ``client.ip``, ``local.ip``, ``server.ip``, ``remote.ip`` and ``beresp.backend.ip``
* VCL data type IP
* introducing VMOD std functions to return the uid and gid for the ``*.ip`` objects, as numbers or names
* extending ACLs to specify UDSen and optionally peer credentials
* VRT: types ``VCL_IP`` and ``struct vrt_backend``, and the VRT functions related to ``VCL_IP`` and ``suckaddr``

The ``*.ip`` variables essentially encapsulate suckaddrs, which we don't have to change. For the string conversion, if the suckaddr wraps a sockaddr_un, then return the UDS path.

Here again we have the problem that the names ``*.ip`` are inappropriate, since the value could be a UDS. Again I suggest the strategy of introducing a new name, in this case ``*.addr``, and deprecating the old names, but leaving the old names around until a future release.

``VCL_IP`` is just a suckaddr, so we don't have to change anything, but we have another inappropriate name for UDSen. The same goes for data type ``IP``. Again I suggest the strategy of introducing new names, ``ADDR`` and ``VCL_ADDR`` (``VCL_ADDR`` defined as exactly the same typedef as ``VCL_IP``), and deprecating the old names.

I suggest adding functions like these to VMOD std, with the obvious implementations:

* ``INT uid_number(ADDR addr, INT fallback)``
* ``STRING uid_name(ADDR addr, STRING fallback)``
* ``INT gid_number(ADDR addr, INT fallback)``
* ``STRING gid_name(ADDR addr, STRING fallback)``

Of course these would always return the fallbacks for non-UDS addresses.
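As a rough illustration of the fallback behaviour, the following sketch models the uid pair in plain C. ``struct addr`` is a hypothetical stand-in for the proposed ``ADDR`` type (the real thing would be a suckaddr plus whatever credentials were captured at accept time), and the signatures are simplified compared to what a VMOD would use. The name lookup uses POSIX ``getpwuid_r``. The gid pair would follow the same pattern with ``getgrgid_r``.

```c
#include <assert.h>
#include <pwd.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical stand-in for the proposed ADDR type. */
struct addr {
	int	is_uds;		/* nonzero if this is a UDS address */
	int	have_cred;	/* nonzero if peer credentials were obtained */
	uid_t	uid;
	gid_t	gid;
};

/* Numeric uid of the peer, or the fallback for non-UDS addresses. */
static long
uid_number(const struct addr *a, long fallback)
{
	if (a == NULL || !a->is_uds || !a->have_cred)
		return (fallback);
	return ((long)a->uid);
}

/*
 * User name of the peer, or the fallback if the address is not a UDS,
 * no credentials are known, or the uid has no passwd entry.
 * The returned name is stored in the caller's buffer.
 */
static const char *
uid_name(const struct addr *a, char *buf, size_t len, const char *fallback)
{
	struct passwd pw, *res = NULL;

	assert(buf != NULL && fallback != NULL);
	if (a == NULL || !a->is_uds || !a->have_cred)
		return (fallback);
	if (getpwuid_r(a->uid, &pw, buf, len, &res) != 0 || res == NULL)
		return (fallback);
	return (pw.pw_name);	/* points into buf */
}
```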
ACLs can be extended to include paths for a UDS and restrictions on the uid/gid:

```
acl foo {
    "/path/to/uds";
    "/path/with/a/*/wildcard";
    "/path/with/a/uid/restriction",uid=4711;
    "/path/with/more/r?strictions",uid=foo,gid=bar;
}
```

So we can: name UDS paths in an ACL, allow filename globbing, include restrictions on the uid and gid, and allow both numbers and names for uid/gid.

I'm not sure what to do about ``struct vrt_backend``, which currently has fields for IPv4 and IPv6 addresses, both as strings and suckaddrs. I doubt that it makes sense just to add the same fields for UDS addresses, since the point is that a backend may have both kinds of IP addresses, but it won't also have a UDS address at the same time.

We might have to introduce something like this:

```
union addr {
    struct {
        char *ipv4_addr;
        char *ipv6_addr;
        struct suckaddr *ipv4_suckaddr;
        struct suckaddr *ipv6_suckaddr;
    } ip;
    struct {
        char *path;
        struct suckaddr *uds_suckaddr;
    } uds;
};
```

... and then use the union type for the "address" field of the backend definition -- it's either an IP address, which can be one or both of IPv4 and IPv6, or a UDS. Comments welcome.
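One detail a bare union leaves open is how the reader knows which arm is valid. A possible answer, sketched here under the assumption that a discriminator field is acceptable, is to wrap the union in a struct with a tag. The names ``enum addr_kind``, ``struct backend_addr`` and ``backend_addr_label`` are made up for illustration and do not exist in VRT.

```c
#include <assert.h>
#include <stddef.h>

struct suckaddr;	/* opaque here, as it is to VRT users */

/* Hypothetical discriminator telling which arm of the union is valid. */
enum addr_kind { ADDR_IP, ADDR_UDS };

/* Hypothetical tagged wrapper around the union sketched above. */
struct backend_addr {
	enum addr_kind	kind;
	union {
		struct {
			const char		*ipv4_addr;
			const char		*ipv6_addr;
			const struct suckaddr	*ipv4_suckaddr;
			const struct suckaddr	*ipv6_suckaddr;
		} ip;
		struct {
			const char		*path;
			const struct suckaddr	*uds_suckaddr;
		} uds;
	} u;
};

/* Pick a printable label: the UDS path, or IPv4 in preference to IPv6. */
static const char *
backend_addr_label(const struct backend_addr *ba)
{
	assert(ba != NULL);
	switch (ba->kind) {
	case ADDR_UDS:
		return (ba->u.uds.path);
	case ADDR_IP:
		if (ba->u.ip.ipv4_addr != NULL)
			return (ba->u.ip.ipv4_addr);
		return (ba->u.ip.ipv6_addr);
	}
	return (NULL);
}
```

The tag costs one extra field, and in exchange callers can switch on ``kind`` instead of guessing from which pointers happen to be non-NULL.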
I think that the VRT functions that currently use ``VCL_IP`` and suckaddrs can be adapted either without changes or very straightforwardly, but again we'll want to introduce "addr" where "ip" currently appears in the names, and deprecate the old names: * ``VRT_acl_match``: use the ``VCL_ADDR`` type in the signature * ``VRT_ipcmp``: no change * ``VRT_IP_string``: introduce ``char *VRT_ADDR_string(VRT_CTX, VCL_ADDR)`` with the same function, and deprecate the old one From dridi at varni.sh Tue Apr 25 14:42:08 2017 From: dridi at varni.sh (Dridi Boukelmoune) Date: Tue, 25 Apr 2017 16:42:08 +0200 Subject: RFC for VIP17: unix domain sockets for listen and backend addresses In-Reply-To: References: <427ae252-6daa-d5ba-c7a5-f0b513ef1f2f@uplex.de> Message-ID: > # Synopsis > Allow unix domain sockets as a listen address for Varnish (``-a`` > option) and as addresses for backends, and for use in ACLs. Obtain > credentials of the peer process connected on a UDS, such as uid and > gid, for use in VCL. Except for ACLs, I find the idea compelling so far. I would even like to see UDS support for admin sockets (-T option). > # Why? > * Eliminate the overhead of TCP/loopback for connections with peers > that are colocated on a host with Varnish. Yes. > * Restrict who can send requests to Varnish by setting permissions on > the UDS path of the listen address. The whole point of UDSs IMO. > * (But see the discussion below about getting this right portably.) > * Make it possible for a backend peer to require restricted > credentials for the Varnish process by setting permissions on the UDS > path on which it listens. It is technically possible to implement a UDS backend if one is brave enough to re-implement all the VBE logic. So I'm strongly in favor of having this capability in varnishd. > * Peer credentials make it possible to: > * Make information about the peer available in VCL and the log. Why not, no opinion. 
> * Extend ACLs to make it possible to place further restrictions on > peers connecting to the listen address. I would use a regex instead of messing with ACLs. <...snip...> > I would like to make this contribution for the September 2017 release. > With the VIP I'd like to clarify: > > * Are there any changes planned for VTCP and VSA in the September > release that would make adding UDS to those interfaces less trivial > than it is now? I wouldn't mix UDS with VSA, there's probably room for a different solution. > * Every platform has a way to get peer credentials from a UDS, but > there's no standard and it's highly platform-dependent. So how do we > want to handle that? Maybe we could start by not having them, being able to use UDSs is already a huge win IMO. > * Additions/changes to VCL and other changes in naming, such as the > ``-a`` option and backend definitions. I don't think we need to change -a or -T, as long as we force absolute names we should be able to get away with the current syntax. An address starting with a slash (/) would denote a UDS. [1] > * If someone knows a reason why we shouldn't do this at all, this is > the place to say so. > > # How? > ## Address notation > I suggest that we require a prefix such as ``unix:`` to identify UDS > addresses (nginx uses ``unix:``, haproxy uses ``unix@``): > ``` > varnishd -a unix:/path/to/uds This should be enough: varnishd -a /path/to/uds > backend uds { .host = "unix:/path/to/uds"; } I would instead go for a .path field mutually exclusive with .host and .port, removing ambiguity at vcl.load-time (error messages etc). > ``` > That makes the interpretation unambiguous. We could simply interpret > paths as UDS addresses when they appear in those places, but then we > would need logic like: if the argument cannot be resolved as a host or > parsed as an IP address, then assume it's a path for UDS, but if the > path does not exist or cannot be accessed, then fail. So better to > just make it unambiguous. 
As I said earlier, I think a slash [1] is enough to remove ambiguity. > Parsing UDS addresses would be an extension of ``VSS_Resolver``. Not if we don't mix paths with IPs/domains > The name ``.host`` in a backend definition becomes a bit peculiar if > its value can also be a UDS (we will see a number or examples like > this). We could: > > * stay with the name ``.host``, and document the fact that it might > not identify a host in some cases > * replace ``.host`` with a name like ``.peer``, sacrificing backward > compatibility > * introduce ``.peer``, retain ``.host`` as a deprecated alias, and > remove ``.host`` in a future release > > I suggest the last option, comments welcome. Once again, I suggest we don't mix them up so that we don't need to break anything. I also find .peer ambiguous. > ``.port`` in a backend definition is already optional, and is > unnecessary for a UDS. Should it be an error to specify a port when a > UDS is specified, or should it be ignored? Comments welcome. As stated above, mutually exclusive with the .path field. > ## Access permissions on the listen address > For the ``-a`` address, I suggest an optional means of specifying who > can access the UDS: > ``` > varnishd -a unix:/path/to/uds:uid=foo,gid=bar > ``` > There's an issue here in that the separator (``:`` in the example) > could not appear in any UDS path. We might just have to forbid a > certain character in UDS paths. Fortunately we don't have a such a > problem with backend addresses (which are generated by another server, > so we have less freedom to impose restrictions on the path names). I would use the comma separator like -j and -s options for jails and storage backends. Possibly named parameters like in -j so that order doesn't matter. But that means breaking the syntax so that the protocol (HTTP/1 or PROXY) requires a name too. 
Example: varnishd -a /path/to/socket,uid=...,gid=...,proto=PROXY <...snip...> > If no access restrictions were requested, then don't manipulate > ownership, let bind create the UDS, and set its permissions to 0666. Wouldn't it be based on umask instead? <...snip...> > A minor issue is that the name ``VTCP`` (all of the ``VTCP_*`` > functions, the source name ``vtcp.c``, etc.) becomes a misnomer if it > also covers UDS. We could just live with that. OTOH a single git > commit could change it all at once, although we might have to bikeshed > over a new name (``VSOCK``?). VIPC? :p <...snip...> > So it appears that the least common denominator is EUID and EGID > (assuming that's what you get in Linux). I suggest that we just go > with that, to be used as described below. That, or nothing for starters. UDSs, huge win already. <...snip...> > ## VCL/VRT > Additions and changes to VCL and VRT involve: > * VCL variables ``*.ip``: ``client.ip``, ``local.ip``, ``server.ip``, > ``remote.ip`` and ``beresp.backend.ip`` Or we could leave them alone and introduce a new {server,local}.path field and depending on the type of connection one of .ip and .path is null. It is up to the VCL code to check that, and existing operations such as matching ACLs would fail with a null IP. > * VCL data type IP Or a separate type altogether (eg. UNIX). > * introducing VMOD std functions to return the uid and gid for the > ``*.ip`` objects, as numbers or names Yes, similar to std.port > * extending ACLs to specify UDSen and optionally peer credentials Not compelling. > * VRT: types ``VCL_IP`` and ``struct vrt_backend``, and the VRT > functions related to ``VCL_IP`` and ``suckaddr`` > > The ``*.ip`` variables essentially encapsulate suckaddrs, which we > don't have to change. For the string conversion, if the suckaddr wraps > a sockaddr_un, then return the UDS path. Again, not sure we should mix them up. 
> Here again we have the problem that the names ``*.ip`` are > inappropriate, since the value could be a UDS. Again I suggest the > strategy of introducing a new name, in this case ``*.addr``, and > deprecating the old names, but leaving the old names around until a > future release. Again, not mixing them up won't leave us with deprecated syntax. > ``VCL_IP`` is just a suckaddr, so we don't have to change anything, > but we have another inappropriate name for UDSen. The same goes for > data type ``IP``. Again I suggest the strategy of introducing new > names, ``ADDR`` and ``VCL_ADDR`` (``VCL_ADDR`` defined as exactly the > same typedef as ``VCL_IP``), and deprecating the old names. bis repetita > I suggest adding functions like these to VMOD std, with the obvious > implementations: > * ``INT uid_number(ADDR addr, INT fallback)`` > * ``STRING uid_name(ADDR addr, STRING fallback)`` > * ``INT gid_number(ADDR addr, INT fallback)`` > * ``STRING gid_name(ADDR addr, STRING fallback)`` So maybe we should not support uid/gid for starters :) > Of course these would always return the fallbacks for non-UDS addresses. > > ACLs can be extended to include paths for a UDS and restrictions on the uid/gid: > ``` > acl foo { > "/path/to/uds"; > "/path/with/a/*/wildcard"; > "/path/with/a/uid/restriction",uid=4711; > "/path/with/more/r?strictions",uid=foo,gid=bar; > } > ``` > > So we can: name UDS paths in an ACL, allow filename globbing, include > restrictions on the uid and gid, and allow both numbers and names for > uid/gid. Not compelling. > I'm not sure what to do about ``struct vrt_backend``, which currently > has fields for IPv4 and IPv6 addresses, both as strings and suckaddrs. > I doubt that it makes sense just to add the same fields for UDS > addresses, since the point is that a backend may have both kinds of IP > addresses, but it won't also have a UDS address at the same time. 
> > We might have to introduce something like this: > ``` > union addr { > struct { > char *ipv4_addr; > char *ipv6_addr; > struct suckaddr *ipv4_suckaddr; > struct suckaddr *ipv6_suckaddr; > } ip; > struct { > char *path; > struct suckaddr *uds_suckaddr; > } uds; > }; > ``` > ... and then use the union type for the "address" field of the backend > definition -- it's either an IP address, which can be one or both of > IPv4 and IPv6, or a UDS. Comments welcome. Or keep those fields mutually exclusive and handle the differences in the VBE subsystem for the backend side. On the client side I haven't given much thought to the mechanics. > I think that the VRT functions that currently use ``VCL_IP`` and > suckaddrs can be adapted either without changes or very > straightforwardly, but again we'll want to introduce "addr" where "ip" > currently appears in the names, and deprecate the old names: > * ``VRT_acl_match``: use the ``VCL_ADDR`` type in the signature > * ``VRT_ipcmp``: no change > * ``VRT_IP_string``: introduce ``char *VRT_ADDR_string(VRT_CTX, > VCL_ADDR)`` with the same function, and deprecate the old one I do not need to say again that I'm not thrilled by the idea of mixing them up :) As promised I shared my feedback. This looks like more work than I anticipated, but I'd really love to see UDS support landing in Varnish. Cheers, Dridi [1] we can't have a slash in a domain name, right? 
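The notation suggested in this reply (a leading slash denotes a UDS, followed by comma-separated named parameters) could be parsed along these lines. This is only a sketch: ``struct listen_spec`` and ``parse_listen_arg`` are hypothetical names, the buffer sizes are arbitrary, and the accepted keys ``uid``, ``gid`` and ``proto`` are taken from the example in the message, not from varnishd's current ``-a`` parser.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical parsed form of a "-a" argument. */
struct listen_spec {
	int	is_uds;		/* 1 if the address starts with '/' */
	char	addr[256];	/* UDS path, or host[:port] left unparsed */
	char	uid[32];
	char	gid[32];
	char	proto[16];
};

/* Returns 0 on success, -1 on a malformed argument. */
static int
parse_listen_arg(const char *arg, struct listen_spec *ls)
{
	char tmp[512], *p, *next, *eq;

	assert(arg != NULL && ls != NULL);
	memset(ls, 0, sizeof *ls);
	if (strlen(arg) >= sizeof tmp)
		return (-1);
	strcpy(tmp, arg);

	/* First comma-separated field is the address. */
	p = tmp;
	next = strchr(p, ',');
	if (next != NULL)
		*next++ = '\0';
	if (*p == '\0' || strlen(p) >= sizeof ls->addr)
		return (-1);
	strcpy(ls->addr, p);
	ls->is_uds = (p[0] == '/');

	/* Remaining fields must be name=value, in any order. */
	while (next != NULL) {
		p = next;
		next = strchr(p, ',');
		if (next != NULL)
			*next++ = '\0';
		eq = strchr(p, '=');
		if (eq == NULL)
			return (-1);
		*eq++ = '\0';
		if (strcmp(p, "uid") == 0)
			snprintf(ls->uid, sizeof ls->uid, "%s", eq);
		else if (strcmp(p, "gid") == 0)
			snprintf(ls->gid, sizeof ls->gid, "%s", eq);
		else if (strcmp(p, "proto") == 0)
			snprintf(ls->proto, sizeof ls->proto, "%s", eq);
		else
			return (-1);
	}
	return (0);
}
```

Note that this sketch rejects an unnamed trailing protocol, which is the syntax break mentioned above, and that a comma can then no longer appear in a UDS path.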
From dridi at varni.sh Thu Apr 27 07:16:09 2017 From: dridi at varni.sh (Dridi Boukelmoune) Date: Thu, 27 Apr 2017 09:16:09 +0200 Subject: [master] 3d60db6 Sort out some signed/unsigned mismatches In-Reply-To: References: Message-ID: On Thu, Apr 27, 2017 at 8:40 AM, Poul-Henning Kamp wrote: > > commit 3d60db6b8ec5de2337e52b73163305ae7c5c0094 > Author: Poul-Henning Kamp > Date: Thu Apr 27 06:39:42 2017 +0000 > > Sort out some signed/unsigned mismatches Hello, We need to bump the soname, the whole commit is an ABI breakage. My suggestion is to merge all symbols in libvarnishapi.map in a brand new 2.0 version and ditch all the previous 1.x groups of symbols. Dridi From dridi at varni.sh Thu Apr 27 07:25:06 2017 From: dridi at varni.sh (Dridi Boukelmoune) Date: Thu, 27 Apr 2017 09:25:06 +0200 Subject: 6.0 planning planning Message-ID: Hello, Next Monday we're supposed to find out what we want to plan for September: https://github.com/varnishcache/varnish-cache/issues/2318 Next Monday is also May 1st, a holiday. Does Tuesday 2nd at the usual bugwash time work for everyone?
Cheers, Dridi From phk at phk.freebsd.dk Thu Apr 27 09:12:46 2017 From: phk at phk.freebsd.dk (Poul-Henning Kamp) Date: Thu, 27 Apr 2017 09:12:46 +0000 Subject: [master] 3d60db6 Sort out some signed/unsigned mismatches In-Reply-To: References: Message-ID: <73655.1493284366@critter.freebsd.dk> -------- In message , Dridi Boukelmoune writes: >On Thu, Apr 27, 2017 at 8:40 AM, Poul-Henning Kamp wrote: >> >> commit 3d60db6b8ec5de2337e52b73163305ae7c5c0094 >> Author: Poul-Henning Kamp >> Date: Thu Apr 27 06:39:42 2017 +0000 >> >> Sort out some signed/unsigned mismatches > >Hello, > >We need to bump the soname, the whole commit is an ABI breakage. > >My suggestion is to merge all symbols in libvarnishapi.map in a brand >new 2.0 version and ditch all the previous 1.x groups of symbols. We'll talk about that as we approach the next release. There will be a lot more API work before then. -- Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 phk at FreeBSD.ORG | TCP/IP since RFC 956 FreeBSD committer | BSD since 4.3-tahoe Never attribute to malice what can adequately be explained by incompetence. From phk at phk.freebsd.dk Thu Apr 27 09:13:06 2017 From: phk at phk.freebsd.dk (Poul-Henning Kamp) Date: Thu, 27 Apr 2017 09:13:06 +0000 Subject: 6.0 planning planning In-Reply-To: References: Message-ID: <73668.1493284386@critter.freebsd.dk> -------- In message , Dridi Boukelmoune writes: >Next Monday we're supposed to find out what we want to plan for September: > >https://github.com/varnishcache/varnish-cache/issues/2318 > >Next Monday is also May 1st, a holiday. Does Tuesday 2nd at the >usual bugwash time work for everyone? Works for me. -- Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 phk at FreeBSD.ORG | TCP/IP since RFC 956 FreeBSD committer | BSD since 4.3-tahoe Never attribute to malice what can adequately be explained by incompetence. 
From dridi at varni.sh Thu Apr 27 09:16:13 2017 From: dridi at varni.sh (Dridi Boukelmoune) Date: Thu, 27 Apr 2017 11:16:13 +0200 Subject: [master] 3d60db6 Sort out some signed/unsigned mismatches In-Reply-To: <73655.1493284366@critter.freebsd.dk> References: <73655.1493284366@critter.freebsd.dk> Message-ID: > We'll talk about that as we approach the next release. There will > be a lot more API work before then. I already mentioned it in the #2318 ticket, I figured it belonged to the 6.0 planning ;)