Filter request- or response-headers with VMOD re2 sets
Why filter headers at all?
When we care about security, less is often more. If we keep malicious headers from reaching the backends, they cannot be used to exploit security issues.
In general, there are two approaches: a denylist and an allowlist. Both can be implemented efficiently using vmod_re2.
The denylist approach is (way) less secure, but it is used by most commercial WAFs and CDNs with WAF features, because it needs less customization. The allowlist approach is much more restrictive and provides the best security.
Denylist with plain vcl
But before we dive into vmod_re2 let’s warm up with a very simple denylist-example using plain vcl:
sub vcl_recv {
    unset req.http.Chaotic;
    unset req.http.Evil;
    if (req.http.Wicked ~ "(?i)^wicked$") {
        unset req.http.Wicked;
    }
    ...
All unwanted headers are removed. So far so good. But this doesn’t scale: with
thousands of patterns it gets very slow, because the patterns are checked
sequentially. You also can’t implement an allowlist approach with plain vcl.
Using hdr_filter from vmod_re2 solves both problems.
Tip
To ensure they are working as expected, vtc files for use with
vinyltest are provided for all examples in this tutorial.
Download the vtc file for this first example
hdr_filter-simple-denylist-plain-vcl.vtc and run it using: vinyltest
<testfile.vtc>.
Make sure your PATH contains the vinyld binary. You can easily make
your own modifications to try out stuff. The reference manual for vtc is
available here.
Denylist with vmod_re2 and hdr_filter
So, to get started with vmod_re2 and hdr_filter, let’s do the exact same
thing as before:
import re2;
sub vcl_init {
    new deny = re2.set(anchor=start, case_sensitive=false);
    deny.add("chaotic:");
    deny.add("Evil:");
    deny.add("Wicked: wicked$");
}
sub vcl_recv {
    deny.hdr_filter(req, false); # "false" makes this filter a denylist
    ...
Tip
Download full example as vtc: hdr_filter-simple-denylist.vtc
Two parameters are used for the set.
anchor=start puts an implicit anchor ^ at the beginning of each regex. It is equivalent to:
    new deny = re2.set(case_sensitive=false);
    deny.add("^chaotic:");
    deny.add("^Evil:");
    deny.add("^Wicked: wicked$");
case_sensitive=false is very useful for our use case, because HTTP header names are case-insensitive anyway and attackers could easily bypass our filters otherwise.
Important
Note that the request is not rejected when headers matching the denylist are
received; the matching headers are simply removed, as if unset had been called.
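If a particular header should cause the request to be refused rather than silently cleaned, that has to be done separately. The following is only a sketch, assuming a 400 response is the desired reaction to an Evil header; it is a fragment without a backend definition, and the status code and reason text are illustrative choices, not something hdr_filter provides:

```vcl
import re2;

sub vcl_init {
    new deny = re2.set(anchor=start, case_sensitive=false);
    deny.add("chaotic:");
    deny.add("Wicked: wicked$");
}

sub vcl_recv {
    # Plain vcl check: refuse the request if an Evil header is present.
    # (Assumption: 400 is the reaction wanted for this header.)
    if (req.http.Evil) {
        return (synth(400, "Bad Request"));
    }
    # Everything matching the denylist is removed without rejecting.
    deny.hdr_filter(req, false);
}
```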
Allowlists with vmod_re2 and hdr_filter
Now let’s proceed to what we really want to do: filter out all headers except for a list of explicitly allowed headers:
import re2;
sub vcl_init {
    new allow = re2.set(anchor=start, case_sensitive=false);
    allow.add("Host:");
    allow.add("If-(Modified-Since|None-Match):");
    allow.add("Non-Standard-Header:");
}
sub vcl_recv {
    allow.hdr_filter(req, true); # "true" makes this filter an allowlist
    ...
Tip
Download full example as vtc: hdr_filter-simple-allowlist.vtc
The second parameter for hdr_filter may also be omitted, because true
for allowlist is the default anyway. The allowlist examples that follow omit
the second parameter.
Ok, now we filter out everything except the added patterns, which makes it a
really restrictive and secure setup. Currently, we do this in vcl_recv,
which is pretty early.
Filter in backend_fetch
There can be reasons to delay the filtering until vcl_backend_fetch,
just before the request is sent to the backends, or to filter a second
time at this point. For example:
the client sends some headers we want to honour within vcl, but there is no need to send it to our backends
we might set some artificial request headers in vcl for internal purposes, which we also don’t want to send to our backends
we want to modify the cache-key in vcl_hash based on headers the backends don’t need
Fortunately hdr_filter can also be used on the backend side:
import re2;
sub vcl_init {
    new allow_recv = re2.set(anchor=start, case_sensitive=false);
    allow_recv.add("Host:");
    allow_recv.add("If-(Modified-Since|None-Match):");
    allow_recv.add("Non-Standard-Header:");
    new allow_backend_fetch = re2.set(anchor=start, case_sensitive=false);
    allow_backend_fetch.add("Host:");
    allow_backend_fetch.add("If-(Modified-Since|None-Match):");
}
sub vcl_recv {
    allow_recv.hdr_filter(req);
    ...
}
sub vcl_backend_fetch {
    allow_backend_fetch.hdr_filter(bereq);
    ...
}
Tip
Download full example as vtc:
hdr_filter-simple-allowlist-backend_fetch.vtc
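The cache-key case from the list of reasons can be sketched like this. It is a minimal fragment without a backend definition; using Non-Standard-Header as part of the cache key is an assumption made purely for illustration:

```vcl
import re2;

sub vcl_init {
    # Client side: keep Non-Standard-Header so vcl can still read it.
    new allow_recv = re2.set(anchor=start, case_sensitive=false);
    allow_recv.add("Host:");
    allow_recv.add("Non-Standard-Header:");

    # Backend side: in this sketch the backends only need Host.
    new allow_backend_fetch = re2.set(anchor=start, case_sensitive=false);
    allow_backend_fetch.add("Host:");
}

sub vcl_recv {
    allow_recv.hdr_filter(req);
}

sub vcl_hash {
    # The header passed the filter in vcl_recv, so it can vary the
    # cache key. There is no return here, so the built-in vcl_hash
    # still runs afterwards.
    if (req.http.Non-Standard-Header) {
        hash_data(req.http.Non-Standard-Header);
    }
}

sub vcl_backend_fetch {
    # Remove the header again just before the backend request is sent.
    allow_backend_fetch.hdr_filter(bereq);
}
```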
Patterns in detail
Now let’s look in more detail at some examples of patterns:
fixed header name:
    allow.add("Host:");
regular expressions for header names:
    allow.add("X-Forwarded-(Host|Proto):");
prefix matches on header names:
    allow.add("Accept");
Notice the missing colon! This matches all headers starting with
Accept and any value, for example Accept: xyz, Accept-Encoding: gzip and Accept-Language: en-US
regular expression for header name and value:
    allow.add("Content-Type: \w+/\w+(; charset=\w+)?$");
Tip
Download examples as vtc: hdr_filter-simple-allowlist-pattern.vtc
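A prefix pattern such as "Accept" also lets through any other header whose name merely starts with Accept. Where that is too broad, the wanted variants can be listed explicitly. A minimal sketch (fragment without a backend definition), assuming only Accept, Accept-Encoding and Accept-Language are wanted:

```vcl
import re2;

sub vcl_init {
    new allow = re2.set(anchor=start, case_sensitive=false);
    # The colon ends the header name, so a header such as
    # "Accept-Foo: x" does not match this pattern.
    allow.add("Accept(-Encoding|-Language)?:");
    allow.add("Host:");
}

sub vcl_recv {
    allow.hdr_filter(req);
}
```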
Response header filtering
So far we’ve been looking at filtering request headers of incoming requests, which is certainly the most important use case from a security perspective.
But vmod_re2 is also capable of response header filtering. You might need this in some cases, for example:
For misbehaving backends, which are not under your control, sending response headers you don’t want, you can filter in vcl_backend_response using .hdr_filter(beresp).
Response headers from backends steering Vinyl Cache behaviour, which should not leak to clients, can either be filtered in vcl_backend_response after having been evaluated using .hdr_filter(beresp), or in vcl_deliver using .hdr_filter(resp).
To prevent information disclosure through debug headers or other internal headers, filter in vcl_deliver using .hdr_filter(resp).
To summarize all options:
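| | client side | backend side |
|---|---|---|
| request | vcl_recv: .hdr_filter(req) | vcl_backend_fetch: .hdr_filter(bereq) |
| response | vcl_deliver: .hdr_filter(resp) | vcl_backend_response: .hdr_filter(beresp) |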
Response header filtering example
Let’s take a look at a simple denylist example for response headers. Of course you could use allowlists, but in contrast to request header filtering, for response headers using denylists can be more practical if the backends can be trusted:
import re2;
sub vcl_init {
    new deny_beresp = re2.set(anchor=start, case_sensitive=false);
    deny_beresp.add("Some-debug-header-send-by-backends: ");
}
sub vcl_backend_response {
    deny_beresp.hdr_filter(beresp, false); # false makes this filter a denylist
}
Tip
Download example as vtc: hdr_filter-simple-response-denylist.vtc
Depending on your scenario you could do the same with hdr_filter(resp,
false) in vcl_deliver; the vtc file includes an example.
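A client-side variant might look roughly like the following fragment (no backend definition). The header names are hypothetical placeholders, not taken from the vtc file:

```vcl
import re2;

sub vcl_init {
    new deny_resp = re2.set(anchor=start, case_sensitive=false);
    # Hypothetical internal headers, chosen for illustration only.
    deny_resp.add("X-Internal-Debug:");
    deny_resp.add("X-Backend-Name:");
}

sub vcl_deliver {
    # vcl_deliver runs for every delivered response, cache hits
    # included, so the headers stay in the cached object and are
    # only hidden from clients.
    deny_resp.hdr_filter(resp, false);
}
```

Filtering here rather than in vcl_backend_response keeps the headers available to vcl on later cache hits, which matters if they steer behaviour.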
And yes, if you only use denylists and only a bunch of headers like in this
example, you could also use plain vcl unset, but the more headers you want
to filter, the more useful vmod_re2 gets in addition to being more efficient.
And if you want to use allowlists there is no alternative anyway.
After this excursion to response header filtering let’s get back to request header filtering.
Which headers should I put on my allowlist without breaking stuff?
Let’s look at how to build an allowlist for request headers. Here’s some practical help for analysing your live traffic:
# for a quick overview, just dump the content of the VSL buffer:
vinyllog -d -g request -c -i Reqheader \
    -q 'RespStatus >= 200 and RespStatus <=399' \
    -w /tmp/reqheader
# for more data let vinyllog run for longer, for example for one day:
timeout 1d vinyllog -g request -c -i Reqheader \
    -q 'RespStatus >= 200 and RespStatus <=399' \
    -w /tmp/reqheader
# you may also add a filter to only trusted IP addresses not sending malicious headers:
timeout 1d vinyllog -g request -c -i Reqheader \
    -q 'RespStatus >= 200 and RespStatus <=399 and ReqStart ~"^10\.72\."' \
    -w /tmp/reqheader
We group by request to grab ReqHeader while still being able to filter the
requests to just those with status >=200 and <=399 responses. This can help
to reduce the number of illegal headers in the sample. For example, if you have
Apache with mod_security running, it will respond to requests classified as
malicious with a 403 status.
If you let vinyllog run for a longer time, please make sure you have enough
space in /tmp/ or choose another location.
When it’s time for the evaluation, you might want to run:
vinyllog -r /tmp/reqheader | awk '$2 == "ReqHeader" {print tolower($3)}' |
    head -10000 | sort | uniq -c | sort -n
You’ll get a histogram like this, the first column being the number of occurrences of the respective headername in your logfile:
...
109 authorization:
118 priority:
146 content-length:
168 sec-fetch-dest:
168 sec-fetch-site:
195 content-type:
219 sec-fetch-mode:
224 cookie:
257 referer:
312 accept-language:
394 connection:
423 accept:
490 user-agent:
555 host:
555 via:
600 accept-encoding:
This might serve as a good starting point to build your own allowlist.
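As a rough sketch, the rows visible in the histogram above could be translated into a set like the one below (fragment without a backend definition). It covers only those rows, groups related names into one pattern where that is straightforward, and is meant to be reviewed and pruned for your own site rather than used as-is:

```vcl
import re2;

sub vcl_init {
    new allow = re2.set(anchor=start, case_sensitive=false);
    allow.add("Host:");
    allow.add("Accept(-Encoding|-Language)?:");
    allow.add("User-Agent:");
    allow.add("Referer:");
    allow.add("Cookie:");
    allow.add("Authorization:");
    allow.add("Content-(Type|Length):");
    allow.add("Sec-Fetch-(Dest|Mode|Site):");
    allow.add("Priority:");
    allow.add("Connection:");
    allow.add("Via:");
}

sub vcl_recv {
    # Allowlist mode is the default, so the second parameter is omitted.
    allow.hdr_filter(req);
}
```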
Documentation / Further Reading
The current documentation for vmod re2 is available using man vmod_re2 or
online here.
Contributing
If you found any mistake in this tutorial, I’d like to cite Poul-Henning: “We’d absolutely love to have you help improve the project homepage, send us pull requests!” https://code.vinyl-cache.org/vinyl-cache/homepage