Faking Surrogate Cache-Keys for Nginx Plus

Occasionally you are forced to use a technology for a job for which you would otherwise choose a better alternative. One such case is using Nginx as a caching solution.

HTTP Caching and Tools

There are a variety of reasons why you would want to use Nginx or Nginx Plus in your environment. It is an awesome reverse proxy and helps when you want to build a scalable, high-performance web architecture. I especially like how it composes documents from various servers through its implementation of Server Side Includes (SSI).

And when this awesome piece of software is in place, it is often tempting to use all the capabilities it provides. HTTP caching is one of them: the declarative syntax makes it easy to enable simple, straightforward caching for your resources. It can also be hard to argue for a standalone HTTP cache in front of your Nginx setup when Nginx itself would do the job. Additional infrastructure always comes at a cost, and while Varnish, for instance, might be a much better caching solution, adding it to the environment is not always feasible.

First, you define a cache location:

proxy_cache_path /tmp/nginx levels=1:2 keys_zone=my_cache:10m max_size=200m inactive=60m use_temp_path=off;

and then you can store your resources in that cache:

location / {
  # use the zone declared via keys_zone in proxy_cache_path
  proxy_cache my_cache;
  proxy_cache_key $my_cache_key;
  add_header X-Cache-Status $upstream_cache_status;
  ...
}

Repeated requests to our location are now served from the cache, provided the Cache-Control headers of the response are set correctly.

curl -i localhost:8081/my-app/de/my-resource/4567
=>
HTTP/1.1 200 OK
Cache-Control: max-age=300, public
X-Cache-Status: MISS

curl -i localhost:8081/my-app/de/my-resource/4567
=>
HTTP/1.1 200 OK
Cache-Control: max-age=300, public
X-Cache-Status: HIT

Cache Purging

Let’s take a short break and discuss the basics of cache purging before we proceed. You can skip this section if you are already familiar with it.

You can (and in my opinion should) use the Cache-Control header of your responses to control how long a resource may be served from the cache. While this is a good start, some resources become invalid from one moment to the next.

Imagine you provide a website about book authors and their publications. Each page of your service describes an author and lists the publications the author has written so far. In principle, those pages can be cached for eternity, especially for authors who will never publish anything again (☠️). Let’s imagine a service with an API like this:

curl -i http://my-server.de/authors/douglas-adams
=> Content-Type: text/plain;

Name: Douglas Adams
Publications:
- Dirk Gently's Holistic Detective Agency
- The Long Dark Tea-Time of the Soul
...

Let’s assume for now that these pages never change, so that we can add the appropriate Cache-Control header to our responses:

Cache-Control: public,max-age=31536000,immutable

This header tells our Nginx that this resource may be served directly from our cache for a full year (31536000 seconds), because we declared that it will never change again.
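To make the numbers in that header tangible, here is a small Python sketch that splits a Cache-Control value into its directives and converts max-age into days. The function name is made up for this illustration, and the parsing is deliberately simplistic (it does not handle quoted values).

```python
def parse_cache_control(header: str) -> dict:
    """Split a Cache-Control header value into a directive dictionary."""
    directives = {}
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, value = part.partition("=")
        # Directives without a value (e.g. "public") are stored as True.
        directives[name.strip().lower()] = value.strip() or True
    return directives


directives = parse_cache_control("public,max-age=31536000,immutable")
max_age_days = int(directives["max-age"]) / 86400  # 86400 seconds per day

print(directives)
print(max_age_days)  # 365.0
```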

But this assumption only holds until one of the more hardworking specimens publishes a new book. At that very moment, the page for this author is outdated. Imagine a more beautiful world where Douglas Adams is still alive and has just published a new book. If we want our service to reflect this change, we need to remove that entry from our cache.

Besides clearing the complete cache for all authors, some caches offer more selective purging. Nginx, or rather Nginx Plus, can delete a specific resource from the cache.

By extending the basic Nginx configuration described above, we can enable this feature easily (for more information, please have a look at this guide):

map $request_method $purge_method {
  PURGE 1;
  default 0;
}

location / {
  proxy_cache_key $my_cache_key;
  add_header X-Cache-Status $upstream_cache_status;
  proxy_cache_purge $purge_method;
}

With this in place, a simple HTTP request will purge the page of Douglas Adams from the cache of our service.

curl -X PURGE -D - "http://my-server.de/authors/douglas-adams"

This is pretty awesome and powerful. But compared to other cache implementations, this approach is very basic.

Resource Variants

A common use case for services like ours is that we do not provide just a single website based on our author information, but many. Just imagine that our website is available in several languages. This introduces a new dimension to our API, where each language is reflected in the path as a language tag such as en-GB, that is, an ISO 639 language code combined with an ISO 3166 region code:

curl http://my-server.de/en-GB/authors/douglas-adams
=>
Name: Douglas Adams
Publications:
- Dirk Gently's Holistic Detective Agency
- The Long Dark Tea-Time of the Soul

### or

curl http://my-server.de/de-DE/authors/douglas-adams
=>
Name: Douglas Adams
Publications:
- Dirk Gently's holistische Detektei
- Der lange dunkle Fünfuhrtee der Seele

When our author now publishes a new book, we need to purge multiple pages from our cache like this:

curl -X PURGE -D - "http://my-server.de/en-GB/authors/douglas-adams"
curl -X PURGE -D - "http://my-server.de/de-DE/authors/douglas-adams"
...

Purging content variants is a widespread problem, and the number of variants increases quickly when you add more dimensions. Imagine different results for different content types, e.g. text/html, application/json and so on.
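How quickly the variants multiply can be sketched in a few lines of Python. The two lists are assumptions based on the examples in this post, and the helper name is hypothetical: every additional dimension multiplies the number of cache entries that have to be purged for a single author.

```python
from itertools import product

# Assumed dimensions, taken from the examples in this post.
LANGUAGES = ["en-GB", "de-DE"]
CONTENT_TYPES = ["text/html", "application/json"]


def variants(author: str) -> list:
    """List every (path, content type) combination cached for one author."""
    return [
        (f"/{language}/authors/{author}", content_type)
        for language, content_type in product(LANGUAGES, CONTENT_TYPES)
    ]


to_purge = variants("douglas-adams")
print(len(to_purge))  # 2 languages x 2 content types = 4
for path, content_type in to_purge:
    print(path, content_type)
```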

To avoid deleting every cache entry manually, there are different strategies for purging a list of items from the cache. Nginx provides a simple but restricted mechanism: you can append a * as a wildcard to your purge request, which then purges all entries matched by that wildcard. But as the wildcard can only be appended to the purge request, we are a little bit stuck here: the * wildcard cannot be placed in the middle or at the beginning of our URI.

As you can see in our API, the dynamic path segment of our URI is not at the end of our resource.

In a naive approach, this strategy does not help us much. More explicitly, we would still need to delete all author pages of a language, even if only one has changed, and repeat that for every language:

curl -X PURGE -D - "http://my-server.de/en-GB/authors/*"
curl -X PURGE -D - "http://my-server.de/de-DE/authors/*"
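The limitation can be illustrated with a toy model of the cache in Python. This is a simplified sketch and not how Nginx implements purging: the purge function, the plain-string keys and the second author are assumptions made for this example, and a trailing * is modelled as a simple prefix match.

```python
def purge(cache: dict, key: str) -> list:
    """Remove matching entries and return their keys.

    A trailing '*' is treated as a prefix wildcard; any other key,
    including one with '*' in the middle, must match exactly.
    """
    if key.endswith("*"):
        prefix = key[:-1]
        doomed = [k for k in cache if k.startswith(prefix)]
    else:
        doomed = [k for k in cache if k == key]
    for k in doomed:
        del cache[k]
    return sorted(doomed)


cache = {
    "my-server.de/en-GB/authors/douglas-adams": "...",
    "my-server.de/de-DE/authors/douglas-adams": "...",
    "my-server.de/en-GB/authors/terry-pratchett": "...",
}

# A wildcard in the middle matches nothing in this model.
middle = purge(cache, "my-server.de/*/authors/douglas-adams")
# A trailing wildcard also removes the unrelated English author page.
trailing = purge(cache, "my-server.de/en-GB/authors/*")

print(middle)
print(trailing)
print(sorted(cache))
```

In this model the German page of Douglas Adams survives, while an unrelated English page is purged along with his.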

Surrogate Keys / Tags

A better strategy is provided by other cache implementations. Those caches allow you to specify keys or tags alongside a response, which makes it possible to identify and group content variants semantically.

Imagine that we could provide a tag like author:douglas-adams to all documents that are related to Douglas Adams in our service. Once Douglas makes it back to publish his new masterpiece, we could purge all items by providing this tag to the cache-purger and all related items could be deleted.
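The idea behind such tags can be sketched as a tiny in-memory cache in Python. This is purely illustrative and not the API of Varnish or any other product; the class and method names are invented for this example.

```python
from collections import defaultdict


class TaggedCache:
    """Toy cache that remembers which entries carry which tag."""

    def __init__(self):
        self._entries = {}
        self._tags = defaultdict(set)

    def store(self, key: str, body: str, tags: list) -> None:
        self._entries[key] = body
        for tag in tags:
            self._tags[tag].add(key)

    def purge_tag(self, tag: str) -> list:
        """Remove every entry stored with the given tag."""
        keys = self._tags.pop(tag, set())
        for key in keys:
            self._entries.pop(key, None)
        return sorted(keys)

    def __contains__(self, key: str) -> bool:
        return key in self._entries


cache = TaggedCache()
cache.store("/en-GB/authors/douglas-adams", "...", ["author:douglas-adams"])
cache.store("/de-DE/authors/douglas-adams", "...", ["author:douglas-adams"])
cache.store("/en-GB/authors/terry-pratchett", "...", ["author:terry-pratchett"])

purged = cache.purge_tag("author:douglas-adams")
print(purged)
```

One purge call removes both language variants, no matter where the language appears in the URI, and leaves the other author untouched.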

Sadly this is not really possible in Nginx. But, as mentioned above, sometimes you have no choice or maybe you do not need the full power of real surrogate keys.

In this case, the basic * wildcard approach of Nginx can bring us pretty close to what Varnish et al. offer.

Fake your Surrogates

The exact sentence in the Nginx documentation that describes our troubles can also be our way out (see the proxy_cache_purge directive for more details):

If the cache key of a purge request ends with an asterisk (“*”), all cache entries matching the wildcard key will be removed from the cache

By default, this means that the wildcard applies to the end of the URI, because the URI (including its arguments) is the final element of the default cache key. According to the documentation, the default cache key is close to this:

proxy_cache_key $scheme$proxy_host$uri$is_args$args;

As the documentation states, the asterisk is matched at the end of the pattern. Because it is matched not against the URI but against the cache key that we provide, we have a simple but powerful solution to our problem. The cache key is something we can change easily: we can manipulate its contents and, more importantly, the order in which certain segments appear.

Let us stick to the API example of authors and their publications as described above. Our variants are built on the dimension of the different languages for an author page.

http://my-server.de/{LANGUAGE}/authors/douglas-adams

First, we check the incoming request, or more accurately the $request_uri Nginx variable, and verify whether it ends with /*, that is, with a * wildcard as its last path segment. We use a simple map directive to retrieve this information.

map $request_uri $has_wildcard_elem {
  ~^.*\/\*$ 1;
  default 0;
}

This binds the value 0 or 1 to the variable $has_wildcard_elem.
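To see which request URIs this map flags, the regular expression can be tried out in Python. This is only a sketch: the function name is hypothetical, and Python's re module stands in for the PCRE engine that Nginx uses, which should behave the same way for a pattern this simple.

```python
import re

# Same pattern as in the map directive: anything, followed by "/*" at the very end.
WILDCARD_PATTERN = re.compile(r"^.*/\*$")


def has_wildcard_elem(request_uri: str) -> int:
    """Return 1 if the URI ends with '/*', else 0 (the map's default)."""
    return 1 if WILDCARD_PATTERN.search(request_uri) else 0


print(has_wildcard_elem("/de-DE/authors/douglas-adams/*"))  # 1
print(has_wildcard_elem("/de-DE/authors/douglas-adams"))    # 0
print(has_wildcard_elem("/de-DE/authors/douglas-adams*"))   # 0, no slash before the asterisk
```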

Additionally, we remove the wildcard from our $request_uri so that we can use it in subsequent transformations.

map $request_uri $cleaned_req_uri {
  ~^(?<resource_uri>(.*))\/\*$ $resource_uri;
  default $request_uri;
}

The request URI without the trailing /* is now bound to $cleaned_req_uri.
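The effect of this second map can be mirrored in Python as well. Again, this is a sketch with a hypothetical function name; note that Python writes named groups as (?P<name>...) where Nginx accepts (?<name>...).

```python
import re

# Capture everything in front of a trailing "/*".
STRIP_PATTERN = re.compile(r"^(?P<resource_uri>.*)/\*$")


def cleaned_req_uri(request_uri: str) -> str:
    """Drop a trailing '/*'; URIs without it are returned unchanged."""
    match = STRIP_PATTERN.match(request_uri)
    return match.group("resource_uri") if match else request_uri


print(cleaned_req_uri("/de-DE/authors/douglas-adams/*"))
print(cleaned_req_uri("/de-DE/authors/douglas-adams"))
```

Both calls print the same path, so purge requests and regular requests can be processed identically from here on.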

From that cleaned URI path we can then retrieve the {LANGUAGE} segment of the URI and check it against the list of language tags our service supports.

map $cleaned_req_uri $language_param {
  ~^/(?<lang_key>(de-DE|en-GB))\/.* $lang_key;
  default "";
}
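A quick Python counterpart shows what the language map extracts and when it falls back to the empty default. The function name is an assumption of this sketch, and the list of supported tags is the one used in the map.

```python
import re

# The language tag must be the first path segment, as in the map directive.
LANGUAGE_PATTERN = re.compile(r"^/(?P<lang_key>de-DE|en-GB)/.*")


def language_param(cleaned_uri: str) -> str:
    """Return the supported language tag, or '' if none is present."""
    match = LANGUAGE_PATTERN.match(cleaned_uri)
    return match.group("lang_key") if match else ""


print(language_param("/de-DE/authors/douglas-adams"))  # de-DE
print(language_param("/fr-FR/authors/douglas-adams"))  # empty: unsupported tag
print(language_param("/authors/de-DE/douglas-adams"))  # empty: tag is not the first segment
```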

As a result, we want to create a cache-key without any differences for variants of a resource. This means, we want to streamline the $request_uri so that it does not reflect the {LANGUAGE} param anymore. Instead, we replace it with the static string ALL_LANGUAGES.

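map $cleaned_req_uri $lang_agnostic_request_uri {
  ~^/(de-DE|en-GB)\/(?<rest>(.*)) /ALL_LANGUAGES/$rest;
  # without a default, every URI lacking a language prefix would map to an empty string
  default $cleaned_req_uri;
}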

The resulting $lang_agnostic_request_uri will be the first part of our final cache-key that we will build up in the end.
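The replacement step looks like this when mirrored in Python. The function name is hypothetical, and leaving URIs without a language prefix unchanged is an assumption of this sketch; an Nginx map without a matching entry or default yields an empty string instead.

```python
import re

AGNOSTIC_PATTERN = re.compile(r"^/(de-DE|en-GB)/(?P<rest>.*)")


def lang_agnostic_request_uri(cleaned_uri: str) -> str:
    """Replace a leading language tag with the static ALL_LANGUAGES segment."""
    match = AGNOSTIC_PATTERN.match(cleaned_uri)
    if match:
        return "/ALL_LANGUAGES/" + match.group("rest")
    # Assumption of this sketch: URIs without a language prefix stay unchanged.
    return cleaned_uri


german = lang_agnostic_request_uri("/de-DE/authors/douglas-adams")
english = lang_agnostic_request_uri("/en-GB/authors/douglas-adams")
print(german)
print(german == english)  # True: both variants share one prefix
```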

The other key element we need is the surrogate-like identifier that helps us fake the surrogate key strategy. The following map directive creates our $surrogate_key cache parameter based on the $has_wildcard_elem parameter parsed earlier. Evaluating that boolean value leads to two different branches:

branch a) no wildcard purging:

If the request (purging or not) does not have a * as its last character, we build the cache key from the surrogate variable(s) that we identified before. In our example this is only the language, but other dimensions, such as the content types text/html, application/json etc., are conceivable too.

In that case you might want to add a normalised version of the $http_accept header to the $surrogate_key evaluation. Other dimensions, like different representations of the same resource, can be reflected in a similar manner.

map $has_wildcard_elem $surrogate_key {
  1 "*";
  default $language_param;
  # other dimensions could be joined here in a similar way, e.g. with an underscore:
  # default ${language_param}_${my_other_dimension};
}
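What a second dimension could look like is sketched below in Python. Everything here is an assumption for illustration: the function names, the two short labels "json" and "html", and the very coarse normalisation of the Accept header, which would have to be adapted to the content types your service really offers.

```python
def normalise_accept(accept: str) -> str:
    """Collapse an Accept header into one of two coarse labels (assumed set)."""
    if "application/json" in accept.lower():
        return "json"
    return "html"


def surrogate_key(has_wildcard: bool, language: str, accept: str) -> str:
    """Join the dimensions with an underscore, or return '*' for wildcard purges."""
    if has_wildcard:
        return "*"
    return f"{language}_{normalise_accept(accept)}"


print(surrogate_key(False, "de-DE", "application/json"))                            # de-DE_json
print(surrogate_key(False, "en-GB", "text/html,application/xhtml+xml,*/*;q=0.8"))  # en-GB_html
print(surrogate_key(True, "en-GB", "text/html"))                                    # *
```

Normalising first matters because browsers send long and varying Accept headers, which would otherwise fragment the cache.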

With the cleaned $lang_agnostic_request_uri and $surrogate_key we have everything in place to define our final cache key:

proxy_cache_key $http_host$lang_agnostic_request_uri$is_args$args/$surrogate_key;

A call for our example resource results in a cache key in which all dynamic parts, like the language, are moved to the end. The string before the surrogate key is the same for all variants of that resource.

my-server.de/ALL_LANGUAGES/authors/douglas-adams/de-DE
my-server.de/ALL_LANGUAGES/authors/douglas-adams/en-GB

branch b) wildcard purging:

When the * wildcard is appended to a purge request, the $surrogate_key variable is set to the * character, which means that all cache entries that differ only in their surrogate key will match and be marked as inactive.

Invalidating all variants of a resource is now one simple call:

curl -X PURGE -D - "http://my-server.de/de-DE/authors/douglas-adams/*"

The resulting cache-key based on our example has no reference to the language anymore and matches all surrogates.

my-server.de/ALL_LANGUAGES/authors/douglas-adams/*
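The whole chain can be simulated end to end in Python to check that one purge request covers every variant. This is a sketch under several assumptions: the function name is made up, request URIs carry no query string, URIs without a language prefix are left unchanged, and the trailing * is modelled as a prefix match rather than by Nginx's own matching.

```python
import re

LANGUAGE_PREFIX = re.compile(r"^/(?P<lang>de-DE|en-GB)/(?P<rest>.*)")


def cache_key(host: str, request_uri: str) -> str:
    """Mimic the chain of map directives for a URI without a query string."""
    wildcard = request_uri.endswith("/*")
    cleaned = request_uri[:-2] if wildcard else request_uri
    match = LANGUAGE_PREFIX.match(cleaned)
    language = match.group("lang") if match else ""
    # Assumption: URIs without a language prefix are kept as they are.
    agnostic = "/ALL_LANGUAGES/" + match.group("rest") if match else cleaned
    surrogate = "*" if wildcard else language
    return f"{host}{agnostic}/{surrogate}"


stored = {
    cache_key("my-server.de", uri)
    for uri in (
        "/de-DE/authors/douglas-adams",
        "/en-GB/authors/douglas-adams",
        "/en-GB/authors/terry-pratchett",  # hypothetical second author
    )
}

purge_key = cache_key("my-server.de", "/de-DE/authors/douglas-adams/*")
# Model the trailing asterisk as "every key that starts with this prefix".
purged = sorted(key for key in stored if key.startswith(purge_key[:-1]))

print(purge_key)
print(purged)
```

Both language variants of the Douglas Adams page match the purge key, while the page of the other author keeps its cache entry.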

Summary

This approach is a manual hack to implement a feature which is not yet available in Nginx. Consider it a poor man’s approach with some (maybe minor) drawbacks:

The * as a resource identifier in the URI cannot be used for anything other than our caching hack.
Real surrogate keys are usually set dynamically in the response of a resource. This procedure, in contrast, requires you to understand your resources and encode their structure manually in your Nginx configuration.
In a better implementation you can set multiple tags or keys for a single response, which allows you to purge different groups of items by different criteria. In theory this is possible here too, for instance with a tree structure of fake surrogate keys as described above, but that does not seem realistic because the complexity would escalate very quickly.

There are similar approaches out there, such as https://github.com/wandenberg/nginx-selective-cache-purge-module.

The advantage of the solution described above is that you stick to the Nginx native syntax. In the end, it is just about a couple of map directives. But the maintainability depends heavily on how many dimensions you have to consider when you build your surrogate key.

For me, this is a good example of how powerful the configuration language of Nginx can be, and I think the result is still maintainable and not too hard to understand.

Hopefully, this approach might help you to get out of your troubles 👋.

