Working with HTTP cache
The fastest network request is a request not performed. That’s the job of an HTTP cache: avoiding unnecessary work. By understanding how it works, we can create web applications and APIs that are more responsive, by reducing latency and the amount of bandwidth used.
There are two main types of cache: private and shared.
A private cache is what the web browser (or any other HTTP agent) stores locally, in each client’s computer.
A shared cache is something that sits between the client and the origin server, and can serve multiple clients. It acts as a proxy that intercepts requests and decides whether the origin server needs to be called.
There are two aspects of a request that are analysed before asking for a new version of a representation: freshness and validity.
When a representation that is stored in the cache is considered fresh, there is no need to even perform a request to the origin server; it can be served right away.
There are two HTTP headers used to indicate whether a representation is fresh or not: Expires and Cache-Control.
The Expires header has been superseded by Cache-Control since HTTP 1.1, and you should avoid it when possible, but it is still widely supported and used, so we will talk about it here.
The Expires header indicates when the representation should be considered stale (not fresh). It expects a specific HTTP date. Here’s an example:
HTTP/1.1 200 OK
Content-Length: 31225
Content-type: text/html
Expires: Mon, 29 Sep 2014 10:00:00 GMT

[RESPONSE BODY]
Notice that if the date format is not correct, the representation will be considered stale. Also, you need to make sure that the clocks of your web server and of the cache are synchronized.
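As a rough illustration of the rule above, here is a minimal sketch of how a cache could decide freshness from an Expires value. The method name `fresh_by_expires?` is made up for this example; it only relies on `Time.httpdate` from Ruby’s standard library, and it treats a malformed date as stale.

```ruby
require 'time'

# Returns true while the Expires date is still in the future.
# A missing or malformed date is treated as stale.
def fresh_by_expires?(expires_value, now = Time.now)
  Time.httpdate(expires_value.to_s) > now
rescue ArgumentError
  false
end

expires = 'Mon, 29 Sep 2014 10:00:00 GMT'

# One hour before the Expires date: still fresh
puts fresh_by_expires?(expires, Time.utc(2014, 9, 29, 9, 0, 0))
# One hour after the Expires date: stale
puts fresh_by_expires?(expires, Time.utc(2014, 9, 29, 11, 0, 0))
# Not a valid HTTP date: stale
puts fresh_by_expires?('29/09/2014 10:00', Time.utc(2014, 9, 29, 9, 0, 0))
```

Passing `now` explicitly makes the clock dependency visible: if the cache and the server disagree about the current time, the answer changes.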
In HTTP 1.1, Cache-Control was introduced as the alternative to the Expires header. If both Expires and Cache-Control headers are found, Expires will be ignored.
Cache-Control works with a bunch of directives to specify how it should behave. We will talk about three of them: max-age, private, and no-cache.
You can see the entire list here.
max-age: This directive specifies for how many seconds (from the request time) the representation should be considered fresh. It works like the Expires header, but without the date issues.
private: Allows just a private cache to store it, but never a shared cache. This directive is used when the response is intended for a single user, so it makes no sense to store it in a shared cache.
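no-cache: Despite the name, the representation may still be stored, but the cache must always check with the origin server before using it, so the request is always sent to the origin server.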
Here’s an example:
HTTP/1.1 200 OK
Content-Length: 31225
Content-Type: text/html
Cache-Control: max-age=3600, private

[RESPONSE BODY]
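To make the three directives more concrete, here is a small sketch of how a cache might read a Cache-Control value and act on it. The helper names (`parse_cache_control`, `fresh?`, `shared_cache_may_store?`) are invented for this example and the parsing is deliberately simplified; it is not how any particular cache implements it.

```ruby
# Splits a Cache-Control value into a hash of directives.
# Directives without an argument (like "private") map to true.
def parse_cache_control(value)
  value.to_s.split(',').each_with_object({}) do |part, directives|
    name, arg = part.strip.split('=', 2)
    next if name.nil? || name.empty?

    directives[name.downcase] = arg ? arg.delete('"') : true
  end
end

# A stored representation is fresh while its age is below max-age,
# unless no-cache forces a check with the origin server.
def fresh?(directives, age_in_seconds)
  return false if directives['no-cache']

  max_age = directives['max-age']
  max_age.is_a?(String) && age_in_seconds < max_age.to_i
end

# "private" forbids shared caches from storing the response.
def shared_cache_may_store?(directives)
  !directives['private']
end

directives = parse_cache_control('max-age=3600, private')
puts fresh?(directives, 120)              # stored two minutes ago
puts fresh?(directives, 7200)             # stored two hours ago
puts shared_cache_may_store?(directives)
```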
When a representation is considered stale (e.g. the max-age was exceeded), a request must be sent to the origin server. Although we need to pay the price of a network request, if we can identify that the representation is still the same, we can save some bandwidth by not sending this representation again.
That’s the job of the validation process, and this is done with what is called a conditional request.
There are two headers that can be used to support conditional requests: Last-Modified and Etag.
The Last-Modified header contains a date that tells the client when this representation last changed.
HTTP/1.1 200 OK
Content-Length: 44181
Content-type: text/html
Last-Modified: Sun, 28 Sep 2014 10:00:00 GMT

[RESPONSE BODY]
When a client receives a response that includes a Last-Modified header, it takes note of that, and, when it needs to perform the same request again, it includes an If-Modified-Since header in the request, with the date that it received before:
GET / HTTP/1.1
If-Modified-Since: Sun, 28 Sep 2014 10:00:00 GMT

[REQUEST BODY]
The origin server then checks if the representation was changed after the date received in the If-Modified-Since header, and, if it was not changed, it just sends a 304 Not Modified response:
HTTP/1.1 304 Not Modified
Content-Length: 0
Last-Modified: Sun, 28 Sep 2014 10:00:00 GMT
Even though we still had to perform a network request, we avoid sending the same representation in the body, saving some bandwidth.
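The decision the origin server makes here can be sketched in a few lines. This is only an illustration: `respond_with_last_modified` is a hypothetical helper that returns a `[status, headers, body]` array, and it assumes the server knows the modification time of the representation.

```ruby
require 'time'

# Answers with 304 and an empty body when the representation has not
# changed since the date sent by the client, and with 200 otherwise.
def respond_with_last_modified(body, last_modified, if_modified_since = nil)
  headers = { 'Last-Modified' => last_modified.httpdate }
  return [200, headers, body] if if_modified_since.nil?

  if last_modified <= Time.httpdate(if_modified_since)
    [304, headers, '']
  else
    [200, headers, body]
  end
rescue ArgumentError
  # An unparseable If-Modified-Since date is ignored: send the full response.
  [200, headers, body]
end

last_modified = Time.utc(2014, 9, 28, 10, 0, 0)

status, = respond_with_last_modified('the body', last_modified)
puts status # first request, no condition

status, = respond_with_last_modified('the body', last_modified,
                                     'Sun, 28 Sep 2014 10:00:00 GMT')
puts status # conditional request, nothing changed
```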
The Etag header carries an “entity tag”: a string that changes whenever the representation changes. Usually an MD5 hash is used, but it can be whatever you want.
It works in the same way Last-Modified does. The benefit is that you don’t need to keep track of the modification date of a representation: as long as you always use the same algorithm to generate the Etag value (and you should), it can be regenerated whenever you need it.
HTTP/1.1 200 OK
Content-Length: 44181
Content-type: text/html
Etag: "78q9y7-b37r-0o9a3bc"

[RESPONSE BODY]
The client will save this Etag value and send it back in an If-None-Match header for the next requests.
GET / HTTP/1.1
If-None-Match: "78q9y7-b37r-0o9a3bc"

[REQUEST BODY]
Then, if the origin server determines that the received value is still the same for the generated representation, it can just send a 304 Not Modified, saving some bandwidth.
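Here is a minimal sketch of the same check using an entity tag. The helper names `etag_for` and `respond_with_etag` are assumptions made for this example; the only real API used is `Digest::MD5.hexdigest` from Ruby’s standard library. Because the tag is derived from the representation itself, the server does not have to store anything between requests.

```ruby
require 'digest'

# Generates a quoted entity tag from the representation itself.
def etag_for(representation)
  %("#{Digest::MD5.hexdigest(representation)}")
end

# Answers with 304 and an empty body when the tag sent by the client
# still matches the current representation, and with 200 otherwise.
def respond_with_etag(representation, if_none_match = nil)
  etag = etag_for(representation)
  headers = { 'Etag' => etag }

  if if_none_match == etag
    [304, headers, '']
  else
    [200, headers, representation]
  end
end

status, headers, = respond_with_etag('the resource representation')
puts status # first request: full response

status, = respond_with_etag('the resource representation', headers['Etag'])
puts status # same representation: not modified

status, = respond_with_etag('a new representation', headers['Etag'])
puts status # representation changed: full response again
```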
HTTP cache at work, step by step
Putting the pieces together, we can have this scenario:
1) A request is performed to
2) The HTTP agent checks if there is a fresh copy of the requested representation. It does so by looking at the Expires and Cache-Control headers. If it finds a fresh copy, it just serves it to the client, and the origin server won’t even know this request existed.
3) If a fresh copy is not found, the origin server will be asked to revalidate the representation, through a conditional request. This is done with the If-Modified-Since or If-None-Match headers.
4) If the origin server can validate the request, it will just return a 304 Not Modified response, and the client will keep using the representation it already has stored.
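The steps above can be simulated without any network at all. The sketch below is a toy model, not a real cache: `Origin` and `TinyCache` are invented classes, time is passed in as plain seconds, and only a single representation is stored. It shows when the origin is skipped, when it is asked to revalidate, and when it has to send the full body.

```ruby
require 'digest'

# Toy origin server: counts requests and answers conditional requests.
class Origin
  attr_reader :requests, :full_responses

  def initialize(body, max_age)
    @body = body
    @max_age = max_age
    @requests = 0
    @full_responses = 0
  end

  def call(if_none_match = nil)
    @requests += 1
    etag = Digest::MD5.hexdigest(@body)
    headers = { etag: etag, max_age: @max_age }
    return [304, headers, nil] if if_none_match == etag

    @full_responses += 1
    [200, headers, @body]
  end
end

# Toy cache holding one representation.
class TinyCache
  Entry = Struct.new(:body, :etag, :max_age, :stored_at)

  def initialize(origin)
    @origin = origin
    @entry = nil
  end

  def get(now)
    # Step 2: serve a fresh copy without contacting the origin.
    return @entry.body if @entry && now - @entry.stored_at < @entry.max_age

    # Step 3: no fresh copy, so send a (conditional) request.
    status, headers, body = @origin.call(@entry&.etag)

    # Step 4: on 304, keep using the stored body.
    body = @entry.body if status == 304
    @entry = Entry.new(body, headers[:etag], headers[:max_age], now)
    body
  end
end

origin = Origin.new('the resource representation', 10)
cache = TinyCache.new(origin)

cache.get(0)  # nothing stored yet: full response from the origin
cache.get(5)  # still fresh: the origin is not contacted
cache.get(12) # stale: revalidated with a conditional request
puts "requests: #{origin.requests}, full responses: #{origin.full_responses}"
```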
HTTP cache at work, a practical example
To better understand this scenario, we will create a simple API and incrementally add some cache capability to it.
I am going to use sinatra to create this API, and rack-cache as a reverse proxy cache. The same concepts could be applied with any other stack; I chose these two tools because they are pretty simple and won’t get in the way of understanding how the cache is working, which is our goal here.
First, install sinatra and rack-cache, in case you don’t have them installed already:
gem install sinatra rack-cache
Then, we will create a simple sinatra app, without any caching capability:
# server.rb
require 'sinatra'

set :port, 1234

get '/' do
  # some interesting code would be executed here
  # for now, we are just sleeping for 5 seconds
  sleep 5
  "the resource representation"
end
To run this server, just run ruby server.rb. It should be accessible at http://localhost:1234. Notice that you’ll need to kill and start the server again after each change.
When we send a request to this endpoint, you will notice that it takes 5 seconds until we get a response back. To create this request, I’m going to use curl(1) (with the -i parameter, so we can see the headers).
$ curl -i http://localhost:1234
HTTP/1.1 200 OK
Content-Type: text/html;charset=utf-8
Content-Length: 27
X-Xss-Protection: 1; mode=block
X-Content-Type-Options: nosniff
X-Frame-Options: SAMEORIGIN
Server: WEBrick/1.3.1 (Ruby/2.1.2/2014-05-08)
Date: Sun, 28 Sep 2014 19:54:16 GMT
Connection: Keep-Alive

the resource representation
We can see that there’s no cache-related header in this response. Every time we send this request, it’ll hit the origin server, and we’ll have to wait at least 5 seconds to get the response. Also, we are always receiving the response body, even if it didn’t change, causing unnecessary use of bandwidth.
So let’s start to fix this.
First, we are going to add rack-cache as our reverse proxy cache. It should be pretty simple, as it’s just a rack middleware:
# server.rb
require 'sinatra'
# we require rack-cache
require 'rack-cache'

set :port, 1234

# and start using it
use Rack::Cache

get '/' do
  sleep 5
  "the resource representation"
end
Now that we have rack-cache in place, we can start to take advantage of it. First we’ll add a Cache-Control header that is going to tell the client that this representation should be considered fresh for 10 seconds:
# server.rb
require 'sinatra'
require 'rack-cache'

set :port, 1234

use Rack::Cache

get '/' do
  sleep 5
  # add a Cache-Control header, setting the max-age to 10 seconds
  cache_control :public, max_age: 10
  "the resource representation"
end
And that’s it. If you try to hit this endpoint again, here’s what you get:
$ curl -i http://localhost:1234
HTTP/1.1 200 OK
Content-Type: text/html;charset=utf-8
Cache-Control: public, max-age=10
Content-Length: 27
Date: Sun, 28 Sep 2014 20:17:05 GMT
X-Content-Digest: 904c355ca45f6806b252aa62329fa8ac149011ac
Age: 0
X-Rack-Cache: stale, invalid, store
X-Xss-Protection: 1; mode=block
X-Content-Type-Options: nosniff
X-Frame-Options: SAMEORIGIN
Server: WEBrick/1.3.1 (Ruby/2.1.2/2014-05-08)
Connection: Keep-Alive

the resource representation
Now that we have the Cache-Control header in place, the next request should return almost instantaneously, as it’s not hitting the origin server. That’s all it takes to have the freshness process working. The next step is the validation process, and it is almost as easy.
So we are already saving some network traffic by avoiding unnecessary requests while the representation is still fresh, but once it gets stale, we are still retrieving the entire representation in the response body, even if it didn’t change at all. Let’s fix that.
# server.rb
require 'sinatra'
require 'rack-cache'
require 'digest'

set :port, 1234

use Rack::Cache

get '/' do
  sleep 5
  representation = "the resource representation"
  cache_control :public, max_age: 10
  # we add the Etag header with an MD5 hash of
  # the representation
  etag Digest::MD5.hexdigest(representation)
  representation
end
Now, performing the same request to this endpoint, when the representation is stale (10 seconds after the first request), this is what we get:
$ curl -i http://localhost:1234
HTTP/1.1 200 OK
Content-Type: text/html;charset=utf-8
Cache-Control: public, max-age=10
Etag: "f8d36c97fa01826fe14c1989e373d6e4"
Content-Length: 27
Date: Sun, 28 Sep 2014 20:29:48 GMT
X-Content-Digest: 904c355ca45f6806b252aa62329fa8ac149011ac
Age: 0
X-Rack-Cache: miss, store
X-Xss-Protection: 1; mode=block
X-Content-Type-Options: nosniff
X-Frame-Options: SAMEORIGIN
Server: WEBrick/1.3.1 (Ruby/2.1.2/2014-05-08)
Connection: Keep-Alive

the resource representation
We can see the Etag header there. All we need to do is to save that value, and send it in the If-None-Match header for the next request:
$ curl -i http://localhost:1234 --header 'If-None-Match: "f8d36c97fa01826fe14c1989e373d6e4"'
HTTP/1.1 304 Not Modified
Cache-Control: public, max-age=10
Etag: "f8d36c97fa01826fe14c1989e373d6e4"
X-Content-Digest: 904c355ca45f6806b252aa62329fa8ac149011ac
Date: Sun, 28 Sep 2014 20:31:02 GMT
Age: 0
X-Rack-Cache: stale, valid, store
X-Content-Type-Options: nosniff
Server: WEBrick/1.3.1 (Ruby/2.1.2/2014-05-08)
Connection: Keep-Alive
Now, instead of getting a 200 OK response, with the entire representation in the body, we get a 304 Not Modified, which does not include a message body. That saves us some bandwidth, as we don’t need to send the entire representation, which can be pretty big, in the response.
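If you would rather do this from Ruby than from curl, the same conditional request can be built with Net::HTTP from the standard library. This is just a sketch: `build_conditional_get` and `perform` are hypothetical helper names, and the URL assumes the sinatra server from this article is running locally on port 1234.

```ruby
require 'net/http'
require 'uri'

# Builds a GET request, adding If-None-Match when we already
# know the Etag of a previously received representation.
def build_conditional_get(uri, etag = nil)
  request = Net::HTTP::Get.new(uri)
  request['If-None-Match'] = etag if etag
  request
end

# Sends the request; only works while the server is running.
def perform(uri, request)
  Net::HTTP.start(uri.hostname, uri.port) { |http| http.request(request) }
end

uri = URI('http://localhost:1234/')
request = build_conditional_get(uri, '"f8d36c97fa01826fe14c1989e373d6e4"')
puts request['If-None-Match']

# With the server from this article running, this should answer 304
# for as long as the representation does not change:
# response = perform(uri, request)
# puts response.code
```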
At a time when performance is a feature, making good use of HTTP caching is one of the simplest ways to create applications and APIs that are more responsive. With the tools that we have available today, it’s becoming easier and easier to use these well-established HTTP capabilities, but understanding how they work is the first step, as none of these tools will be able to understand your specific requirements.