The browser cache is a mechanism used by browsers to store web page resources locally. Caching improves performance, reduces bandwidth consumption, and creates a snappier experience overall. In this article, we explain how browser caching works, how to implement it on your website, and how we handle it at Pressidium.
What is the browser cache?
A cache is a software or hardware component that temporarily stores values for faster future access. The browser cache is a small database of files that contains downloaded web page resources, such as images, videos, CSS, JavaScript, and so on. The basic idea behind it is the following:
The browser requests some content from the web server. If the content is not in the browser cache then it is retrieved directly from the web server. If the content was previously cached, the browser bypasses the server and loads the content directly from its cache.
Cached content is either fresh or stale. Fresh means that the content hasn’t passed its expiration date and can be served directly from the browser cache without involving the server. Stale means that the expiration date has passed, so the cached copy can no longer be used without checking with the server first.
Validation is the process of checking cached content against the most recent version that the server possesses, in short, to determine whether the cached copy can still be used. Invalidation occurs when content is removed from the cache before its expiration date passes. This can be forced by the server when the content has changed and the browser needs the most recent version to avoid introducing problems.
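To make the terms above concrete, here is a minimal sketch of a cache that reports whether an entry is a miss, fresh, or stale. The TinyCache and CachedEntry names are hypothetical and exist only for illustration; a real browser cache is far more involved.

```python
from dataclasses import dataclass


@dataclass
class CachedEntry:
    body: bytes
    stored_at: float  # time (in seconds) at which the response was cached
    max_age: int      # freshness lifetime in seconds

    def is_fresh(self, now: float) -> bool:
        # Fresh while the age of the entry is below its freshness lifetime.
        return (now - self.stored_at) < self.max_age


class TinyCache:
    """Hypothetical in-memory model of a browser cache (illustration only)."""

    def __init__(self):
        self._entries = {}

    def store(self, url: str, body: bytes, max_age: int, now: float) -> None:
        self._entries[url] = CachedEntry(body, now, max_age)

    def lookup(self, url: str, now: float) -> str:
        """Return 'miss', 'fresh' or 'stale' for the given URL."""
        entry = self._entries.get(url)
        if entry is None:
            return "miss"
        return "fresh" if entry.is_fresh(now) else "stale"

    def invalidate(self, url: str) -> None:
        # Remove the entry before it expires, e.g. because the content changed.
        self._entries.pop(url, None)
```

A "fresh" result would be served straight from the cache, a "stale" one would trigger validation against the server, and a "miss" would trigger a full download.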
Web developers and administrators control browser caching through specific HTTP headers. These headers instruct the web browser when to cache a resource, when not to, and for how long. Using HTTP cache-related headers can often be frustrating, since the headers introduced by the various versions of the HTTP protocol overlap quite a bit. Add to the mix things such as the odd web proxy in between, old browsers, and conflicting caching policies and implementations (for example, different WordPress plugins), and it can quickly become a headache.
Browser caching headers
The set of rules defining what can be cached, what cannot, and for how long is called the caching policy. The website owner implements it through caching response headers. Although this can be achieved in many different ways, at the beginning you should probably concern yourself only with Cache-Control.
Cache-Control
The Cache-Control header was introduced in HTTP/1.1 and is the most modern of the caching headers. It accepts several different values, depending on how you want browsers to behave, which makes it quite versatile. Below is a list of Cache-Control directives:
- no-cache
Allows the browser to store the content, but requires it to validate the cached copy against the server before every use. If the server confirms that the copy is still current, it can be served from the cache.
- no-store
Tells the browser not to cache the content in any way. It is mostly used when dealing with sensitive data or with data that changes frequently.
- public
Marks the content as public, which means it can be cached by the browser and by any intermediary parties (such as proxies).
- private
Marks the content as private, meaning it can be cached only by the browser, and not by intermediary proxies and the like. This usually applies to user-related data.
- max-age
The maximum time, in seconds, that content can remain in the browser cache before the client needs to revalidate it. Contrary to the Expires header, which we will visit shortly, max-age defines a relative value in seconds from the time the content was cached, not an absolute expiration date.
- s-maxage
Similar to max-age, but used only by intermediary caches.
- must-revalidate
Once the content becomes stale, the browser must revalidate it against the server before using it again. It is not allowed to fall back to the stale copy when the server cannot be reached, for example during a network interruption.
- proxy-revalidate
Similar to must-revalidate, but applies only to intermediary caches.
- no-transform
Instructs intermediaries (such as proxies) not to transform the content received from the server in any way, for example by recompressing it.
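Since a single Cache-Control header can carry several of these directives at once, it can help to see how such a value breaks down. The following is a minimal sketch of a parser; parse_cache_control is a hypothetical helper name, and the sketch assumes that no quoted value contains a comma.

```python
def parse_cache_control(header_value: str) -> dict:
    """Split a Cache-Control value into a {directive: value} dictionary.

    Directives without a value (e.g. "public") map to True.
    Assumption: quoted values never contain commas.
    """
    directives = {}
    for part in header_value.split(","):
        part = part.strip()
        if not part:
            continue
        name, separator, value = part.partition("=")
        name = name.strip().lower()  # directive names are case-insensitive
        if separator:
            value = value.strip().strip('"')
            directives[name] = int(value) if value.isdigit() else value
        else:
            directives[name] = True
    return directives
```

For example, parse_cache_control("max-age=84600, public") yields a dictionary with max-age set to 84600 and public set to True.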
ETag
The ETag response header identifies a specific version of a resource. Every time the resource changes, a new ETag is generated. When the browser revalidates a cached copy, it sends the ETag it holds back to the server in the If-None-Match request header. Bandwidth is saved because the web server does not need to send a full response if the ETag hasn’t changed. The ETag header is enabled by default on Nginx and Apache, and ETag values are generated automatically, so you don’t have to specify anything.
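The exchange described above can be sketched as follows. This is an illustration under stated assumptions rather than how any particular server works: make_etag and respond are hypothetical names, the tag is derived from a hash of the content (servers may use other methods), and the comparison is a simple exact match that ignores lists of tags and weak validators.

```python
import hashlib
from typing import Optional


def make_etag(body: bytes) -> str:
    # Assumption: derive the tag from a content hash, purely for illustration.
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(body: bytes, if_none_match: Optional[str] = None):
    """Return (status, headers, payload) for a request to a resource."""
    etag = make_etag(body)
    if if_none_match == etag:
        # The browser's copy is current: send no body, only a 304 status.
        return 304, {"ETag": etag}, b""
    # First request, or the resource changed: send the full response.
    return 200, {"ETag": etag}, body
```

On the first request the browser receives a 200 with the body and the tag. On later requests it sends the tag back, and as long as the resource is unchanged it receives an empty 304 response.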
Expires
This header was introduced in HTTP/1.0 and defines a specific date after which the content is considered stale. It is effectively a “best-before” date for content. For example, Expires: Thu, 25 May 2017 12:30:00 GMT. When a response carries both Expires and Cache-Control: max-age, max-age takes precedence.
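As a rough sketch of how a client could evaluate such a “best-before” date, the function below parses an Expires value with Python’s standard library and compares it with the current time. The is_expired name is hypothetical. Treating an unparseable value as already expired follows the HTTP specification’s guidance for invalid dates.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def is_expired(expires_value: str, now: datetime) -> bool:
    """Return True if the Expires date has passed (or cannot be parsed)."""
    try:
        expires_at = parsedate_to_datetime(expires_value)
    except (TypeError, ValueError):
        # Invalid dates (such as "0") count as already expired.
        return True
    if expires_at.tzinfo is None:
        # Assumption: a date without zone information is treated as UTC.
        expires_at = expires_at.replace(tzinfo=timezone.utc)
    return now >= expires_at
```

Here now is expected to be a timezone-aware datetime, for example datetime.now(timezone.utc).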
Pragma
This is a somewhat outdated HTTP/1.0 header that is mostly used for backwards compatibility. Inserting Pragma: no-cache will make your browser behave similarly to Cache-Control: no-cache.
How to implement a caching policy on your website
There are two ways to implement a caching policy on your website. The first is to define the caching response headers in the web server configuration. The second is to do it directly within PHP. Below are examples for the two most popular web servers, Apache2 and Nginx:
Apache2
# Requires mod_headers. The dot is escaped so that it matches a literal "."
<FilesMatch "\.(ico|pdf|flv|jpg|jpeg|png|gif|js|css|swf)$">
    Header set Cache-Control "max-age=84600, public"
</FilesMatch>
Nginx
location ~* \.(jpg|jpeg|png|gif|ico|css|js)$ {
    expires 365d;
    add_header Cache-Control "public";
}
As you can see, it is pretty straightforward. In the first example, we use Apache2’s FilesMatch directive to match against a specific set of file types (.ico, .pdf, etc.) and mark them as public, with a max-age of 84600 seconds (23.5 hours). In the second, we again match against certain file types using Nginx’s location directive, and set a lifetime of 365 days with the expires directive. We also mark them as “public” using the add_header directive.
PHP
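If you wish to add response headers directly in your code, use PHP’s header() function. It must be called before any output is sent.
<?php
// By default, a later header() call with the same name replaces the earlier
// one, so set Cache-Control only once per response.
header("Cache-Control: public, max-age=604800"); // cacheable for 7 days
// To discourage caching instead, send these two headers in place of the one above:
// header("Cache-Control: no-cache");
// header("Expires: Sat, 26 Jul 1997 05:00:00 GMT"); // a date in the past
?>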
How to test to see if it works
You can easily test your browser caching rules by using, for example, the Firefox web console or Chrome’s Developer Tools. In Chrome:
- Click on the menu icon at the top right.
- Select More Tools > Developer Tools and open the Network tab.
- Enter your URL into the address bar and hit Enter.
You should see a list of requests as your URL is loading. Select a resource by clicking on it. Inspect the Response headers on the right, and particularly the Status Code. For a resource that was cached on a previous visit, it shows the 200 HTTP code but indicates in parentheses that the resource came from the memory cache.
This means that the resource was loaded automatically from the local cache store, instead of requesting it from the server.
In cases where the browser has to revalidate the resource first (for example, when the Cache-Control header contains “must-revalidate” and the cached copy is stale), the status code will be 304 (Not Modified). This means your browser revalidated the resource against the server, and the server responded that the content hasn’t changed, so it can be served from the browser’s cache.
Proceed to disable the cache by checking the Disable cache checkbox and hitting Reload.
In this case, you see that the Status Code is 200 without any indication that it used the cache. The browser requested the resource from the web server, and the web server responded by serving back a fresh copy.
To recap, we delved into the browser cache and explained the various cache-related headers you can use to optimise your website and make it snappier. The HTTP protocol has gone through several iterations, and both HTTP/1.0 and HTTP/1.1 brought their own set of browser caching headers. For example, HTTP/1.0 used the Expires header to specify an exact “best by” date, along with the Pragma: no-cache directive to instruct the browser not to cache anything when no caching was wanted.
The different implementations, browsers, and server topologies can quickly make browser caching a daunting task, but this is where we enter the picture.
Pressidium’s browser caching policy
We treat dynamic content differently from static content. Dynamic content is forced to be revalidated every time the browser requests it. This is only logical; if you cached dynamic content, then it wouldn’t be considered dynamic! This is implemented using the must-revalidate, max-age=0 directives in Cache-Control. The max-age=0 directive marks the content as stale the moment it is cached, and must-revalidate forbids the browser from using a stale copy without first asking the server whether it is still the most recent one. Together, these two directives make sure that dynamic content is always checked against the server before it is used.
Cached static content, on the other hand (such as images, CSS files, JavaScript, etc.), is configured to be good for 30 days.
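The split between dynamic and static content can be sketched as a small function that picks a Cache-Control value per request path. This is an illustration only and not our actual configuration: the cache_control_for name is hypothetical, the extension list is borrowed from the Nginx example earlier in this article, and the path is assumed to carry no query string.

```python
import posixpath

# Assumption: file types treated as static, taken from the Nginx example.
STATIC_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif", ".ico", ".css", ".js"}
THIRTY_DAYS = 30 * 24 * 60 * 60  # 2592000 seconds


def cache_control_for(path: str) -> str:
    """Pick a Cache-Control value based on the file extension of the path."""
    extension = posixpath.splitext(path)[1].lower()
    if extension in STATIC_EXTENSIONS:
        # Static assets stay fresh for 30 days.
        return f"max-age={THIRTY_DAYS}"
    # Everything else is treated as dynamic and revalidated on every use.
    return "must-revalidate, max-age=0"
```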
Additionally, we use the Expires header strictly for backwards compatibility reasons, calculating its absolute value from the max-age value in Cache-Control. The Expires header takes an absolute value in the form of an expiration date, whereas the Cache-Control: max-age=n directive specifies a relative value (n). This value specifies for how long the content can be considered fresh once it is cached.
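Deriving the absolute Expires date from a relative max-age is simple arithmetic: add max-age seconds to the current time and format the result as an HTTP date. A minimal sketch, where expires_from_max_age is a hypothetical helper name and now must be a timezone-aware UTC datetime:

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def expires_from_max_age(max_age: int, now: datetime) -> str:
    """Return the Expires value that matches max-age seconds from now."""
    expires_at = now + timedelta(seconds=max_age)
    # usegmt=True produces the "GMT" suffix that HTTP dates use.
    return format_datetime(expires_at, usegmt=True)


now = datetime(2017, 5, 25, 12, 0, 0, tzinfo=timezone.utc)
print(expires_from_max_age(1800, now))  # Thu, 25 May 2017 12:30:00 GMT
```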
HTTP/1.1 also introduced another header called ETag that works as a sort of “fingerprint” for content. Every piece of content gets associated with an ETag value (which the web server can generate in a variety of ways). Whenever the content changes, the ETag value changes as well, signalling the need for the browser to get a fresh copy. At Pressidium, ETag is used only for static files, and it is generated from the file’s modification time and size. This simply means that the web server generates a new ETag value every time the content is modified.
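To illustrate how a tag can be built from modification time and size, here is a minimal sketch. The exact format (two hexadecimal numbers joined by a dash) is an assumption made for this example and not necessarily what our servers emit; etag_from_metadata and file_etag are hypothetical names.

```python
import os


def etag_from_metadata(mtime_seconds: int, size_bytes: int) -> str:
    # Assumed format: "<mtime in hex>-<size in hex>", wrapped in quotes.
    return '"{:x}-{:x}"'.format(mtime_seconds, size_bytes)


def file_etag(path: str) -> str:
    """Build an ETag for a file from its modification time and size."""
    info = os.stat(path)
    return etag_from_metadata(int(info.st_mtime), info.st_size)
```

Because both inputs come from file metadata, the tag changes whenever the file is modified, without the server having to read or hash the file’s contents.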
Conclusion
Browser caching and caching policies can become quite complicated, but starting out by experimenting with Cache-Control, as we’ve demonstrated, is straightforward. Most of the time, implementing a “generic” caching policy for static assets is enough to make a difference to your website’s performance. However, it does add an extra layer of “worry” on top of so many other things, and we believe it shouldn’t be like that.
The implementation of a browser caching policy is something that you don’t need to fuss about if you’re a Pressidium client. If you have questions related to browser caching or experience any problems, you can always contact us. We provide 24/7 end-to-end support, and we mean it.