Overview of Browser caching for Web developers (Part 1)

Friday, 15/05/2020

Tram Ho

Why caching is important

Browsers often save local copies of static assets to reduce load time and minimize the amount of data that must be transferred. This is called caching.

Retrieving data over the network takes longer than reading it from local storage, because a connection to a remote server is slower than local access. Caching therefore reduces load times, and by avoiding unnecessary downloads it also reduces the amount of traffic that must be transferred.

How does browser caching work?

Case 1: The user has never visited the site

In this case, the browser has nothing cached yet, so it downloads everything from the server.

Below is a screenshot of the resources downloaded on our first visit to the Wiki home page. The status bar shows that 265 KB was transferred to the browser.

Case 2: The user has visited the site before

The browser still downloads the HTML from the server, but for static assets (JavaScript, CSS, images) it decides whether a download is needed at all.

We can see the difference when we refresh the Wiki homepage:

The amount of data transferred dropped to 928 bytes, about 0.3% of the original amount. The Size column indicates that most of the data was taken from the cache.

Chrome can serve files from both its memory cache and its disk cache. Since we haven’t closed the browser window since case 1, the data is still in the memory cache.

Show cache in the browser

In older versions of Chrome, we can open chrome://cache to view the cache contents. It lists links to the cached entries, and each link shows the cached contents for that resource.

How does the browser know which file to retrieve from the cache?

The browser checks the headers of the server’s HTTP response to decide what to download. Four headers are commonly used for caching:

ETag
Cache-Control
Expires
Last-Modified

ETag

An ETag (or Entity Tag) is a string used as a cache validation token. It is usually a hash of the file contents.

The server can add the ETag header to the HTTP response. Once the cached file has expired, the browser sends this value back in later requests (in the If-None-Match header) so the server can check whether the file content has changed. Because the value comes from a hash function, even a small change to the file produces a different ETag.

If the hash is unchanged, the resource content has not changed, and the server returns status code 304 (Not Modified) with an empty body. This tells the browser that its cached copy can still be used.

Note that the ETag is only sent in requests once the cached file has expired.
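The ETag exchange described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names make_etag and respond are hypothetical, the truncated SHA-256 hash is just one possible way to derive a tag, and real servers also handle weak validators and lists of tags in If-None-Match.

```python
import hashlib
from typing import Optional


def make_etag(body: bytes) -> str:
    """Derive a quoted ETag from a hash of the content (one common choice)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(body: bytes, if_none_match: Optional[str] = None):
    """Return (status, headers, payload) for a GET on a resource.

    Hypothetical helper: compares the tag sent by the browser with the
    tag of the current content.
    """
    etag = make_etag(body)
    if if_none_match == etag:
        # Content unchanged: send no body, the browser keeps its cached copy.
        return 304, {"ETag": etag}, b""
    return 200, {"ETag": etag}, body


# First visit: full download, the browser stores the ETag.
status, headers, payload = respond(b"body { color: red; }")

# Later revalidation: the browser echoes the ETag in If-None-Match.
status2, _, payload2 = respond(b"body { color: red; }", headers["ETag"])

# If the file changed on the server, the hashes differ and a full 200 follows.
status3, _, payload3 = respond(b"body { color: blue; }", headers["ETag"])
```

In the second call the server answers 304 with an empty body, while the third call returns the new content because the tag no longer matches.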

Cache-Control

The Cache-Control header takes a number of values that control cache behavior, expiration, and validation. These values can be combined in a single header.
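To illustrate how several values can sit in one header, here is a minimal Python sketch that splits a Cache-Control value into its directives. The function name parse_cache_control is hypothetical, and the simple comma split is an assumption that ignores quoted strings containing commas.

```python
def parse_cache_control(value: str) -> dict:
    """Split a Cache-Control value into a {directive: argument} mapping.

    Directives without an argument (such as "public") map to True.
    Simplified: does not handle commas inside quoted arguments.
    """
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, sep, arg = part.partition("=")
        name = name.strip().lower()
        directives[name] = arg.strip().strip('"') if sep else True
    return directives


parsed = parse_cache_control("public, max-age=60, must-revalidate")
```

Here parsed holds three entries: public and must-revalidate as flags, and max-age with the string argument "60".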

Cache behavior

Cache-Control: public

public means resource content can be cached by any type of cache (browser, CDN, …)

Cache-Control: private

private means only the browser is allowed to cache the content.

Cache-Control: no-store

no-store means the content must not be stored in any cache, so it is always downloaded from the server.

Cache-Control: no-cache

no-cache is the most misleading value: it does not mean “do not cache”. This value tells the browser to cache the file but to use it only after validating with the server that it is still the latest version. Validation is done with the ETag header.

This value is often used for HTML files, because the browser should always check for the latest markup.

Expiration

Cache-Control: max-age=60

This value sets how long the file should be cached. The value after the = sign is in seconds, so in the example above the file is cached for 1 minute (60 seconds). The RFC recommends that this value not exceed 1 year (max-age=31536000).
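The max-age arithmetic can be sketched as follows. The function is_fresh and the sample timestamps are hypothetical; the sketch assumes a cached copy is fresh while its age is strictly less than max-age.

```python
from datetime import datetime, timedelta, timezone

# One year expressed in seconds, the upper bound mentioned above.
ONE_YEAR_SECONDS = 365 * 24 * 60 * 60


def is_fresh(stored_at: datetime, max_age_seconds: int, now: datetime) -> bool:
    """True while the cached copy is younger than max-age."""
    return now - stored_at < timedelta(seconds=max_age_seconds)


stored = datetime(2020, 5, 15, 12, 0, 0, tzinfo=timezone.utc)

# With max-age=60: still fresh after 30 seconds, stale after 90 seconds.
fresh_after_30s = is_fresh(stored, 60, stored + timedelta(seconds=30))
fresh_after_90s = is_fresh(stored, 60, stored + timedelta(seconds=90))
```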

In addition, for shared caches such as a CDN, we can set the following:

Cache-Control: s-maxage=60

Validation

Cache-Control: must-revalidate

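This value tells the cache that once the content has become stale, it must be revalidated with the server (for example, using the ETag) before it is reused. A stale copy must never be served without that check.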
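Putting the behavior and validation values together, the decision a browser cache makes could be sketched roughly like this. The function cache_action, its return strings, and the dictionary shape of directives are all assumptions made for illustration; real caches consider many more conditions.

```python
def cache_action(directives: dict, fresh: bool) -> str:
    """Decide what to do with a stored response (simplified sketch).

    directives: Cache-Control values of the stored response,
                for example {"max-age": "60", "must-revalidate": True}.
    fresh:      whether the stored copy is still within its lifetime.
    """
    if "no-store" in directives:
        return "fetch"        # nothing may be stored, always download
    if "no-cache" in directives:
        return "revalidate"   # stored, but checked with the server before every reuse
    if fresh:
        return "use-cache"    # still within max-age, no request needed
    return "revalidate"       # stale: validate (e.g. with the ETag) before reuse


action_fresh = cache_action({"public": True, "max-age": "60"}, fresh=True)
action_stale = cache_action({"max-age": "60", "must-revalidate": True}, fresh=False)
action_html = cache_action({"no-cache": True}, fresh=True)
```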

Expires

Expires is a header from HTTP/1.0, but many sites still use it. It gives an expiration date for the file, after which the cached copy is considered stale.

Expires: Wed, 25 Jul 2018 21:00:00 GMT

Note that the browser ignores this header if Cache-Control max-age is specified.
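The precedence between the two headers can be sketched in Python as below. The function freshness_lifetime is hypothetical, and the Date value used in the example is invented for illustration; only the Expires value comes from the text above.

```python
from email.utils import parsedate_to_datetime


def freshness_lifetime(headers: dict) -> float:
    """Seconds a response stays fresh; max-age takes priority over Expires."""
    for part in headers.get("Cache-Control", "").split(","):
        name, _, arg = part.strip().partition("=")
        if name.lower() == "max-age" and arg.isdigit():
            return float(arg)
    if "Expires" in headers and "Date" in headers:
        expires = parsedate_to_datetime(headers["Expires"])
        date = parsedate_to_datetime(headers["Date"])
        # An Expires date in the past means the response is already stale.
        return max((expires - date).total_seconds(), 0.0)
    return 0.0


# Assumed response time, one hour before the Expires date shown above.
only_expires = {
    "Date": "Wed, 25 Jul 2018 20:00:00 GMT",
    "Expires": "Wed, 25 Jul 2018 21:00:00 GMT",
}
both = dict(only_expires, **{"Cache-Control": "max-age=60"})

lifetime_expires = freshness_lifetime(only_expires)  # derived from Expires
lifetime_both = freshness_lifetime(both)             # max-age wins
```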

Last-Modified

Last-Modified is also a header from HTTP/1.0. It records the last time the file was modified:

Last-Modified: Mon, 12 Dec 2016 14:45:00 GMT
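Like the ETag, this date can serve as a validator: in standard HTTP the browser may send it back in an If-Modified-Since request header, and the server answers 304 if the file has not changed since then. A minimal sketch, where the function name status_for is hypothetical:

```python
from email.utils import parsedate_to_datetime
from typing import Optional


def status_for(last_modified: str, if_modified_since: Optional[str]) -> int:
    """Return 304 if the file is unchanged since the date the browser holds."""
    if if_modified_since is None:
        return 200  # no validator sent: full response
    modified = parsedate_to_datetime(last_modified)
    since = parsedate_to_datetime(if_modified_since)
    return 304 if modified <= since else 200


LAST_MODIFIED = "Mon, 12 Dec 2016 14:45:00 GMT"

first_visit = status_for(LAST_MODIFIED, None)
unchanged = status_for(LAST_MODIFIED, "Mon, 12 Dec 2016 14:45:00 GMT")
changed = status_for(LAST_MODIFIED, "Sun, 11 Dec 2016 14:45:00 GMT")
```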

HTML Meta Tag

Before HTML5, it was also common to set Cache-Control with an HTML meta tag:

<meta http-equiv="Cache-control" content="no-cache">

However, using meta tags like this is no longer recommended, because only the browser reads the meta tag. Intermediate caches (such as proxies and CDNs) never see it.

So always use HTTP headers to control caching.

HTTP response

Take a look at the following HTTP response example:

Accept-Ranges: bytes
Cache-Control: max-age=3600
Connection: Keep-Alive
Content-Length: 4361
Content-Type: image/png
Date: Tue, 25 Jul 2017 17:26:16 GMT
ETag: "1109-554221c5c8540"
Expires: Tue, 25 Jul 2017 18:26:16 GMT
Keep-Alive: timeout=5, max=93
Last-Modified: Wed, 12 Jul 2017 17:26:05 GMT
Server: Apache

Line 2 tells us that the max-age is 1 hour
Line 5 indicates that the file is a PNG image
Line 7 provides the ETag, which the browser uses to check whether the file has changed once the 1-hour max-age has passed
Line 8 is ignored because Cache-Control max-age is present
Line 10 shows the last time the file was modified
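As a quick check of this reading, the example response can be parsed in Python. This is a sketch only; the variable names are arbitrary, and it assumes every header line has the simple "Name: value" form shown above.

```python
from datetime import timedelta
from email.utils import parsedate_to_datetime

RAW_HEADERS = """Accept-Ranges: bytes
Cache-Control: max-age=3600
Connection: Keep-Alive
Content-Length: 4361
Content-Type: image/png
Date: Tue, 25 Jul 2017 17:26:16 GMT
ETag: "1109-554221c5c8540"
Expires: Tue, 25 Jul 2017 18:26:16 GMT
Keep-Alive: timeout=5, max=93
Last-Modified: Wed, 12 Jul 2017 17:26:05 GMT
Server: Apache"""

# Build a {name: value} mapping, splitting each line at the first ": ".
headers = dict(line.split(": ", 1) for line in RAW_HEADERS.splitlines())

# Line 2: max-age in seconds.
max_age = int(headers["Cache-Control"].split("=", 1)[1])

# The copy becomes stale max-age seconds after the response date.
received = parsedate_to_datetime(headers["Date"])
stale_at = received + timedelta(seconds=max_age)

# Line 10: how old the file was when it was served.
file_age = received - parsedate_to_datetime(headers["Last-Modified"])
```

In this particular response, stale_at lands on the same instant as the Expires header, so both headers agree even though only max-age is used.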

Source of the article

https://medium.com/@codebyamir/a-web-developers-guide-to-browser-caching-cc41f3b73e7c

Source : Viblo