Go offline with application cache

UPDATE 15 June 2011: .appcache is now the recommended file extension for cache manifests. Please update your filename, the manifest attribute on the html element, and the MIME type in your server config.

HTML5 introduces new methods for enabling a web site or web application to function without a network connection. When you’re working on a mobile connection and your signal drops, or you just have no connection to the internet for whatever reason, having some level of access is better than nothing. In this article, we’ll look at how the application cache can store resources to be used by the browser when it’s offline, granting your users partial access to your web site or application.

The application cache is controlled by a plain text file called a manifest, which contains a list of resources to be stored for use when there is no network connectivity. The list can also define the conditions for caching, such as which pages should never be cached and even what to show the user when he follows a link to an uncached page.

If the user goes offline but has visited the site while online, the cached resources will be loaded so the user can still view the site in a limited form. By carefully considering the contents of your manifest file, you can offer a suitable web experience to a disconnected user.

The manifest file

Let’s start with an example of a full manifest file. (Don’t worry, I’ll explain it all in detail!)

CACHE MANIFEST
      
# This is a comment

CACHE:
/css/screen.css
/css/offline.css
/js/screen.js
/img/logo.png

http://example.com/css/styles.css

FALLBACK:
/ /offline.html

NETWORK:
*
An example application cache manifest file

Each directive is placed on a new line, with comments prefixed by a hash (#). The first line, CACHE MANIFEST, tells the browser that this is a manifest file. The uppercased lines with trailing colons are section headings.

There are three different sections in a manifest file:

CACHE
A list of explicit URLs to request and store
FALLBACK
What to do when an offline user attempts to access an uncached file
NETWORK
Which resources are available only while online

Each section serves a specific purpose that you must understand in order to successfully and effectively cache your resources.

CACHE

The CACHE section is considered the default — i.e., if no section heading has been defined, the browser will assume this is the CACHE section. Beneath this heading, you can list URIs to resources you want the browser to download and cache for offline use, including URIs hosted externally.

CACHE MANIFEST

/css/screen.css
/css/offline.css
/js/screen.js
/img/logo.png

http://example.com/css/widget.css
Telling the browser to cache some stylesheets, an image, and a JavaScript file

In this example, I’ve omitted the CACHE: heading to take advantage of the default behaviour. I have provided the browser with four paths relative to the root of the domain plus one external resource. When the browser downloads the cache manifest file, it will read these five resources, fetch them over HTTP, and store them for later use.
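
To make the default-section behaviour concrete, here is a minimal sketch of a manifest parser in JavaScript. The function name parseManifest and the shape of the returned object are my own assumptions for illustration; this is a simplification, not how any browser actually implements parsing.

```javascript
// Minimal sketch of a cache manifest parser (not a full implementation of the spec).
function parseManifest(text) {
  const lines = text.split(/\r?\n/).map((line) => line.trim());
  if (lines[0] !== 'CACHE MANIFEST') {
    throw new Error('First line must be CACHE MANIFEST');
  }

  const sections = { CACHE: [], FALLBACK: [], NETWORK: [] };
  let current = 'CACHE'; // entries before any heading belong to CACHE

  for (const line of lines.slice(1)) {
    if (line === '' || line.startsWith('#')) continue; // blank line or comment

    const heading = /^(CACHE|FALLBACK|NETWORK):$/.exec(line);
    if (heading) {
      current = heading[1]; // switch to the named section
    } else if (current === 'FALLBACK') {
      // FALLBACK lines hold two values: URI to match, then replacement
      const [match, fallback] = line.split(/\s+/);
      sections.FALLBACK.push({ match, fallback });
    } else {
      sections[current].push(line);
    }
  }
  return sections;
}

const manifest = parseManifest([
  'CACHE MANIFEST',
  '# no CACHE: heading needed',
  '/css/screen.css',
  '/img/logo.png',
  'NETWORK:',
  '*',
].join('\n'));

console.log(manifest.CACHE); // the two paths listed before any heading
```

Note that the two paths land in the CACHE list even though no CACHE: heading appears in the input.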

Every single resource that you want to cache explicitly should be listed here, right down to the last image. The browser is not aware of a resource unless you provide the full path to it. This means you can’t use wildcards. If you list /images/* as a resource, the browser will request that URI as if you typed it into your address bar.

But don’t run off and shove URIs for every single page on your site into your manifest! When a user visits a page that points to the manifest file, that page will also be cached. This means that if you want to allow users access to pages they’ve already viewed, just make those pages point to the manifest file and the browser will cache them appropriately.

Now let’s tell the browser what to do with uncached resources.

FALLBACK

The FALLBACK section tells the browser what to serve when the user tries to access an uncached resource while offline. Because of this, it looks a bit different from CACHE and NETWORK: each line contains two values separated by a space. The first value is the request URI to match, and the second is the resource served on a match. The browser caches the resource on the right for offline use, so this should be an explicit path.

Lost? Take a look at this example:

CACHE MANIFEST

FALLBACK:
/status.html /offline.html
Declaring a FALLBACK section

On the line below FALLBACK:, we have the URI “/status.html” followed by a second URI, “/offline.html”. We’re telling the browser that when an offline user requests a URI matching “/status.html”, it should instead serve the cached file “offline.html”.

However, the FALLBACK section can be far more powerful:

CACHE MANIFEST

FALLBACK:
/ /offline.html
Matching all resources

In this example, I’ve dropped “status.html” and simply provided “/” as the request URI to match. Now when an offline user requests a resource that matches “/”, he will be served “offline.html” in its place. So if the user clicked on a link for “/status.html”, “/about.html”, or even “/my/nested/resource.html”, the browser would match the “/” at the start and serve up “offline.html”. Since I’ve used the root path, every uncached resource under this domain will point to “offline.html”.

Errata 23 June 2011: this article has been corrected, as you can’t use a wildcard with the FALLBACK or NETWORK namespaces – though you can use the asterisk symbol under NETWORK, as it’s a special flag to indicate that all URLs should be whitelisted.

Unlike the CACHE section, both the FALLBACK and NETWORK namespaces support a prefix rule that aids their URI matching. For example, if an offline user requests anything under the /images/avatars/ directory and the asset is unavailable, the browser can serve up an alternative.

CACHE MANIFEST

FALLBACK:
/images/avatars/ /offline_avatar.png
A smarter fallback declaration

The line below FALLBACK: tells the browser to serve “/offline_avatar.png” in place of any unavailable user-uploaded avatar under “/images/avatars/”.
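
The prefix rule can be sketched in a few lines of JavaScript. The function resolveFallback and the fallbacks array are hypothetical names of mine, and combining the “/” rule with the avatar rule in one manifest is my own assumption. As I read the specification, when several prefixes match, the longest one wins, so the sketch does the same.

```javascript
// Sketch of FALLBACK prefix matching; `fallbacks` mirrors a FALLBACK section.
function resolveFallback(path, fallbacks) {
  let best = null;
  for (const entry of fallbacks) {
    const matches = path.startsWith(entry.match);
    // Prefer the longest matching prefix so specific rules beat "/".
    if (matches && (best === null || entry.match.length > best.match.length)) {
      best = entry;
    }
  }
  return best ? best.fallback : null; // null means no fallback applies
}

const fallbacks = [
  { match: '/', fallback: '/offline.html' },
  { match: '/images/avatars/', fallback: '/offline_avatar.png' },
];

console.log(resolveFallback('/images/avatars/jane.png', fallbacks)); // /offline_avatar.png
console.log(resolveFallback('/my/nested/resource.html', fallbacks)); // /offline.html
```

Remember that a fallback only comes into play for resources that are not already in the cache and cannot be fetched.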

Remember when I said that any document referencing the manifest will also be cached? Well, you can use this to your advantage! You can cache each page the user visits while online so that they will have access to that page while offline. Then anything they didn’t view will be caught by the FALLBACK section. This keeps you from explicitly stating you want all your pages cached, and, more importantly, avoids the huge performance penalty of serving all the resources you want cached every time someone first visits your site.

NETWORK

Finally, we have the NETWORK section, used to tell the browser explicitly which resources are only available while online. Most manifests use the asterisk * symbol here, a special flag meaning all resources that are not cached will be requested over the network. Without it, a page served from the application cache cannot load resources that are not covered by the manifest, even while online. Alternatively, we can whitelist specific URL prefixes, like all the avatars if we wish.

CACHE MANIFEST

NETWORK:
*
Adding a NETWORK section

You can explicitly define resources not to cache by providing a list of URIs — essentially a whitelist of online-only assets.

CACHE MANIFEST

NETWORK:
register.php
login.php
Excluding certain pages from caching
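
To see how the three sections interact, here is a simplified sketch of how a page served from the application cache might treat a same-origin request. The function classifyRequest, the manifest object shape, and the returned labels are all my own inventions for illustration; the order of the checks follows my reading of the specification and leaves out many details.

```javascript
// Simplified sketch: decide what happens to a request made by a cached page.
function classifyRequest(path, manifest) {
  const { cache, fallback, network } = manifest;
  const hasPrefix = (prefixes) => prefixes.some((p) => path.startsWith(p));

  if (cache.includes(path)) return 'cache'; // explicitly cached resource
  if (hasPrefix(network.filter((p) => p !== '*'))) return 'network'; // whitelisted prefix
  if (hasPrefix(fallback.map((f) => f.match))) return 'network-then-fallback';
  if (network.includes('*')) return 'network'; // wildcard flag
  return 'blocked'; // not covered by the manifest at all
}

const strict = {
  cache: ['/css/screen.css'],
  fallback: [{ match: '/images/avatars/', fallback: '/offline_avatar.png' }],
  network: ['/login.php'],
};
const open = { ...strict, network: ['/login.php', '*'] };

console.log(classifyRequest('/css/screen.css', strict)); // cache
console.log(classifyRequest('/login.php', strict)); // network
console.log(classifyRequest('/images/avatars/a.png', strict)); // network-then-fallback
console.log(classifyRequest('/about.html', strict)); // blocked
console.log(classifyRequest('/about.html', open)); // network
```

The last two calls show why the asterisk matters: the same uncached page is blocked under the strict manifest but fetched normally once the wildcard is present.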

Serving the manifest

You can reference a manifest file on a web page by adding the manifest attribute to your opening <html> tag. The browser will only cache pages that include this attribute (in addition to those specified in the manifest itself, though in that instance, the user would have to visit a page including the manifest in order for the browser to be aware of it).

<!DOCTYPE html>
<html lang="en" manifest="/offline.appcache">
  <!-- your HTML document -->
</html>
Referencing the manifest file from an HTML page

The linked file should also be served with a MIME-type of text/cache-manifest. If you’re using Apache as your web server, add this to your .htaccess file:

AddType text/cache-manifest .appcache

And there you have it! Supporting browsers will retrieve the manifest file and cache each item on the list for offline use. Won’t your parents be proud?

Triggering a cache refresh

Once a cache has been successfully downloaded, the browser will retain those assets until either the user clears the cache or you trigger an update. Triggering an update with your manifest file requires that the contents of that file change, not just the assets themselves.

Updating the assets on your server will not trigger a cache update. You must modify the manifest file.

If you’re adding or removing resources completely, you’ll have to edit your manifest file anyway. But what if you’re just amending an already cached stylesheet?

This is where comments come in handy. Just throw in a simple version number comment that you change when you want to trigger an update:

CACHE MANIFEST
      
# Version 9

CACHE:
/css/screen.css
A version comment in a manifest file

The next time you want to trigger a cache refresh, just increment the version number. When the user next visits the online version of a page including this manifest, the browser will re-download the manifest file, notice the change, download the listed assets, and replace the existing cache.
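
If you would rather not edit the number by hand, a build step could do it for you. This is a minimal sketch assuming the comment follows the exact “# Version N” format used above; the function name bumpVersion is hypothetical.

```javascript
// Sketch: increment the "# Version N" comment in a manifest string.
function bumpVersion(manifestText) {
  const pattern = /^# Version (\d+)$/m; // "m" lets ^ and $ match per line
  if (!pattern.test(manifestText)) {
    throw new Error('No "# Version N" comment found');
  }
  return manifestText.replace(pattern, (_, n) => `# Version ${Number(n) + 1}`);
}

const before = 'CACHE MANIFEST\n\n# Version 9\n\nCACHE:\n/css/screen.css\n';
console.log(bumpVersion(before)); // same manifest with "# Version 10"
```

Any change to the file's contents will do, so a timestamp or a hash of the listed assets would work just as well as a counter.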

Browser bug: Firefox caches the manifest file itself and will not update it even if the manifest has changed on the server. With some server config wizardry, you can tell browsers that the cache of the manifest file is instantly invalidated and should be requested from the server every time it’s referenced. Add this to your .htaccess to put Firefox in its place:

<IfModule mod_expires.c>
  ExpiresActive On
  ExpiresByType text/cache-manifest "access plus 0 seconds"
</IfModule>
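
If you are not on Apache, the same two ideas (the right MIME type, plus no stale copies of the manifest) carry over to other servers. Here is a rough sketch of the header logic in JavaScript; the function name headersFor is hypothetical and the exact header values are my assumption of a reasonable equivalent, not a translation of what mod_expires emits.

```javascript
// Sketch: pick response headers for a requested path by file extension.
function headersFor(urlPath) {
  if (urlPath.endsWith('.appcache')) {
    return {
      'Content-Type': 'text/cache-manifest',
      // Tell browsers the manifest is stale immediately so they re-request it.
      'Cache-Control': 'max-age=0',
      Expires: '0',
    };
  }
  return {}; // leave other files to the server's defaults
}

console.log(headersFor('/offline.appcache'));
```

In a Node.js server, for instance, the result could be passed to res.writeHead(200, headersFor(req.url)) before sending the file.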

Conclusion

The application cache is a powerful beast, and to tame it you need to be clear on what’s involved. Give thought to your CACHE, FALLBACK, and NETWORK sections to provide a suitable offline experience to your users.

In a future article, we’ll show you how to use the applicationCache JavaScript object to manipulate the cache. Until then, this should be enough to get you started on the path to offline web content.

You can see a live demo using the application cache over on Doctor Remy’s HTML5 Demos. Happy caching!

34 Responses on the article “Go offline with application cache”

Pedro Morais says

You might also be interested in this blog post I wrote a while ago, explaining how we use the appcache on a fairly complex webapp: HTML5 offline webapps: a practical example.

For instance, we don’t use a version number, instead we use fingerprints of each of the files.

jujhimup says

how does this affect HTTPS pages?

jujhimup says

actually: is this also a cache – as in, if there IS a network connection available, will the app still display the cached ‘offline’ version? or is this only used when there is no connection available?

Ted says

@jujhimup: “HTTP cache headers and restrictions on caching pages served over TLS (encrypted, using https:) are overridden by manifests. Thus, pages will not expire from an application cache before the user agent has updated it, and even applications served over TLS can be made to work offline” — http://www.w3.org/TR/html5/offline.html

Simon says

What determines whether an app is ‘offline’ or not – is it about access to the actual hosting server, or is it a browser-level online/offline state (e.g Firefox’s ‘Work Offline’ checkbox)?

E.g an app might be reliant on a server running inside a firewall, and so should be offline when out on the public internet. Or dependent on a server running on the physical machine, and thus working even when the machine is isolated from the world (Firefox auto-switching to offline mode has caused me problems more than once).

Jovica says

I need this for my presentation for a final exam at college so I’d be happy if you can answer me as soon as you can.
I have web application with only one page, written in GWT. GWT still doesn’t have support for making cache manifest file and I can’t write down all files cause their names are changed from build to build. So I want to show only /offline.html page if the user is off-line. Is there a way to do that or not?

Dmitri says

I wouldn’t recommend to force browser to re-download manifest file for each request. Seems like a waste of resources – another HTTP call. Instead your build tools could rename manifest file when you deploy new version of your app. Something like offline.manifest_{REV_NUM}. Also it is a good practice to put this file on a separate static domain together with stylesheets, images and javascript files. This will avoid sending useless cookie data for each HTTP call.

Another note about your demo app. When I first time visit your page I see both images you have in your markup. Then I switch off WIFI and reload the page and see no images. However, when I switch on WIFI back and reload the page, I still see no images. Tested in latest Safari on Mac.

Jukka K. Korpela says

My experiments suggest that the biggest problem with application caches is that they work all too well, on browsers that support them. That is, the files remain in the cache, no matter what.

Changing the manifest file on the server does not help if browsers do not request it. And it appears that they mostly don’t, if the server does not specify some special caching-related information for it when first requested. In the absence of explicit caching info for the manifest file, browsers will estimate some suitable age for it, so they will use its cached version.

This can be fixed easily by setting e.g.
Expires: 0
in the HTTP headers for the manifest file (dirty but simple and working). But the problem is that it takes time to figure it out (none of the three books on HTML5 I’ve read mentions anything about this). Moreover, an author might have no way of affecting the HTTP headers.

Jochen Drechsler says

There are limitations on the size of the appcache in different browsers.
This was reported by several resources (for example: http://www.thecssninja.com/javascript/how-to-create-offline-webapps-on-the-iphone).

However, it is not clear to me what this limit refers to.
Is it a general limit in the browser for all apps?
Or is it a limit for a single cache group (all files associated with one manifest)?
If so: Can I use two html-files with different manifest-files to cache more data? And if I do so, would it be possible to reference the two files within each other in such a way that I am able to navigate from one to the other while offline?

In other words: The idea is to split the webapp over several html-files in order to be able to cache more data that can be used offline.

TMC says

Assume the user has previously cached resources in the HTML5 offline cache. If the user reloads your page and is ONLINE, does it leverage the offline cache if the resource is cached there?

I’m trying to see if the HTML5 offline cache can be used as a more reliable caching mechanism than relying solely on the regular browser cache.

Lee says

Hi,

I have an online app that allows users to view up to 9 pdf’s. These pdf’s are set in a piece of software, so they can be changed all the time. I need to make the app work offline, and hence the user needs to be able to open the pdf’s that they currently have selected. Any ideas on how to do this? The cache manifest I have is working for everything on the app, apart from displaying the pdf’s.

nad says

I want something like this.
If the user is online, fetch pages from the server.
If the user is offline, fetch pages from the HTML5 App cache.
Can this be done with HTML5. Please comment

Tim says

This is the only page I could find with Google that has any reference to a “per cache group / manifest file” limit (in the comments by Jochen Drechsler). Has anyone tested this? If so, it would open the way for delivering larger applications through modularisation.

ram says

Is it possible to cache a user control (.ascx file) using app.manifest? If it is possible, can you give an example?

Tarple says

This is actually the first time I am commenting on an article online. I found this to be very clear, informative and precise. It was an easy read and it definitely helped me to understand how the manifest file works. Great job!

Ravi Prakash says

I have one question about Manifest.

Is it possible to only download files that have changes in it instead of re downloading the whole list of files from the manifest.

judaica says

I’ve set a counter on the manifest cache that logs the IP address of the browser. This way when the same IP visits it automatically refreshes the content. It works well.

Steve says

Super helpful, but I’m trying to figure out how to do this with a tiny wordpress site that I’m treating as a mobile “app.” Since everything is served from a database, would the same rules apply or does this just work for flat HTML?

Rock ON says

@Steve says: There is plugin for wordpress try search their database

Does anybody know how to configure this option on forum applications ? like phpbb or mybb ?

Micsi says

Are you sure AddType and ExpiresActive work in .htaccess? At least this must be allowed…
But I think it is better and easier to use those commands in the apache vhost config if you have access to those.
You need to restart apache for those changes to come into effect.

Sree says

Hi:
Whenever there is a change in cache-manifest file i.e. a resource is changed, do I need to redeploy my web application on server or can I update the files dynamically i.e. is there any way to update the files dynamically when the server is running.

Alessandro says

Browser bug: Firefox caches the manifest file itself and will not update it even if the manifest has changed on the server.

Since Firefox 14, it seems that this bug shows up again, but only if you are caching more than 50MB. We filed a ticket. https://bugzilla.mozilla.org/show_bug.cgi?id=780878

The manifest file is fetched the first time and then Firefox will never reload it again to see if something changed.

Does someone have the same problem ?

Jon Humphrey says

First, I apologise for my lengthy post below, I’m at a loss and need anyone’s help! I would be happy to work out recompense for any and all assistance if it get’s this application “offline” and working for my client!

Here goes … I’m in the process of creating an ‘offline’ meeting notes web app that is to be used specifically on an ipad? As the project is covered by an NDA and behind a corporate firewall I will happily answer any questions you may ask, and even show code if need be, but cannot give links to see it in action.

Now, I’ve followed everything I can find about the offline HTML5 features found here and on other sites … from creating the appcache and setting mime-type expire headers to local storage and sqlite databases; we’ve even gone as far as porting the two “offline” pages from c# into html to make things completely non-server dependant as best we can!

The scenario runs like this:

We browse to the “online” aspx meeting listing page, then choose the meetings to “sync to offline” which pushes the relevant information into the local sqlite database and loads the first html page with the same list from the new “offline” local db. We can then choose a specific meeting from the list and click through to the details pertaining to that meeting on the second html page … again loaded from the “offline” local db. We can even add or edit the notes for the specific meeting and then sync those individual changes to the “online” database when a network connection is detected without any problem!

However, if we turn on Airplane mode, and try to browse to the same pages we were just on, no matter what we do … we cannot seem to get them to load! The error: “Safari cannot open the page because your ipad is not connected to the internet” is our bane!

We know we’re not connected … that’s the whole point!!!!

Could someone … anyone … please help us out … I would rip my hair out if I had any left … this just doesn’t make sense?

Thank you for your time and patience,
J

Pegi says

I’m experiencing a similar problem to Jon only it’s happening on my PC. As soon as I go offline, the browser doesn’t display the offline content it’s supposed to. Instead it gives me an error message explaining that the page can’t be displayed because there’s no connection to the Internet.

I have read and re-read every resource I can find and no one seems to mention this so I’m really stumped. There doesn’t appear to be anything wrong with my coding and the browser does ask if it is allowed to place a cache on the machine. So it appears to be working, just not working offline. Any help would really, really be appreciated.

Cody says

Here’s what my offline.manifest looks like:

CACHE MANIFEST

CACHE:
../style.css
../js/jquery-1.7.1.min.js
../js/jquery-ui-1.10.1.custom.min.js
../js/jquery-ui-1.10.1.custom.min.css
../js/underscore-min.js
../js/SDB.js
../js/ndn/ndn-js.js
../js/ndn/ndn-closures.js
../js/ndn/ndn-utils.js
../client.js
../plugins/factory/factory.js
../plugins/favicon/favicon.js
../plugins/pagefold/pagefold.js
cacheManifest.js
offline.js

NETWORK:
*

FALLBACK:

# version 1

Here’s the file that loads the manifest (note that we’re using an HTTP-Routing/MVC system so the “Master” file is “/view/welcome-visitors”):

Smallest Federated Wiki

<!– –>

I am getting the cache to work fine…

MY DILEMMA IS that I (like what “nad” says):

I want something like this.
If the user is online, fetch pages from the server.
If the user is offline, fetch pages from the HTML5 App cache.
Can this be done with HTML5. Please comment

Also, make sure you look at the cache in the browser…
In Chrome, open console/element inspector, click on “Resources” and probably at the bottom you’ll see “Application Cache”.

I’ll post back if I figure out how to do what nad was asking about.

Happy coding :-)

Cody

Cody says

wups… I guess it didn’t take my HTML.

dhanesh mane says

@judica thanks for the link, it really have nice information.

Thanks
Dhanesh Mane

Fred says

To have an online page and an offline page that both have access to the appCache is trickier than it should be, mostly because any pages pointing at the manifest will be cached implicitly (so your online page will be served from the cache, regardless of if you have specified in the FALLBACK section that it should redirect).

You could try adding a window navigator.onLine listener, but firefox will return true in many circumstances even if you have no connection (in fact I don’t know if it will ever return false). The solution I found relies on the fact that every time a page pointing at a manifest loads, it checks for a new manifest and if it can’t find the manifest (most likely because there is no internet connection) it sends an error.

So all you have to do is listen for this error, and redirect to the offline page:
window.applicationCache.addEventListener('error', function(e){
// redirect to offline.htm or whatever
}, false);

It’s not perfect, but as far as I can tell, the only time you’re going to redirect incorrectly is if the manifest changes while the update is being run:
http://www.whatwg.org/specs/web-apps/current-work/multipage/offline.html#appcacheevents

Muhammad Adnan says

@nad and @Cody: I’m looking for the same thing

“” I want something like this.
If the user is online, fetch pages from the server.
If the user is offline, fetch pages from the HTML5 App cache.
Can this be done with HTML5. Please comment “”

sahil vig says

Is there a way to programmatically clear application cache?

Mohammad Gouse says

Hi ,

My Problem in Offline manifest is,

it properly working in Chrome, Firefox (Desktop) but not working in MAC or IPAD , and not getting any error in console, Once i unplugged my network connection and trying to open the page then getting error as “Safari cannot open the page because it is not connected to the internet”

Please help me to solve the issue………….
