Extend Cache
Configuration
The 'Extend Cache' filter is enabled by specifying:
- Apache:
ModPagespeedEnableFilters extend_cache
- Nginx:
pagespeed EnableFilters extend_cache;
in the configuration file. This is equivalent to enabling all three of extend_cache_images, extend_cache_scripts, and extend_cache_css.
Also see: extend_cache_pdfs.
Description
'Extend Cache' seeks to improve the cacheability of a web page's resources without compromising the ability of site owners to change the resources and have those changes propagate to users' browsers.
This filter applies the "optimize caching" best practice to the browser cache.
Operation
The 'Extend Cache' filter rewrites the URL references in the HTML page to include a hash of the resource content (if rewrite_css is enabled then image URLs in CSS will also be rewritten). Thus if the site owners change the resource content, then the URL for the rewritten resource will also change. The old content in the user's browser cache will not be referenced again, because it will not match the new name.
The 'Extend Cache' filter also rewrites the HTTP header to extend the max-age value of the cacheable resource to 31536000 seconds, which is one year.
For example, for the following HTML tag/HTTP header pair:
HTML tag   : <img src="images/logo.gif">
HTTP header: Cache-Control: public, max-age=300
PageSpeed will rewrite these into:
HTML tag   : <img src="images/logo.gif.pagespeed.ce.xo4He3_gYf.gif">
HTTP header: Cache-Control: public, max-age=31536000
PageSpeed uses the origin cache time-to-live (TTL), in this case 300 seconds, to periodically re-examine the content to see if it has changed. If the content changes, then its hash also changes, and so does the rewritten URL. It is therefore safe to serve the hashed URL with a long timeout; PageSpeed uses one year.
If the site owners change the logo, then PageSpeed will notice within 5 minutes and begin serving a different URL to users. But if the content does not change, then the hash will not change, and the copy in each user's browser will still be valid and reachable.
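The renaming scheme described above can be sketched in a few lines of Python. This is only an illustration of the idea, not PageSpeed's implementation: the function names are hypothetical, and the choice of SHA-256 with a 10-character URL-safe base64 prefix is an assumption, since this page does not say which hash function PageSpeed uses.

```python
import base64
import hashlib
import posixpath


def content_hash(content: bytes, length: int = 10) -> str:
    """Return a short URL-safe digest of the resource bytes.

    Assumption: SHA-256 plus base64 is a stand-in for whatever hash
    PageSpeed really uses.
    """
    digest = hashlib.sha256(content).digest()
    return base64.urlsafe_b64encode(digest).decode("ascii")[:length]


def cache_extended_url(url: str, content: bytes) -> str:
    """Build a name of the form <url>.pagespeed.ce.<hash>.<ext>.

    Assumes the URL ends in a file extension such as ".gif".
    """
    ext = posixpath.splitext(url)[1].lstrip(".")
    return f"{url}.pagespeed.ce.{content_hash(content)}.{ext}"


old_url = cache_extended_url("images/logo.gif", b"old logo bytes")
new_url = cache_extended_url("images/logo.gif", b"new logo bytes")
print(old_url)
print(new_url)
```

Identical bytes always yield the same URL, so copies already in browser caches remain reachable; different bytes yield a different URL, so stale copies are simply never requested again.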
Thus the site owners are still in complete control of how rapidly they can deploy changes to the site, but this does not affect the effectiveness of the browser cache. Decreasing the TTL only affects how often PageSpeed will need to re-examine the resource.
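The role of the origin TTL can be illustrated with a small sketch. It assumes a hypothetical `HashTracker` class that re-reads a resource only once the origin TTL has elapsed; the class, its method names, and the injected `fetch` and `clock` callables are inventions for this example and do not correspond to PageSpeed internals.

```python
import hashlib
from typing import Callable, Dict, Tuple


class HashTracker:
    """Remember each resource's content hash until its origin TTL expires."""

    def __init__(self, fetch: Callable[[str], bytes], clock: Callable[[], float]):
        self._fetch = fetch          # returns the current bytes for a URL
        self._clock = clock          # returns the current time in seconds
        self._cache: Dict[str, Tuple[str, float]] = {}  # url -> (hash, expiry)

    def current_hash(self, url: str, ttl_seconds: float) -> str:
        now = self._clock()
        cached = self._cache.get(url)
        if cached is not None and now < cached[1]:
            # Still within the origin TTL: reuse the hash without re-reading.
            return cached[0]
        # TTL elapsed (or first sight): re-read the content and re-hash it.
        digest = hashlib.sha256(self._fetch(url)).hexdigest()[:10]
        self._cache[url] = (digest, now + ttl_seconds)
        return digest


# Usage with a fake origin and a fake clock.
origin = {"images/logo.gif": b"old logo"}
now = [0.0]
tracker = HashTracker(lambda u: origin[u], lambda: now[0])

first = tracker.current_hash("images/logo.gif", 300)
origin["images/logo.gif"] = b"new logo"   # site owner changes the logo
now[0] = 120.0
unchanged = tracker.current_hash("images/logo.gif", 300)  # TTL not yet expired
now[0] = 300.0
updated = tracker.current_hash("images/logo.gif", 300)    # TTL expired, re-read
```

In this sketch a shorter TTL only makes the re-read happen sooner; it has no effect on the one-year lifetime given to the hashed URL.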
Cache extension is built into other PageSpeed filters as well. All filters that rewrite resources include a content hash in the generated URL, and serve the resource with a 1-year TTL. The purpose of this filter is to extend cache lifetimes for all resources that are not otherwise optimized.
Example
You can see the filter in action at www.modpagespeed.com for cache-extending resources in HTML and in CSS.
Limitations
Cache extension is only applied to resources that are publicly cacheable to begin with. Cache extension is not done on resources that have Cache-Control: private or Cache-Control: no-cache.
This can be overridden with:
- Apache:
ModPagespeedForceCaching on
- Nginx:
pagespeed ForceCaching on;
This switch is intended for experimental purposes only, to help evaluate the benefit of cache extension against the effort of adding cache-control headers to resources. Live traffic should not be served this way.
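The eligibility rule above can be sketched as a small header check. This is a simplified illustration, not PageSpeed's actual logic: the function name is hypothetical, only the two directives named in this section are examined, and the `force_caching` parameter is merely a stand-in for the effect of the ForceCaching switch.

```python
def is_cache_extendable(cache_control: str, force_caching: bool = False) -> bool:
    """Return True if a resource with this Cache-Control value may be cache-extended.

    Simplification: only "private" and "no-cache" are treated as blockers,
    matching the two cases listed in the text.
    """
    if force_caching:
        # Experimental override: ignore the origin's caching headers.
        return True
    directives = {
        part.strip().split("=", 1)[0].strip().lower()
        for part in cache_control.split(",")
        if part.strip()
    }
    return not ({"private", "no-cache"} & directives)


print(is_cache_extendable("public, max-age=300"))         # True
print(is_cache_extendable("private, max-age=300"))        # False
print(is_cache_extendable("no-cache"))                    # False
print(is_cache_extendable("private", force_caching=True)) # True
```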
The following configuration file fragment demonstrates how to configure caching headers in Apache. This is how the mod_pagespeed_example directory is set up.
# These caching headers are set up for the mod_pagespeed example, and
# also serve as a demonstration of good values to set for the entire
# site, if it is to be optimized by mod_pagespeed.
<Directory /var/www/mod_pagespeed_example>
  # Any caching headers set on HTML are ignored, and all HTML is served
  # uncacheable. PageSpeed rewrites HTML files each time they are served. The
  # first time mod_pagespeed sees an HTML file, it generally won't optimize it
  # fully. It will optimize better after the second view. Caching defeats this
  # behavior.

  # Images, styles, and JavaScript are all cache-extended for
  # a year by rewriting URLs to include a content hash. mod_pagespeed
  # can only do this if the resources are cacheable in the first place.
  # The origin caching policy, set here to 10 minutes, dictates how
  # frequently mod_pagespeed must re-read the content files and recompute
  # the content-hash. As long as the content doesn't actually change,
  # the content-hash will remain the same, and the resources stored
  # in browser caches will stay relevant.
  <FilesMatch "\.(jpg|jpeg|gif|png|js|css)$">
    Header set Cache-control "public, max-age=600"
  </FilesMatch>
</Directory>
The equivalent configuration for Nginx would be:
# Make sure this goes after the .pagespeed. location regexp in your
# configuration file so that .pagespeed. resources don't get this header
# applied.
location /mod_pagespeed_example {
  location ~* \.(jpg|jpeg|gif|png|js|css)$ {
    add_header Cache-Control "public, max-age=600";
  }
}
Risks
This filter is considered low risk. However, the rewritten URL differs from the original URL, so JavaScript that uses URLs as templates can stop working. For example, consider a site that has <input type=image src="button.gif"> and runs JavaScript that turns button.gif into button-hover.gif when the user hovers over the button. With cache extension enabled, or any filter that changes the URLs of images, PageSpeed would replace the HTML fragment with something like <input type=image src="button.gif.pagespeed.ce.xo4He3_gYf.gif">. If the script was coded as "insert '-hover' before the final '.'" then it would construct an invalid hover URL of button.gif.pagespeed.ce.xo4He3_gYf-hover.gif.
If this is a problem on your site, consider In-Place Resource Optimization.
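The failure mode in the hover example can be reproduced with a short sketch. Python stands in here for the page's JavaScript, and `hover_url` is a hypothetical helper that implements the "insert '-hover' before the final '.'" rule quoted above.

```python
def hover_url(src: str) -> str:
    """Insert '-hover' before the final '.' of the image URL."""
    stem, dot, ext = src.rpartition(".")
    return f"{stem}-hover{dot}{ext}"


# Works on the original URL.
print(hover_url("button.gif"))
# → button-hover.gif

# Breaks on the cache-extended URL: the suffix lands inside the hash segment.
print(hover_url("button.gif.pagespeed.ce.xo4He3_gYf.gif"))
# → button.gif.pagespeed.ce.xo4He3_gYf-hover.gif
```

The second result is not a URL that PageSpeed or the origin server knows how to serve, which is why scripts that derive one URL from another need attention when URL-rewriting filters are enabled.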
When applied to JavaScript files, this filter is sensitive to AvoidRenamingIntrospectiveJavascript. For example, a JavaScript file that calls document.getElementsByTagName('script') will not be cache-extended.