Canonicalize JavaScript Libraries
Configuration
The 'Canonicalize JavaScript Libraries' filter is enabled by specifying:
- Apache:
ModPagespeedEnableFilters canonicalize_javascript_libraries
- Nginx:
pagespeed EnableFilters canonicalize_javascript_libraries;
in the configuration file.
Description
This filter identifies popular JavaScript libraries that can be replaced with ones hosted for free by a JavaScript library hosting service — by default the Google Hosted Libraries. This has several benefits:
- Most important, first-time site visitors can benefit from browser caching, since they may have visited other sites making use of the same service to obtain the libraries.
- The JavaScript hosting service acts as a content delivery network for the hosted files, reducing load on the server and improving browser load times.
- There are no charges for the resulting use of bandwidth by site visitors.
- The hosted versions of library code are generally optimized with third-party minification tools. These optimizations can make use of library-specific annotations or minification settings that aren't portable to arbitrary JavaScript code, so the libraries benefit from more aggressive optimization than can be provided by PageSpeed.
In Apache, the default set of libraries can be found in the pagespeed_libraries.conf file, which is loaded along with pagespeed.conf when Apache starts up. It contains signatures for all the Google Hosted Libraries. In Nginx, you need to convert pagespeed_libraries.conf from Apache format to Nginx format:
$ scripts/pagespeed_libraries_generator.sh > ~/pagespeed_libraries.conf
$ sudo mv ~/pagespeed_libraries.conf /path/to/nginx/configuration_files/
You also need to include it in your Nginx configuration by reference:
include pagespeed_libraries.conf;
Don't edit pagespeed_libraries.conf. Local edits will keep you from being able to update it when you update PageSpeed. Instead, add additional libraries to your main configuration file:
- Apache:
ModPagespeedLibrary 43 1o978_K0_LNE5_ystNklf \
    //www.modpagespeed.com/rewrite_javascript.js
- Nginx:
pagespeed Library 43 1o978_K0_LNE5_ystNklf //www.modpagespeed.com/rewrite_javascript.js;
The general format of these entries is:
- Apache:
ModPagespeedLibrary bytes MD5 canonical_url
- Nginx:
pagespeed Library bytes MD5 canonical_url;
Here bytes is the size in bytes of the library after minification by PageSpeed, and MD5 is the MD5 hash of the library after minification. Minification controls for differences in whitespace that may occur when the same script is obtained from different sources. The canonical_url is the hosting service URL used to replace occurrences of the script.
Note that the canonical URL in the above example is protocol-relative; this means the data will be fetched using the same protocol (http or https) as the containing page. Because older browsers don't handle protocol-relative URLs reliably, PageSpeed resolves a protocol-relative library URL to an absolute URL based on the protocol of the containing page. Do not use http canonical URLs in configurations that may serve content over https, or the rewritten pages will expose your site to attack and trigger a mixed-content warning in the browser. Similarly, avoid using https URLs unless you know that the resulting library will eventually be fetched from a secure page, as SSL negotiation adds overhead to the initial request.
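The resolution of a protocol-relative URL described above follows the standard rules for relative references. The following is a minimal Python sketch of that behavior, not PageSpeed's own code; the function name resolve_library_url and the example.com page URLs are illustrative assumptions.

```python
from urllib.parse import urljoin

# Canonical URL taken from the configuration example above.
CANONICAL = "//www.modpagespeed.com/rewrite_javascript.js"


def resolve_library_url(page_url: str, canonical_url: str) -> str:
    """Resolve a canonical library URL against the containing page's URL.

    A protocol-relative URL (one starting with "//") inherits the scheme
    of the page; an absolute URL is returned unchanged.
    """
    return urljoin(page_url, canonical_url)


if __name__ == "__main__":
    # Hypothetical pages served over http and https.
    print(resolve_library_url("http://www.example.com/index.html", CANONICAL))
    print(resolve_library_url("https://www.example.com/index.html", CANONICAL))
```

The same canonical entry therefore yields an http URL on an ordinary page and an https URL on a secure page, which is why a protocol-relative entry suits sites that serve both.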
Additional library configuration metadata can be generated with the help of the pagespeed_js_minify utility installed along with PageSpeed. To use this utility, you will need a local copy of the JavaScript code that you wish to replace. If this is stored in library.js, you can generate bytes and MD5 as follows:
- Apache:
$ pagespeed_js_minify --print_size_and_hash library.js
- Nginx:
$ cd /path/to/psol/lib/Release/linux/ia32/
$ pagespeed_js_minify --print_size_and_hash library.js
If you're using the new JavaScript minifier, add the --use_experimental_minifier argument to pagespeed_js_minify. If you're using the old minifier, add --nouse_experimental_minifier. (As of 1.10.33.0, --use_experimental_minifier is the default. Previously, --nouse_experimental_minifier was.) The default pagespeed_libraries.conf includes hashes for both the old and new minifiers.
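Once the size and hash are known, the configuration entry can be assembled in either server's syntax. The following Python sketch is only a convenience for formatting the directive; the function name library_directive is an assumption, and the size and hash must be copied by hand from the utility's output, since the sketch neither runs the utility nor parses what it prints.

```python
def library_directive(server: str, size: int, md5: str, canonical_url: str) -> str:
    """Format a library entry for the main configuration file.

    server is "apache" or "nginx"; size and md5 are the values reported
    by pagespeed_js_minify for the minified library.
    """
    if size <= 0:
        raise ValueError("size must be a positive byte count")
    if server == "apache":
        return f"ModPagespeedLibrary {size} {md5} {canonical_url}"
    if server == "nginx":
        # Nginx directives end with a semicolon.
        return f"pagespeed Library {size} {md5} {canonical_url};"
    raise ValueError(f"unknown server type: {server!r}")


if __name__ == "__main__":
    # Values from the example entry earlier in this section.
    url = "//www.modpagespeed.com/rewrite_javascript.js"
    print(library_directive("apache", 43, "1o978_K0_LNE5_ystNklf", url))
    print(library_directive("nginx", 43, "1o978_K0_LNE5_ystNklf", url))
```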
This filter is based on the best practices of optimizing browser caching and reducing payload size.
Operation
In order to identify a library and canonicalize its URL, PageSpeed must of course be able to fetch the JavaScript code from the URL on the original page. Because library canonicalization identifies libraries solely by their size and hash signature, it is not necessary to authorize PageSpeed to fetch content from the domain hosting the canonical content itself. This means that it is safe to use this filter behind a reverse proxy or in other situations where network access by PageSpeed is deliberately restricted. Browsers visiting the site will fetch the content from the canonical URL, but PageSpeed itself does not need to do so.
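The idea of matching by size and hash alone can be illustrated with a small lookup table. This is a conceptual Python sketch, not PageSpeed's implementation: the function names are hypothetical, the input is assumed to be already minified, and a hex MD5 digest stands in for PageSpeed's own hash encoding (the example signature in the configuration above is clearly not a hex digest).

```python
import hashlib
from typing import Dict, Optional, Tuple

# A signature is (size in bytes, hash) of the minified library.
Signature = Tuple[int, str]


def signature(minified: bytes) -> Signature:
    """Compute a stand-in signature; the hex encoding is an assumption."""
    return (len(minified), hashlib.md5(minified).hexdigest())


def register(table: Dict[Signature, str], minified: bytes, canonical_url: str) -> None:
    """Record the canonical URL for a known library."""
    table[signature(minified)] = canonical_url


def lookup(table: Dict[Signature, str], minified: bytes) -> Optional[str]:
    """Return the canonical URL if the script matches a known signature.

    Only the bytes fetched from the original page URL are needed; the
    canonical URL itself is never fetched.
    """
    return table.get(signature(minified))


if __name__ == "__main__":
    table: Dict[Signature, str] = {}
    register(table, b"function f(){}", "//cdn.example.com/f.js")  # hypothetical entry
    print(lookup(table, b"function f(){}"))
    print(lookup(table, b"function g(){}"))
```

Because the lookup depends only on the locally fetched script, the table works unchanged when outbound access to the hosting service is blocked.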
Examples
You can see the filter in action at www.modpagespeed.com on this example.
If the HTML document looks like this:
<html>
  <head>
    <script src="jquery_1_8.js"></script>
    <script src="a.js"></script>
    <script src="b.js"></script>
  </head>
  <body>
    ...
  </body>
</html>
Then, assuming jquery_1_8.js was an unminified copy of the jQuery library and a.js and b.js contained site-specific code that made use of jQuery, the page would be rewritten as follows:
<html>
  <head>
    <script src="http://ajax.googleapis.com/ajax/libs/jquery/1.8.3/jquery.min.js"></script>
    <script src="a.js"></script>
    <script src="b.js"></script>
  </head>
  <body>
    ...
  </body>
</html>
The library URL has been replaced by a reference to the canonical minified version hosted on ajax.googleapis.com. Note that canonical libraries do not participate in most other JavaScript optimizations. For example, if Combine JavaScript is also enabled, the above page will be rewritten as follows:
<html>
  <head>
    <script src="http://ajax.googleapis.com/ajax/libs/jquery/1.8.3/jquery.min.js"></script>
    <script src="http://www.example.com/a.js+b.js.pagespeed.jc.zYiUaxFS8I.js"></script>
  </head>
  <body>
    ...
  </body>
</html>
The canonical library is not combined with the other two JavaScript files, since this would lose the bandwidth and caching benefits of fetching it from the canonical URL.
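The interaction in the example above can be pictured as a grouping step over the page's scripts. The Python sketch below is a simplified illustration under stated assumptions, not how Combine JavaScript works internally: the function plan_scripts is hypothetical, library recognition is assumed to have happened already (the mapping from original URL to canonical URL is given), and only adjacent site scripts are grouped so that execution order is unchanged.

```python
from typing import Dict, List


def plan_scripts(script_urls: List[str], canonical: Dict[str, str]) -> List[List[str]]:
    """Group scripts in page order.

    A recognized library becomes its own single-item group holding its
    canonical URL; runs of adjacent site scripts form combine candidates.
    """
    groups: List[List[str]] = []
    pending: List[str] = []  # adjacent site scripts seen so far
    for url in script_urls:
        if url in canonical:
            if pending:
                groups.append(pending)
                pending = []
            # The canonical library is kept separate and never combined.
            groups.append([canonical[url]])
        else:
            pending.append(url)
    if pending:
        groups.append(pending)
    return groups


if __name__ == "__main__":
    jquery = "http://ajax.googleapis.com/ajax/libs/jquery/1.8.3/jquery.min.js"
    print(plan_scripts(["jquery_1_8.js", "a.js", "b.js"], {"jquery_1_8.js": jquery}))
```

For the example page this yields one group containing only the canonical jQuery URL, followed by one group containing a.js and b.js.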
If defer_javascript is enabled and the library's <script> tag is not tagged with data-pagespeed-no-defer, the canonicalized library is deferred.
Requirements
Only complete, unmodified libraries referenced by <script> tags in the HTML will be rewritten. Libraries that are loaded by other means (for example by injecting a loader script) or that have been modified will not be canonicalized.
Risks
You must ensure that you abide by the terms of service of the providers of the canonical content before enabling canonicalization. The terms of service for the default configuration can be found at https://developers.google.com/speed/libraries/terms.
The canonical URL refers to a third-party domain; this can cause additional DNS lookup latency the first time a library is loaded. This is mitigated by the fact that the canonical copy of the data is shared among multiple sites.
The initial request for a canonical URL will contain a Referer: header with the URL of the referring page. This permits the host of the canonical content to see a subset of traffic to your site (the first load of a page on your site that contains an identified library by a browser that does not already have that library in its cache). The provider should describe how this data is used in its terms of service. The terms of service for the default configuration can be found at https://developers.google.com/speed/libraries/terms. Again, this risk is mitigated by the fact that canonical libraries are shared among multiple sites; a popular library is likely to already reside in the browser cache.
Sites serving content on both http and https URLs must use protocol-relative canonical URLs as shown above. Fetching a library insecurely from a secure page exposes a site to attack. Fetching a library securely from an ordinary page can increase load time due to SSL overheads.
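The scheme rules above lend themselves to a quick check of custom library entries. The following Python sketch is a hypothetical helper, not part of PageSpeed; the function name canonical_url_risk and its flags are assumptions, and the returned messages simply restate the risks described in this section.

```python
from typing import Optional


def canonical_url_risk(canonical_url: str, serves_http: bool, serves_https: bool) -> Optional[str]:
    """Return a warning for a risky canonical URL scheme, or None if fine.

    serves_http and serves_https say which protocols the site uses to
    serve pages that may reference the library.
    """
    if canonical_url.startswith("//"):
        # Protocol-relative: always matches the containing page.
        return None
    if canonical_url.startswith("http://") and serves_https:
        return "insecure library on a secure page: exposes the site to attack"
    if canonical_url.startswith("https://") and serves_http:
        return "secure library on an ordinary page: SSL overhead may increase load time"
    return None


if __name__ == "__main__":
    # Hypothetical canonical URLs for a site serving both http and https.
    for url in ("//cdn.example.com/lib.js",
                "http://cdn.example.com/lib.js",
                "https://cdn.example.com/lib.js"):
        print(url, "->", canonical_url_risk(url, serves_http=True, serves_https=True))
```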
It is theoretically possible to craft a JavaScript file whose minified size and hash exactly match those of a canonical library, but whose code behaves differently. In such a case the crafted file will be replaced with the canonical (widely used) library, which will break the page that references the crafted JavaScript.