It’s no secret that I’m a proponent of using a shared CDN to host jQuery. As more and more sites take advantage of public CDNs for their jQuery reference, the cross-site caching benefit is becoming almost a given. However, there are a couple ways that even I recommend against using these public CDNs.
With the impending policy change on hotlinking copies of jQuery hosted on jQuery.com, I expect that at least several sites will be migrating their hotlinked script references to one of the public CDNs soon. So, I think this is a good time to address one CDN-related usage mistake that I’ve seen an uptick in lately.
Firesheep and SSL
By now, you’ve probably heard about last year’s release of Firesheep, a Firefox addon that sniffs out HTTP session cookies transmitted in cleartext, and the turmoil that it caused. Seeing someone hijack a stranger’s Twitter or Facebook identity with a single click is enough to make anyone more conscious about the security of their browsing habits.
Traditionally, most sites have used encrypted connections for authentication and sensitive information, to avoid transmitting passwords or private data in cleartext, but they generally avoid forcing SSL connections on most other pages. This lack of site-wide support is usually attributed to the (perceived) server-side overhead of encrypting every connection and the tedious mixed content warnings that browsers display when HTTPS pages contain references to unencrypted content.
However, with SSL certificates available for cheaper than ever, Firesheep having raised so much awareness, and the prevalent use of unsecured WiFi networks in public settings, many sites have responded by offering SSL encryption for pages on their entire sites.
Unfortunately, this recent proliferation of SSL usage has at least one performance drawback that isn’t necessary obvious at first glance.
HTTPS and mixed content
Different browsers react to mixed HTTP and HTTPS content with varying degrees of severity, with Internet Explorer being the most hostile, but every browser displays some type of warning by default. If you’re curious about how your browser handles the situation, navigate to https://encosia.com.
Visiting my site via HTTPS retrieves the page’s content via a legitimately secured connection, encrypted with a valid SSL certificate. The catch is that its HTML contains insecure references to content such as CSS and images, which triggers the dreaded mixed content warning. The result is especially painful in Internet Explorer:
It’s obvious why avoiding the mixed content warning is an absolute necessity for any serious public-facing site.
An obvious solution
In cases where a given page may potentially be served via either HTTP or HTTPS, path-relative URLs can be used to automatically choose the right protocol for on-site content. For example, references such as
/css/bar.css will automatically adapt to the correct protocol based on what was used to load the underlying document containing them.
Utilizing off-site resources like advertisements, third-party widgets, and CDN hosted libraries can be a bit more frustrating though. Because their fully qualified URLs specify a protocol, it’s easy to run afoul of the mixed content rule when pages are available through both HTTP and HTTPS protocols. Even a single, innocuous HTTP reference is enough to completely break a page served via HTTPS.
Since the inverse case – secure references on an unsecured page – isn’t subject to any obvious penalties, a common solution is to simply use HTTPS exclusively when linking to off-site resources that support it. On the secure pages, the HTTPS reference avoids a mixed content warning, and it still appears to work fine on HTTP pages as well. On sites where a given page might be viewed using either HTTP or HTTPS, assuming HTTPS can be the quickest, easiest remedy to the mixed content problem.
Unfortunately, that approach is burdened by a significant performance drawback on unsecured pages, which may not be readily apparent.
SSL == Super Slow Loading?
Using the secure reference everywhere seems like a workable solution, but there’s a major problem with over-using SSL for cacheable, static resources (such as jQuery). For the same reasons that browsers require those assets to be encrypted in the first place, most browsers default to not caching files to disk if they’ve been retrieved via SSL.
Worse, even if the user has a locally cached copy of jQuery sitting on disk that was requested from Google’s CDN via HTTP, their browser will not utilize that local copy when it encounters an HTTPS reference to the same resource on the same server.
In other words, this URL:
Is entirely different than the following one, as far as a browser is concerned, thus the two are not subject to the sizable cross-caching benefit that comes with using Google’s CDN:
The result is that using HTTPS references to Google’s CDN will result in under-optimized caching when used on regular HTTP pages. Though you must use secure reference on pages that are secure themselves, you should avoid HTTPS references on pages that don’t require them.
Update: As several people have pointed out, the Google CDN does serve its assets with a Cache-Control header that allows most browsers to cache its copies of jQuery to disk. However, that doesn’t help mitigate the cross-site caching issue. A local copy that was originally requested via HTTP cannot be used as a cache hit when the browser later encounters an HTTPS reference to the same file (and vice versa). Two separate copies of the file will be stored and each treated as distinct resources.
A better solution
It’s not exactly light reading, but section 4.2 of RFC 3986 provides for fully qualified URLs that omit protocol (the HTTP or HTTPS) altogether. When a URL’s protocol is omitted, the browser uses the underlying document’s protocol instead.
Put simply, these “protocol-less” URLs allow a reference like this to work in every browser you’ll try it in:
It looks strange at first, but this “protocol-less” URL is the best way to reference third party content that’s available via both HTTP and HTTPS.
On a page loaded through regular, unencrypted HTTP, script references using that URL will be loaded via HTTP and be cached as normal. Likewise, on a secure page that was loaded via HTTPS, script references targeting that protocol-less URL will automatically load the script from Google’s CDN via HTTPS and avoid the mixed content warning.
Thus, using the protocol-less URL allows a single script reference to adapt itself to what’s most optimal: HTTP and it’s full caching support on HTTP pages, and HTTPS on secured pages so that your users aren’t confronted with a mixed content warning.
I probably could have boiled this post down to a couple of sentences, but I hope you found the underlying “why” useful. Just saying that SSL resources aren’t cached isn’t nearly as interesting as understanding why, and how that relates to the mixed content issue.
Perhaps a big part of the problem is that the Google AJAX Libraries developer guide was recently updated to list only HTTPS URLs. Anyone who copy/pastes one of those URLs into their script reference without knowing better will be impacted by the loss of disk caching and cross-site caching, which is unfortunate. If anyone reading this has the right contacts, it would be great if that page could be updated to display the protocol-less URL for all of those libraries instead.
I hope if you see someone using a fixed HTTPS reference to a resource like jQuery on the Google CDN, you’ll point them here and hopefully help speed up the web just a tiny bit for everyone.
9 Mentions Elsewhere
- Cripple the Google CDN’s caching with a single character « Laboratory B
- Tweets that mention Cripple the Google CDN’s caching with a single character - Encosia -- Topsy.com
- Links for 2011-01-22 — Business Developer Talk
- Cutting the Google Analytics Script in Half « Of code and color
- Compilado de enlaces programación « Programación – por droope
- Protocol-less URLs | Sure Fire Web Services Inc.
- Coolest web development trick I’ve learned in a long time | Chainsaw on a Tire Swing
- What’s the Protocol? – Code Thug
- Massive Startup Resource List – 1000+ links | Justin Neilon