6,953 reasons why I still let Google host jQuery for me
JavaScript, jQuery, Performance. By Dave Ward. Updated December 16, 2010
It’s been nearly two years since I wrote about using Google’s CDN to host jQuery on your public-facing sites. In that post, I recommended it due to three primary benefits that public CDNs offer: decreased latency, increased parallelism, and improved caching.
Though the post has been overwhelmingly well-received, concerns have been raised as to whether or not the likelihood of better caching is truly very significant. Since the efficacy of that benefit depends entirely on how many other sites are using the same CDN, it takes quite a bit of research to make an objective case either way.
I’ve never been happy about responding with vague answers. Caching probability is a valid concern and deserves to be taken seriously. So, I decided to cobble together an HTTP crawler, analyze 200,000 of the most popular sites on the Internet, and determine how many of those are referencing jQuery on Google’s public CDN.
Methodology

Having an idea of how many sites reference jQuery on Google’s CDN and how popular those sites are should lead to a more objective decision, one way or the other. Just a handful of top-ranked sites can prime as many browser caches as thousands of more obscure sites. Conversely, heavy coverage across a long-tail of moderately popular sites has the potential to prime just as many caches.
To measure the CDN’s coverage and how that coverage varies with site popularity, I decided to use Alexa as my source of sites to analyze. Alexa is far from perfect, but they make a free CSV file of their top ranked sites available and the aggregate across 200,000 sites smooths out most of Alexa’s issues.
To determine which of those sites use a public jQuery CDN, my crawler ran down Alexa’s list and downloaded the document at each site’s root. Then, I logged any script element with a src attribute that contained the word “jQuery”.
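The logging step described above can be sketched in a few lines. This is only an illustration, not the crawler's actual code: the function name `findJQueryScriptRefs`, the regular expression, and the sample markup are all assumptions, and a regex this naive will miss unquoted attributes and scripts injected at runtime.

```javascript
// Hypothetical helper: collect the src values of <script> tags whose
// src mentions "jquery" (case-insensitive), mirroring the wide net
// described in the text.
function findJQueryScriptRefs(html) {
  // Capture the quoted src attribute of each script tag.
  const scriptSrc = /<script\b[^>]*?\bsrc\s*=\s*(["'])(.*?)\1/gi;
  const refs = [];
  let match;
  while ((match = scriptSrc.exec(html)) !== null) {
    const src = match[2];
    if (/jquery/i.test(src)) refs.push(src);
  }
  return refs;
}

// Made-up sample document for illustration only.
const sample = [
  '<script src="http://ajax.googleapis.com/ajax/libs/jquery/1.4.2/jquery.min.js"></script>',
  "<script type='text/javascript' src='/js/site.js'></script>",
  '<script src="/js/jQuery.cookie.js"></script>',
].join("\n");

console.log(findJQueryScriptRefs(sample));
```

Logging every match rather than only CDN URLs is what makes the later ad-hoc queries possible, since the filtering by host or file name can happen after the crawl.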
Inaccuracies
I’ll be the first to admit that my approach is fraught with inaccuracies:
- Alexa – Alexa itself isn’t a great ranking mechanism. It depends on toolbar-reported data and individual rankings must be taken with a grain of salt. However, I believe that aggregate trends across its top 200,000 sites represent a useful high-level view.
- HTTP errors – About 10% of the URLs I requested were unresolvable, unreachable, or otherwise refused my connection. A big part of that is due to Alexa basing its rankings on domains, not specific hosts. Even if a site only responds to www.domain.com, Alexa lists it as domain.com and my request to domain.com went unanswered.
- jsapi – Sites using Google’s jsapi loader and google.load() weren’t counted in my tally, even though they do eventually reference the same googleapis.com URL. Both script loading approaches do count toward the same critical mass of caching, but my crawler’s regex doesn’t catch google.load().
- Internal usage – It’s not uncommon for sites to pare their landing pages down to the absolute bare minimum, only introducing more superfluous JavaScript references on inner pages that require them. Since I only analyzed root documents, I undercounted any sites taking that approach and using the Google CDN to host jQuery on those inner pages.
At first, that may seem like an awful lot of potential error. However, the one thing all of these inaccuracies have in common is that none of them favor the case for using a public CDN. Playing the averages, I expect that the actual usage numbers are at least 10% higher than what I found.
So, in terms of making a case for the CDN, this analysis is extremely conservative.
Analysis
By casting a wide net with the regex and logging any script reference that contained the word “jQuery”, I was able to construct ad-hoc queries to answer a variety of questions. For example, how many top 200,000 sites use the Google CDN to host jQuery UI for them?
SELECT COUNT(*) FROM Results WHERE Reference LIKE '%googleapis%jquery-ui.min.js'
Answer: 989
Want to know how many top 1,000 sites use the Microsoft CDN for any jQuery-related script?
SELECT COUNT(*) FROM Results WHERE Reference LIKE '%ajax.microsoft%jquery%' AND Rank <= 1000
Answer: 1 (Microsoft.com)
My findings
Without further ado, across the 200,000 sites that I analyzed, this is what I found:
- 47 of the Alexa top 1,000 include a Google CDN reference.
- 99 of the Alexa top 2,000 reference jQuery on the Google CDN.
- 6,953 of the top 200,000 sites include a script reference to some version of jQuery hosted on Google’s CDN.
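To put the counts above on a common scale, they can be converted to shares of each tier. The counts are the ones reported in the list; the variable and function names are just for illustration, and the denominators are the nominal tier sizes, not the number of sites that actually responded.

```javascript
// Counts come from the findings above; the percentages are derived.
const findings = [
  { tier: "top 1,000", sites: 1000, googleCdn: 47 },
  { tier: "top 2,000", sites: 2000, googleCdn: 99 },
  { tier: "top 200,000", sites: 200000, googleCdn: 6953 },
];

// Share of sites in a tier that reference the Google CDN, in percent.
function coveragePercent(row) {
  return (row.googleCdn / row.sites) * 100;
}

for (const row of findings) {
  console.log(row.tier + ": " + coveragePercent(row).toFixed(2) + "%");
}
```

That works out to roughly 4.7%, 4.95%, and 3.48% respectively. Since about 10% of requests failed, the true shares among reachable sites are somewhat higher than these figures.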
Just within the top thousand or so sites, I found the Google CDN being used to host jQuery for very high-traffic sites including Twitter, TwitPic, SlideShare, Break, Stack Overflow, Woot, Posterous, SitePoint, Foursquare, FAIL Blog, Stanford, and the jQuery site itself. These sites alone are priming tens, if not hundreds, of millions of browser caches with the Google CDN’s copy of jQuery.
Not only that, but popular sites using the Google jQuery CDN span a diverse range of genres and demographics. While I might theorize that a minority of my readers are also regular Break.com and Foursquare visitors, I cannot possibly make that claim for Twitter and Stack Overflow. A non-trivial amount of my traffic is referred directly from those sites, and enjoys a no-HTTP-request cache hit for the jQuery reference here on my site.
In fact, I found that for almost any genre a site falls within, there’s at least one site near the top of Alexa’s rankings that uses Google’s jQuery CDN, priming caches for all of the smaller sites in that niche.
Disproving a theory
Going into this, I expected that usage of a public CDN would become more common further down Alexa’s rankings, among less popular sites. That wouldn’t necessarily be desirable, since high-traffic sites referencing the CDN improve the caching situation for all of us, but I thought it likely.
Since most large-scale sites already host static assets on CDNs, I reasoned that they would be less likely to use a shared, public CDN like Google’s. On the other hand, I thought that smaller sites would be more eager to take advantage of the free CDN service, which has become easy for even non-technical site owners to implement.
However, what I found was a nearly dead-even distribution across the 200,000 sites I sampled. There were some variations, but it appears that larger sites are just as happy to use Google’s bandwidth as anyone. This is a great result for the rest of us. When popular sites like Twitter, StumbleUpon, and Stack Overflow seed their myriad users’ caches, it’s more likely that smaller sites will benefit from a no-HTTP-request jQuery load.
Issues
I’m optimistic about the results of my research, but the analysis did reveal some issues that can’t be ignored. I hope identifying these sore spots can raise awareness and eventually improve the situation.
Version fragmentation
One obstacle in the way of optimal caching is that sites must reference exactly the same CDN URL in order to obtain the cross-site caching benefit. Thankfully, jQuery tends to quickly settle on a stable version after each major release, and that version is relatively long-lasting.
Unfortunately, I did find a handful of sites still referencing odd versions of jQuery, such as 1.3.0 and 1.4.1. That mistake wasn’t very common, but even one popular site referencing an odd version of jQuery is one too many.
The most notable offender is Twitter. For reasons I can’t fathom, their jQuery reference is to 1.3.0. I assume that’s being updated to 1.4.2 as part of the #newtwitter revamp currently underway, but I was surprised to find that reference on a site under the stewardship of so many developers.
The takeaway here is to keep your site’s CDN reference updated to the latest compatible version of jQuery. Even if you have some legitimate reason to avoid the upgrade from 1.3 to 1.4, at least be sure that you’re referencing 1.3.2.
Specificity is crucial
After fragmentation, the next most common mistake I found was using the “latest version” references that some CDNs offer. The “latest version” reference allows you to request either version 1 or 1.x and automatically receive the latest matching 1.x.y version.
For example, at the time of writing, this reference returns jQuery 1.4.2:
http://ajax.googleapis.com/ajax/libs/jquery/1/jquery.min.js
And, this reference returns jQuery 1.3.2:
http://ajax.googleapis.com/ajax/libs/jquery/1.3/jquery.min.js
You should never do this in production.
In order to ensure that references to a “latest version” remain current when jQuery is updated, these requests are necessarily served with a very short expires header. Not only does this break the Internet-wide caching benefit between your site and others, the “latest version” reference’s short expires header likely even makes the CDN less optimal than serving jQuery from your own site!
Worse, you’re giving a third party permission to transparently change one of your site’s fundamental dependencies without your approval or interaction when a jQuery update occurs. This is not a good idea.
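One way to catch this mistake is to check whether a reference names a full three-part version. The following is a minimal sketch that assumes the Google CDN URL layout shown in the examples above; the function name `describeJQueryRef` and its return shape are made up for illustration.

```javascript
// Hypothetical check: does a Google CDN jQuery URL pin a full x.y.z version?
function describeJQueryRef(url) {
  // Matches .../ajax/libs/jquery/<version>/jquery.js or jquery.min.js
  const pattern = /\/ajax\/libs\/jquery\/(\d+(?:\.\d+){0,2})\/jquery(?:\.min)?\.js$/;
  const match = pattern.exec(url);
  if (!match) return { recognized: false };
  const version = match[1];
  // "1" and "1.3" are "latest version" aliases; "1.4.2" is a pinned release.
  const pinned = version.split(".").length === 3;
  return { recognized: true, version: version, pinned: pinned };
}

console.log(describeJQueryRef("http://ajax.googleapis.com/ajax/libs/jquery/1/jquery.min.js"));
console.log(describeJQueryRef("http://ajax.googleapis.com/ajax/libs/jquery/1.3/jquery.min.js"));
console.log(describeJQueryRef("http://ajax.googleapis.com/ajax/libs/jquery/1.4.2/jquery.min.js"));
```

Only the last of those three references is pinned. A check like this could run as part of a build or a template lint step, so that a partial version never reaches production unnoticed.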
A notable offender here is jQuery.com itself. The site currently references the Google CDN for jQuery, but unfortunately references the “latest” 1.4 version instead of 1.4.2 specifically. Not only is that slower than necessary for repeat jQuery.com visitors, but imagine how many browser caches they could be priming if they were referencing 1.4.2 specifically!
The Microsoft CDN
Since I’m a fan of public CDNs, I was happy to see Microsoft start hosting MicrosoftAjax.js and the now-defunct ASP.NET Ajax Library on their CDN. However, I’m disappointed to see them pushing it as a solution for hosting jQuery and jQuery UI for two reasons:
- Cookies – Because Microsoft’s CDN falls under the Microsoft.com domain, every request to it needlessly includes the plethora of cookies that other Microsoft.com subdomains set. In my case, this weighs in at about 3kb of superfluous cookie data that must be transmitted along with every request to the Microsoft CDN.
- Popularity – Far fewer public-facing sites use the Microsoft CDN to serve jQuery for them: I found only 49 sites in the entire top 200,000 that reference Microsoft’s copy of jQuery. We can speculate endlessly about why that is, but the “why” is unimportant. The Google CDN has such a vast caching advantage at this point that using the Microsoft CDN for jQuery is a needless performance penalty.
I have friends at Microsoft and hesitated to point out these issues with using Microsoft’s CDN for jQuery and jQuery UI. Moreover, I do commend them for providing the Microsoft-specific scripts on a public CDN.
Ultimately though, I’d be remiss not to mention these drawbacks, since so many .NET developers seem eager to use Microsoft’s CDN out of misplaced brand loyalty. It’s a shame for .NET developers to unwittingly contribute to the aforementioned fragmentation issue, while simultaneously missing out on the caching benefit that Google’s more popular CDN offers.
Conclusion
If you’re using jQuery on a public-facing site, use the Google CDN to host it. This is not simply a theoretically good idea, but is objectively, quantitatively justified. Better yet, the likelihood of a cache hit is only growing.
What do you think?
Excellent researching on the topic Dave. I would’ve liked a snazzy 3D pie chart to see the numbers visually – we likes eye candy we does. :)
I’m banging this drum whenever there’s anyone who might have the remotest interest and having your blog post to refer to certainly helps.
Thank you
Bernhard
Did you just validate everything and forget that the Google CDN is inaccessible in some countries? Like, I hear, Iran?
It’s easy enough to mitigate those blocked regions, overzealous firewalls, NoScript users, etc with a fallback technique, like this one: http://weblogs.asp.net/jgalloway/archive/2010/01/21/using-cdn-hosted-jquery-with-a-local-fall-back-copy.aspx
Or, the post that Mark linked in the next comment.
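For readers who don't want to follow the links, the general shape of such a fallback is small. What follows is a hedged sketch of the common pattern, not the code from either linked post: the helper name `ensureJQuery` and the local path are assumptions, and the check is meant to run in a script block placed directly after the CDN script tag, while the page is still being parsed.

```javascript
// Hypothetical local path; point it at your own copy of jQuery.
const LOCAL_JQUERY = "/js/jquery-1.4.2.min.js";

// If the CDN copy loaded, window.jQuery exists and nothing happens.
// Otherwise, write a script tag for the local copy into the page.
// Returns true when the fallback was used.
function ensureJQuery(win, doc, localPath) {
  if (win.jQuery) return false;
  // The closing tag is escaped so an inline script block is not ended early.
  doc.write('<script src="' + localPath + '"><\/script>');
  return true;
}

// Only runs in a browser; elsewhere the function is merely defined.
if (typeof window !== "undefined" && typeof document !== "undefined") {
  ensureJQuery(window, document, LOCAL_JQUERY);
}
```

Because `document.write` is only safe during parsing, this check has to run inline rather than after the page has loaded. Passing `win` and `doc` in as parameters is purely a convenience for exercising the logic outside a browser.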
Hey, cdn blocked iran but jquery loads very well…
Nice to see these figures but I wonder whether we are relying too much on Google and thereby introducing a single point of failure.
I wrote about how to mitigate against this in this blog post :
http://happyworm.com/blog/2010/01/28/a-simple-and-robust-cdn-failover-for-jquery-14-in-one-line/
Mark
Tremendous thanks to you for providing such useful information, ESPECIALLY why NOT to use the http://ajax.googleapis.com/ajax/libs/jquery/1/jquery.min.js reference to the latest version, e.g.
· Short expires header likely makes the CDN less optimal than serving jQuery from your own site
· Giving a third party permission to transparently change one of your site’s fundamental dependencies without your approval or interaction when a jQuery update occurs
· Breaks the Internet-wide caching benefit between your site and others, i.e. not priming other browser caches
Intranets. That’s the reason why I can’t reference the CDNs. Anyone know any workarounds or solutions for this?
If your app is deployed on an Intranet and consumed across an internal network, I’d say ignore the CDN. Using a public CDN like Google’s is usually only optimal on public-facing sites.
Nice to have an objective opinion on this. Meanwhile Google continues to contradict itself on Page Speed. Minimise DNS look ups AND use a CDN for static content.
@Dave in the Intranet case… what about multiple web applications? What is the best way to set up our own CDN? Right now I have my own CDN on my Intranet, but what are the best practices?
Thanks in advance
You can definitely do well by centralizing common scripts. Your internal users will gain the same cross-”site” caching benefits if you have several high-traffic web apps that use the same scripts. In that environment, you don’t even need to limit yourself to just jQuery, since you know which apps share which scripts and to what extent they do.
It’s still important to set a far-future expires header and be sure HTTP compression is enabled though (mostly the expires header).
Beyond that, the specifics would depend on your network topology.
I’ve found that not using a CDN, but concatenating all JavaScript into a single script load, is by far a better way to serve JavaScript. It doesn’t matter if you use a CDN if you are loading 20 script files. If you load only one script file with all your scripts minified and gzipped, from your own server, then you will be saving a lot of round trips and your load time will decrease.
Actually, using an incomplete version number isn’t as bad as you make out. It does involve a round trip every hour, but it’s only an If-Modified-Since request so the browser’s cached copy is still used. And if you specify e.g. 1.4 rather than 1.4.2 then you should get fixes and performance improvements automatically without risking a breaking API change, which for most sites could be considered a good thing :)
I’d like to see google actually push down the most popular libraries while the browser is idle. For example, on a search results page, perhaps on scroll, start downloading. On Youtube, when a video is fully buffered, start downloading.
I am of the opinion that there should be a way for the web development community to work with browser vendors to guarantee that popular AJAX libraries are already available locally. Whether this would be something like root SSL certificate authorities, or more like the way doctype definition URLs were supposed to work, is something that vendors would have a better idea about than I, but that doesn’t mean the idea is solid… then for users of good browsers, your jQuery would have a 100% cache hit rate!
Mark, there is still an HTTP request made to check if the browser has the most recent version. It’s not file size we are concerned about here, it’s the request count.
@Alan, it does indeed involve one extra HTTP request per hour, but I would still posit that this is not as much of a big nasty as most blog posts on this subject tend to imply.
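The disagreement in this exchange comes down to when a browser contacts the server at all. Here is a deliberately simplified model of that decision. The one-hour and one-year lifetimes are assumptions taken from figures mentioned in this discussion, not measured header values, and real browser caches apply more rules than this.

```javascript
const ONE_HOUR = 60 * 60;            // seconds
const ONE_DAY = 24 * ONE_HOUR;
const ONE_YEAR = 365 * ONE_DAY;

// Simplified model: a cached copy younger than its freshness lifetime is
// used with no network traffic at all; an older copy triggers a conditional
// request, which the server can answer with 304 Not Modified (no body).
function nextLoad(ageSeconds, lifetimeSeconds) {
  return ageSeconds < lifetimeSeconds ? "no request" : "conditional request";
}

// A copy cached one day ago, under the two assumed lifetimes.
console.log("pinned 1.4.2 (assumed one-year lifetime):", nextLoad(ONE_DAY, ONE_YEAR));
console.log("partial 1.4 (assumed one-hour lifetime):", nextLoad(ONE_DAY, ONE_HOUR));
```

Under these assumptions, the pinned reference costs nothing on a repeat visit, while the partial reference costs one round trip whenever the copy is more than an hour old. The response body is not downloaded again in either case.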
jQuery seems to load quicker off our CDN than Google Directly. Could have something to do with us being in Australia though.
Dave,
Like Mark Boas above, I think you should do an article on jQuery CDN failover techniques. I did my own thing and it’s been working great:
http://jonathan-oliver.blogspot.com/2010/09/jquery-cdn-failover.html
But I’d hope that you’d have some additional insights.
@Dave Thanks so much for your contribution!
Encosia simply rocks!
Great article. I’m interested in your webcrawler. In which language did you implement it and are you sharing the code?
The crawler is a C# console app that logs to a SQL Server Express database. It’s primitive and the source isn’t in a state that I’d like to publicly share (yet), but I’d be happy to send you a copy of it if you want to run it yourself.
Yeah that would be great. Please send it to the e-mail I used to subscribe to these comments. Thanks a bunch.
Another benefit is reduced bandwidth cost!
Interesting stats & advice.
I’m also curious to know how many of those top 200,000 sites used jQuery in any form
That’s more difficult to determine accurately. I can query for specific file names or patterns, but I can’t detect whether or not a site is including jQuery in a combined bundle.
To get an accurate count, I’d ideally need to load the pages into a browser engine and test to see if the jQuery object is defined or not after window.onload.
I don’t know exactly how they collect their statistics or how accurate they are, but BuiltWith does have some data on jQuery adoption: http://trends.builtwith.com/javascript/JQuery
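The in-browser test mentioned above could look something like the following. It is a sketch only: the function name `detectJQuery` is made up, and it relies on the version string that jQuery exposes as `jQuery.fn.jquery`. Running it across thousands of sites would still need a browser engine to load each page first.

```javascript
// Returns the jQuery version string exposed on the page, or null if
// no jQuery object is defined.
function detectJQuery(win) {
  const jq = win.jQuery;
  if (typeof jq !== "function" || !jq.fn) return null;
  return typeof jq.fn.jquery === "string" ? jq.fn.jquery : "unknown";
}

// Only runs in a browser: report once the page has finished loading.
if (typeof window !== "undefined") {
  window.addEventListener("load", function () {
    console.log("jQuery version:", detectJQuery(window));
  });
}
```

Unlike matching file names, this catches jQuery even when it has been concatenated into a combined bundle, because it inspects the page after its scripts have run.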
OK, but how about code.jquery.com?
I believe the underlying hosting for code.jquery.com is either Media Temple or EdgeCast, which are both fast services. However, the cross-site caching advantage all these thousands of sites give Google’s CDN is hard to beat. Once you get to the point of rarely needing to download the file, the speed of the CDN takes a back seat to its popularity.
This is a great and interesting research! Thank you :)
Do you know about Cached Commons http://cachedcommons.org which uses GitHub as a CDN?
In a comment, Lance Pollard told me
“Doing a quick test in Google Chrome, reloading the Google Ajax API jQuery 1.4.2 min and Cached Commons’ jQuery 1.4.2 min (copy-pasted from their version, so they’re the same), the Google Ajax API version took ~90-95ms to load, Cached Commons took ~60-75ms! ”
From – http://viatropos.com/blog/github-as-a-cdn/
I’d be interested in your view.
Thanks
~20ms wouldn’t be worth sacrificing the cross-site caching advantage Google’s CDN has at this point (with all these thousands of sites pre-caching jQuery for you). I also noticed that the Cached Commons copy of jQuery is only served with a +1 day expires header, which is too short.
I know I’m a bit late to the party here, but something else I noticed when I was looking at this before is that Microsoft apparently uses their own internal JS crusher instead of mirroring the official jQuery that’s been crushed with Closure Compiler. Take a look at this URL http://ajax.microsoft.com/ajax/jquery/jquery-1.4.2.min.js and then this URL http://ajax.googleapis.com/ajax/libs/jquery/1.4.2/jquery.min.js and you will see that the responses are clearly different. Between their superfluous response headers (which include a useless P3P header as well as an indication that it’s being served by ASP.NET) and their poor recompression of the code, the size of the response is 31,143 bytes vs Google’s 24,678 bytes. That makes Microsoft’s response about 26% larger for something that you’d expect to be the same!
I’m a fan of Microsoft’s development tools because of the quality of workmanship, but I can’t say the same thing in this case. It makes me sad that they didn’t pay attention to these details.
@Andrew Mattie – this is a very good catch :)
The NEW Twitter seems to be combining scripts and using its own CDN now. http://a1.twimg.com/a/1287512878/javascripts/phoenix.bundle.js
I looked quite closely into the relative performance of the MS and Google CDNs in the UK for jQuery.
My conclusions were:
* They’re both ~20ms away from us here (i.e. they’re served in London)
* They can both establish a TCP connection very quickly (one RTT).
* MS has an extremely short HTTP keep alive time, so you’re much less likely to avoid the cost of opening a new TCP connection than with Google.
* The MS server is slow serving on a freshly opened TCP connection
* Both servers performed *very* similarly on warmed-up connections if you avoid sending the MS cookies.
* The MS JQuery file is considerably bigger than the Google one (mentioned above)
* The 3K of MS.com cookies you will probably have to send with the request will take as long to send as the 30K response takes to receive on a typical 10:1 ADSL line
* Although it’s much more subjective, I have the feeling that there are more bad outliers (multiple seconds) on the MS CDN.
I have discussed some of this with the CDN folk @ MS, who have been interested but only just now seem to be appreciating the cookie issue despite people writing about it for months.
Unsurprisingly, we’re using the Google CDN. MS also seem to have lost interest in providing the vsdoc files for the last couple of versions of jQuery, which is depressingly exactly what I thought would happen a couple of years after all the initial fanfare.
One aspect of loading dependencies off the Google / Microsoft CDNs is that you’re effectively giving them a very accurate way of measuring how much traffic you get.
If you’re building a site that’s never going to get acquired by GOOG / MSFT, or in no way competes with them, then by all means, DO use their CDNs … if you are, you probably don’t want to share accurate traffic data with GOOG / MSFT.
That’s a common concern about using their CDN (and it’s wise to be mindful about privacy), but it’s a misconception.
The far-future expires header means that they receive far, far fewer requests than the total number of page views containing a reference to the CDN. After the first request, a given browser can conceivably never make another request for that file for an entire year, regardless of how many CDN references it encounters. With a far-future expires header, there isn’t even an HTTP request to check for a 304 “not modified” state.
Even if they are trying to use the CDN to collect metrics, that spotty traffic data would be impossibly inconsistent in terms of estimating any given site’s traffic. Better yet, the more sites that use the CDN, the more that unpredictability would be magnified.
Agreed, the data would be spotty at best.
However, I am currently contracting at one of Google’s competitors, and let’s just say that we’re very mindful of not leaking sensitive data, even if it is grossly inaccurate ;)
The only problem I find with using google’s CDN is just that it’s slow. On all the pages I’ve ever used it on, the load of jquery or prototype from there is usually the single slowest element on the page (at least according to Safari’s inspector, page speed and yslow). I can usually better than halve the time by hosting it myself. I’m aware that this may not always be the case, and that google’s hosting is probably more reliable than mine (not that gdocs sets a good example on that front), especially under load, but still, that’s what I’m seeing.
There are a couple of additional considerations when using these CDNs:
* Overhead of establishing a new connection
* Effect of TCP slow start
These will obviously be negated by the increasing probability of a cache HIT; however, if the latency to your own domain is similar, downloading from these CDNs will invariably be slower.
This was covered during the latest velocity conference and summarized here: TCP and the Lower Bound of Web Performance
One point in favour of Microsoft’s CDN is that it includes a broader collection of jquery plugins than Google does. For example, jquery’s template plugin is not available on Google’s CDN.
@Richard Stephens,
CDNJS at http://www.cdnjs.com/ aims to host any popular javascript libraries. While I don’t see the jQuery template plugin there yet, they encourage you to fork & pull request their collection — it sounds like it would be a good fit, and a great project. [I am not affiliated; I just heard about them on a podcast.]
I think it’s best to consider other sites that your visitors are likely to visit, look at what CDN URL they are referencing, and do the same. I reference the same URL as my main competitor because it is likely that users have visited their site.
Also if you have a specific source of traffic that references CDN hosted jquery, look at theirs and reference the same one. That’s what we do at Flatmate Rooms.
Also another reason for fragmentation of cache hits is https vs. http.
My opinions on using a public CDN (namely google’s) for jQuery have just changed thanks to this article. I used to liken it to leeching off another’s server and not taking care of your own responsibility… now I see it’s much better for all of us.
Awesome thoughts, awesome article, awesome result.
Thank you much, sir!
Ryan
For what it’s worth, Microsoft moved their CDN to its own domain name ajax.aspnetcdn.com so it wouldn’t require *.microsoft.com cookies: “The CDN used to use the microsoft.com domain name and has been changed to use the aspnetcdn.com domain name. This change was made to increase performance because when a browser referenced the microsoft.com domain it would send any cookies from that domain across the wire with each request. By renaming to a domain name other than microsoft.com performance can be increased by as much to 25%. Note ajax.microsoft.com will continue to function but ajax.aspnetcdn.com is recommended.” – from http://www.asp.net/ajaxlibrary/cdn.ashx
I prefer to use Google’s, but because I sometimes develop on the road (i.e. offline) I always use this code (which loads Google’s jQuery, if it can’t then it loads the local one):
res/js/jquery-1.9.1.min.js is the path of the jQuery file relative to index.php.