A few days ago, this video hosted by metacafe popped up on digg, explaining how to reduce site download times by tweaking your browser settings to increase connection parallelism. To explain why this works, let's step back a bit and discuss how browsers manage server connections.
In building any application, developers are often required to make ‘utilitarian’ choices. Pretentiously paraphrasing Jeremy Bentham, ‘utilitarian’ describes an approach that ‘does the greatest good for the greatest number.’ Many times, sacrifices in performance are made for a subset of users so that the average expected performance of all users will be better.
Since all browsers were originally developed for a time when the vast majority of users were on bandwidth-constrained dial-up links, it made sense to restrict users to a small number of connections. The overhead of juggling many connections over dial-up makes it difficult for progress to be made servicing any of the individual requests. Also, web servers and proxy servers in that era were less robust, so keeping the per-browser connection pool small reduced the risk of overwhelming the network infrastructure.
To find a decent balance, IE and Firefox by default restrict users to 6 connections total and 2 connections per host for HTTP 1.1 connections. HTTP 1.0 is a slightly different story, but the benefits of persistent connections mean that you should be (and probably are) using HTTP 1.1 anyway.
Of course, in the real world, these utilitarian decisions have a tendency to rot, outliving the circumstances that made them relevant. Today, the majority of users have broadband connections, so client-side bandwidth is no longer the gating factor. Instead, the latency of retrieving an individual object is dominated by the time required to set up a connection and send the request. By increasing the number of concurrent connections, we can parallelize that cost and churn through the list of pending objects more quickly, reducing user-perceived download time. "Lightning fast," to echo the hyperbole of the metacafe clip.
Unfortunately, relying on your users to modify their browsers is not a reasonable optimization strategy, so what can the concerned developer do to tap into these gains?
Most sites serve everything from one host, so requests are forced to share the 2 connections to that host. One effective strategy for improving parallelism is to spread your content across multiple hosts. This is not hard to do, since browsers enforce the limit per hostname, not per IP address. That is, images1.yoursite.com and images2.yoursite.com would each be allowed 2 connections, even if they resolve to the same server.
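To sketch the idea (the hostnames and the hashing scheme here are my own illustration, not from any particular framework), a small helper can map each asset path to a stable subdomain, so the same asset always comes from the same host and stays cacheable:

```ruby
# Illustrative helper: spread asset URLs across N image subdomains.
# "yoursite.com" and the byte-sum hash are stand-ins; adapt to your setup.
IMAGE_HOSTS = 2

def sharded_image_url(path)
  # Hash the path so a given asset always maps to the same host,
  # keeping browser and proxy caches effective across page views.
  idx = (path.sum % IMAGE_HOSTS) + 1
  "http://images#{idx}.yoursite.com#{path}"
end

puts sharded_image_url("/images/logo.png")
```

Any deterministic function of the URL works here; the important property is that the mapping is stable, so repeat views hit the browser cache instead of re-downloading from a different host.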
As a practical example, check out this sample application Buddy and I built for our workshop at Web Builder 2.0. By default, the album thumbnails are all loaded from the parent host, so they all share two connections. Below is a sample 'waterfall' chart of the objects loaded for this page, collected by Gomez's external performance testing service.
The full version of the chart is available behind the link.
You can see from the graphic that only two connections (C0 and C2) are opened for musicstore.ajaxperformance.com. We are using HTTP 1.1, so we do not need to open a separate connection for each image, but we are still spending the majority of our time servicing object requests. The object times are dominated by the cost of requesting the images (the blue first byte time) and very little time is spent downloading the content (the red content time).
To improve performance, we created CNAMEs for images1.ajaxperformance.com, images2.ajaxperformance.com, and images3.ajaxperformance.com, all of which point back to our main host. This small snippet in our Rails album layout code spread each album image across a different host:
    img_url = "/images/album_art/#{album.id}.jpg"
    if @perf_multiplex_images
      idx = (album.id % 3) + 1
      img_url = "http://images#{idx}.ajaxperformance.com/images/album_art/#{album.id}.jpg"
    end
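For reference, those CNAMEs amount to DNS entries along these lines (a sketch only; the BIND-style record syntax and the target host shown here are assumptions, and your DNS provider's interface may differ):

```
images1  IN  CNAME  musicstore.ajaxperformance.com.
images2  IN  CNAME  musicstore.ajaxperformance.com.
images3  IN  CNAME  musicstore.ajaxperformance.com.
```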
You can see the results by hitting the updated version. First-load performance should be much better. As the waterfall below shows (again, the full chart is available behind the link), we are now using 6 connections to grab our images.
So what did that buy us? A lot, given the relatively simple change. Below is a head-to-head comparison of total page load time, collected over 24 hours from 8 locations around the country.
The average load time when using 2 connections is 7.919 seconds. The average load time when using 6 connections is 4.629 seconds. That’s a greater than 40% drop in page load time. This technique will work anywhere that you have a large block of object requests currently served by one host.
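A quick sanity check on that figure, computed from the two measured averages:

```ruby
# Verify the reported improvement from the measured average load times.
two_conn = 7.919  # seconds, default 2 connections
six_conn = 4.629  # seconds, sharded across 6 connections

drop = (two_conn - six_conn) / two_conn
puts "#{(drop * 100).round(1)}% reduction"  # => 41.5% reduction
```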
There is plenty of precedent for this approach in real world Ajax apps. To exploit connection parallelism, the image tiles at Google Maps are served from mt0.google.com through mt3.google.com. Virtual Earth also uses this technique.
You can also use this connection management approach to sandbox the performance of different parts of your application. If you have page elements that require database access and may be slower than static objects, keep them from clogging up the 2 connections used for image content by putting them on a subdomain. This trick won't dramatically improve the total load time of your page, but it can significantly improve perceived performance by allowing static content to load unfettered.
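The sandboxing idea can be sketched in the same spirit as the album-art snippet (the hostnames and helper below are hypothetical, for illustration only): route latency-prone dynamic requests to their own subdomain so they queue on their own pair of connections.

```ruby
# Hypothetical helper: keep slow, database-backed requests on a separate
# subdomain so they don't occupy the two connections used by static assets.
STATIC_HOST  = "images1.yoursite.com"
DYNAMIC_HOST = "data.yoursite.com"

def url_for_asset(path, dynamic: false)
  host = dynamic ? DYNAMIC_HOST : STATIC_HOST
  "http://#{host}#{path}"
end

puts url_for_asset("/images/header.png")
puts url_for_asset("/search/results", dynamic: true)
```

With this split, a slow search request can sit waiting on data.yoursite.com while images continue streaming in over the static host's connections.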