Ajax Performance
A blog by Ryan Breen of Gomez
The WebKit team makes the case for preloading
March 24, 2008 on 7:13 am | In ajax, http | No Comments
Over at Surfin’ Safari, Antti Koivisto explains the preloading features in the latest WebKit nightlies. Antti begins by documenting the dominance of latency in determining total page load time, focusing on the slowdown caused by the blocking behavior of modern browsers while handling external scripts. As we’ve discussed here in the past, this has the effect of serializing object loads, resulting in a total page load time that increases linearly with network latency.
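The linear relationship is easy to see in a rough model. The sketch below is a simplification and is not taken from Antti's post: it assumes every object costs one round trip plus a fixed transfer time, and that objects are fetched in batches of `parallel` at a time. The function name and its parameters are hypothetical.

```javascript
// Rough model (an assumption, not a measurement): each object costs one
// round trip (latencyMs) plus transferMs, fetched `parallel` at a time.
function estimateLoadTimeMs(objectCount, latencyMs, transferMs, parallel) {
  const batches = Math.ceil(objectCount / parallel);
  return batches * (latencyMs + transferMs);
}

// Fully serialized loading (parallel = 1): every extra millisecond of
// latency is paid once per object.
const serializedLow = estimateLoadTimeMs(10, 50, 20, 1);   // 10 * 70
const serializedHigh = estimateLoadTimeMs(10, 100, 20, 1); // 10 * 120

// With 5 objects in flight at once the same latency is paid only twice.
const parallelHigh = estimateLoadTimeMs(10, 100, 20, 5);   // 2 * 120
```

In this model the serialized page goes from 700 ms to 1200 ms when latency doubles, while the parallel case stays at 240 ms.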
The new preloading feature available in WebKit nightlies attempts to maintain network parallelization even while the parser is blocked waiting for an external script to load. To achieve this, a separate parser is created to move through the remainder of the page, queuing up any additional objects to load. Scripts and stylesheets are also moved to the head of the queue of pending objects.
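To make the idea concrete, here is a minimal sketch of what a preload scan could look like. It is not WebKit's implementation: a real browser uses a proper tokenizer rather than regular expressions, and the function name and result shape are made up for illustration. It scans the unparsed remainder of a page for external resources and puts scripts and stylesheets ahead of images.

```javascript
// Hypothetical preload scan: collect external resource URLs from the
// not-yet-parsed HTML and order them so scripts and stylesheets come first.
function scanForPreloads(remainingHtml) {
  const found = [];
  const tagPattern = /<(script|link|img)\b([^>]*)>/gi;
  let match;
  while ((match = tagPattern.exec(remainingHtml)) !== null) {
    const tag = match[1].toLowerCase();
    const attrs = match[2];
    const attrName = tag === 'link' ? 'href' : 'src';
    const urlPattern = new RegExp(
      '(?:^|\\s)' + attrName + '\\s*=\\s*["\']([^"\']+)["\']', 'i');
    const urlMatch = urlPattern.exec(attrs);
    if (!urlMatch) continue; // inline script, or a tag without a URL
    // Only <link rel="stylesheet"> counts as a stylesheet.
    if (tag === 'link' && !/rel\s*=\s*["']?stylesheet/i.test(attrs)) continue;
    const type = tag === 'script' ? 'script'
      : tag === 'link' ? 'stylesheet'
      : 'image';
    found.push({ type: type, url: urlMatch[1] });
  }
  // Scripts and stylesheets move to the head of the queue; document order
  // is preserved within each group.
  const highPriority = found.filter(function (r) { return r.type !== 'image'; });
  const lowPriority = found.filter(function (r) { return r.type === 'image'; });
  return highPriority.concat(lowPriority);
}
```

Given markup containing an image, a script, a stylesheet, and a second image in that order, the queue comes back as script, stylesheet, image, image.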
The net result for end users is a faster page load:

It should be noted that IE8 promises a similar improvement to script load parallelization, as discussed by Steve Souders a few weeks back. I would guess that the underlying implementation is similar to that used by the WebKit team.
Testing IE8’s Connection Parallelism
March 16, 2008 on 7:51 pm | In ajax | 6 Comments
A few weeks ago, I discussed IE8’s improved connection parallelism, specifically the increase from 2 concurrent connections per host to 6. One open question was the total number of connections allowed; my speculation was that the IE team would stick with a max of 6 rather than triple that value as well.
I was wrong. The new max is an astonishing 18 (!) concurrent connections:

That is some serious parallelism, and it has significant implications for application performance.
In December of 2006, I discussed the CNAME trick for circumventing browser connection limits, using 3 hostnames to serve images to trick the browser into using all available connections. At the time, that was 6 for IE. The above capture from IBM Page Detailer confirms 18 concurrent connections in IE8.
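For readers who have not seen the CNAME trick, a minimal sketch follows. The hostnames are placeholders (in practice they would be CNAMEs pointing at the same server), and the hashing scheme is just one reasonable choice, not the one used on my test page. The important property is that a given image always maps to the same hostname, so the browser cache still works.

```javascript
// Hypothetical hostnames, assumed to be CNAMEs for the same server.
const IMAGE_HOSTS = ['img1.example.com', 'img2.example.com', 'img3.example.com'];

// Map a path to a host deterministically so the same image always uses the
// same hostname across page views.
function shardedUrl(path, hosts) {
  const pool = hosts || IMAGE_HOSTS;
  let hash = 0;
  for (let i = 0; i < path.length; i++) {
    hash = (hash * 31 + path.charCodeAt(i)) >>> 0; // keep as unsigned 32-bit
  }
  return 'http://' + pool[hash % pool.length] + path;
}
```

Calling `shardedUrl('/images/logo.png')` returns a URL on one of the three hosts, and the same one every time it is called with that path.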
As expected, IE8’s handling of the unoptimized version, where only one hostname is used, is comparable to the performance of the optimized page in previous IE versions:

As an aside, the out of the box optimization provided by IE8 is actually slightly faster than the CNAME trick applied to previous IE versions as it does not incur any hostname resolution cost when establishing the first connections. Both examples would use 6 total concurrent connections, and IE8 should be equal to or faster than optimized connection management in previous versions.
But what about IE8 against a page optimized for connection parallelism? If 6 concurrent connections is good, 18 should be terrific, right? Not so fast. While the Page Detailer captures above show some improvement in the 18 connection version, point in time metrics can only tell us so much. What we need is a tool that can collect a statistically significant sample of performance data using both 6 and 18 connections to see if any trends shake out.
For this analysis, I used a hosted performance testing solution from Gomez, my employer. This is the same tool used in my original connection parallelism article. I ran my tests in IE8 compatibility mode, mirroring the new connection levels. As before, one test is against the default (1 host) page, and one test uses the CNAME trick (3 hosts) for greater connection parallelism. The results surprised me:

This aggregate data is made up of hundreds of tests taken from 7 locations in the US over the last 14 hours. The same locations were used for both tests. The “IE8 Parallelized” test, which uses 18 connections, has a much higher standard deviation and a higher average test time than the 6 connection “IE8 Default” test. What gives?
The answer appears to be sporadic connection hangs. The median response time for the parallelized page is lower than that of the default page, but a higher incidence of outliers skews the average and leads to the increased variability. Looking at the outliers, I typically see a section of the page load that looks like this:

Here we see 2 object downloads taking more than 8 seconds to complete. The average response time for an entire page is around half of a second, so this is a huge outlier. I see these outliers in between 5 and 10% of the test runs for the 18 connection page, but I have never seen any comparably high outliers for the 6 connection version.
Below is a revised version of the test averages taken by removing outliers:

Note that the parallelized version is now consistently faster than the default. As expected, the outliers are responsible for the counterintuitive poor performance of the parallelized page.
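To see how a handful of hung connections can distort an average while barely moving the median, here is a small sketch. The sample values are made up for illustration and are not my test data; the 2 second cutoff is likewise an arbitrary assumption.

```javascript
function mean(values) {
  return values.reduce(function (sum, v) { return sum + v; }, 0) / values.length;
}

function median(values) {
  const sorted = values.slice().sort(function (a, b) { return a - b; });
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Drop any run slower than cutoffMs.
function withoutOutliers(values, cutoffMs) {
  return values.filter(function (v) { return v <= cutoffMs; });
}

// Made-up page load times in ms: nine normal runs and one 8.4 second hang.
const parallelizedRuns = [400, 420, 410, 430, 8400, 415, 405, 425, 410, 420];

const rawMean = mean(parallelizedRuns);                            // 1213.5
const rawMedian = median(parallelizedRuns);                        // 417.5
const trimmedMean = mean(withoutOutliers(parallelizedRuns, 2000)); // 415
```

A single hang in ten runs nearly triples the mean, while the median and the trimmed mean stay close to the typical run.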
I suspect that my hosting provider (Dreamhost) simply can’t keep up with the dramatic increase in connection parallelism. 18 connections is simply too much of a good thing, and it will present a scaling problem for those who are on small to medium hosts. 10 users hitting at the same time will yield 180 concurrent connections, a pretty significant number for smaller providers.
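The arithmetic generalizes into a quick back-of-the-envelope check. This sketch assumes a browser opens as many connections as its per-host limit allows for each hostname, up to its total limit. The limits passed in are the ones discussed in these posts, and the function names are hypothetical.

```javascript
// Connections a single browser may open: per-host limit times the number of
// hostnames, capped by the browser's total limit.
function connectionsPerUser(hostnames, perHostLimit, totalLimit) {
  return Math.min(hostnames * perHostLimit, totalLimit);
}

// Worst case for the server when `users` browsers load the page at once.
function peakServerConnections(users, hostnames, perHostLimit, totalLimit) {
  return users * connectionsPerUser(hostnames, perHostLimit, totalLimit);
}

const ie8WithCnameTrick = peakServerConnections(10, 3, 6, 18); // 180
const ie8SingleHost = peakServerConnections(10, 1, 6, 18);     // 60
const olderIeWithCnameTrick = peakServerConnections(10, 3, 2, 6); // 60
```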
[Note: This objection was anticipated and handled by the IE team. See below.] Dial-up and cellular network users are also likely to be negatively impacted by this change. In the high broadband world where latency is the dominant factor, greater connection parallelism is a boon. But in bandwidth constrained networks, it just leads to thrash where progress is slowed by all the connections trying to share a small pipe.
I’m curious what sort of testing Microsoft has conducted to determine the impact of this change. The connection parallelism approach is used widely (including by the Virtual Earth team), and some servers may not be ready for the increase. My tests were conducted against only one host, but if similar results are experienced elsewhere, this may fall under the rubric of “don’t break the web.”
My advice to anyone who is using the connection parallelism trick is to perform a similar analysis of your application before IE8 is released. The new connection levels will create greater strain on your servers, and that may lead to occasional performance hiccups for your users. There are a few different approaches you can take to dealing with this change, but the most important first step is to understand the extent to which your application is impacted.
Update: Kris Zyp and Steve Souders have pointed out that IE8 will use 2 connections per host for dial-up users. This nicely addresses that concern, but the concern about 18 connections for pages using the CNAME approach still stands.
Google Code performance improvements: the Souders factor
March 16, 2008 on 1:05 am | In ajax | No Comments
Steve Souders is now at Google, and the Google Code team has taken some of the advice from High Performance Web Sites and applied it to reduce user-perceived latency. There is no magic in their performance improvements: the techniques (JS/CSS concatenation, CSS sprites, and lazy loading) have been discussed here and elsewhere in the past. But the user-centricity of the approach is what I find most cheering.
The explosion of web performance optimization tools and techniques would be meaningless if we were not focused on improving user experience, and the Google Code team clearly understands this message. The last approach they discuss, lazy loading, is a nice illustration. Rather than initializing the Google loader module in the traditional blocking manner (<script src="blah.js"></script>), the team used the non-blocking DOM scripting approach (document.createElement('script'), set src, append to head). A callback that fires when this operation completes then loads the required APIs.
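A minimal sketch of the DOM scripting approach is below. It is not the Google Code team's actual code: the function name is hypothetical, and the document is passed in as a parameter only so the function can be exercised outside a browser. The readyState check is there because older IE versions signal completion through onreadystatechange rather than onload.

```javascript
// Non-blocking script load via DOM insertion. `doc` is normally the global
// `document`; `onLoaded` runs once when the script has finished loading.
function loadScript(doc, src, onLoaded) {
  const script = doc.createElement('script');
  let done = false;

  function finish() {
    if (done) return; // guard against both events firing
    done = true;
    script.onload = script.onreadystatechange = null;
    if (onLoaded) onLoaded();
  }

  script.onload = finish;
  // Older IE versions report completion through readyState instead.
  script.onreadystatechange = function () {
    if (script.readyState === 'loaded' || script.readyState === 'complete') {
      finish();
    }
  };

  script.src = src;
  doc.getElementsByTagName('head')[0].appendChild(script);
  return script;
}

// In a browser (URL and callback body are placeholders):
// loadScript(document, 'blah.js', function () { /* load the required APIs */ });
```

Because the script element is inserted from code rather than written into the markup, the parser does not stop to wait for it, and user-visible content can render first.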
This approach prioritizes the load time of critical user-visible page elements. To understand the effectiveness of this optimization, you need to measure the time at which the user would perceive the page to be loaded as total page download time may overstate the actual latency. Using experience-centric measurements, the Google Code team saw improvements between 30% and 70% depending on page.
IE8: The Performance Implications
March 7, 2008 on 1:25 am | In ajax | No Comments
Mix08 is here, and with it the first beta of IE8. John has a great roundup of the JS/DOM work, noting that “Internet Explorer 8 is our release.” He’s right.
I’ll run through a few of the items that have particular implications for performance.
- This one is the most exciting for me: the IE team has finally upped the connection limit to 6 per host from the default of 2. I’ve talked before about DNS tricks to get around the 2 connection limitation, but having this support out of the box will be a great assistance in the war on round-trip latency as it’s easier to make more expensive network calls in parallel. This is especially sweet for Comet and the like where the persistent connection could previously monopolize half of the connections to your site. As you would expect, Joe Walker of DWR is happy.
One thing I haven’t seen mentioned anywhere is the total connection limit. Previous versions supported 2 per host and 6 total. Is the new version 6 per host / 6 total or 6 per host / 18 total? I really doubt it is the latter, but if no one has the answer I’ll grab the beta this weekend and test it out.
- W3C Selectors API: Last month I discussed the work Firefox and WebKit have done to implement the new Selectors API spec, and it’s nice to see Microsoft is joining the list. I share John’s concern that these black boxes have a significant potential (make that inevitability) for browser bugs, so smoothing over these will, as always, remain the job of libraries. But it’s nice to have that blazing speed under the covers.
- DOM Storage and offline events are techniques still on the fringes of relevance. DOM Storage in Firefox 2, as well as Google Gears and its less nerdly cousin Dojo Offline, have a lot of promise, but to this point they’ve lacked a killer app due in no small part to the chicken and egg problem. Having Microsoft on board finally offering these HTML 5 features may help push us to widespread adoption.
- I’ve dinged Microsoft for the lack of a Firebug-like tool since, well, I first used Firebug, and they finally have a clone. A clone in serious need of a makeover. Yeah, I’m shallow. For those keeping score at home, the sexiness hierarchy goes Webkit Inspector > Firebug > IEBug (or whatever it’s eventually called).
- For the truly performance obsessed, there is a collection of optimizations to common low level functionality, such as string concatenation and array manipulation.
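On the Selectors API item above, the division of labor between the browser and a library might look something like the following sketch. The function name is hypothetical, and `fallbackEngine` stands in for whatever selector function your library of choice exposes. The only native call assumed is `querySelectorAll` from the Selectors API spec.

```javascript
// Use the native Selectors API when the browser provides it; otherwise, or
// if the native call throws, defer to a library's selector engine.
function selectAll(root, selector, fallbackEngine) {
  if (typeof root.querySelectorAll === 'function') {
    try {
      return Array.from(root.querySelectorAll(selector));
    } catch (e) {
      // The native implementation rejected the selector; fall through.
    }
  }
  return fallbackEngine(selector, root);
}

// In a browser (the fallback here is a placeholder):
// var links = selectAll(document, 'div.post a', myLibrarySelect);
```

The fast native path is used whenever it works, and the library remains responsible for smoothing over gaps and bugs.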
All in all, some really cool stuff in this beta. If you want to give it a try without downloading, it’s already up on BrowserCam. Just like this:
