Progressive Downloads and Rendering by Stoyan Stefanov
[0:06] ...a bit about performance. This is me, I work at Yahoo and I've written a bunch of stuff. I'm going to talk about progressive downloads and progressive rendering and basically after we've done all the basic stuff like GZipping, minification and all that, what to do next to make the pages faster or just appear faster. Just to tweak the perception of speed.
[0:34] So why it is important to talk about this stuff is because people are really sensitive about their time. You know the saying that time is money; as incorrect as that may be, people are really sensitive when you waste their time. So they really don't like to wait. There are all sorts of, even physiological changes that happen when you're waiting, people get more irritated and just, you know, high blood pressure and what not. So we want to make people happy.
[1:12] Just talking about the perception - so this is sort of a theme of visual illusion. If you look at square A and square B, one appears to be black; the other one appears to be white. But they are actually the exact same shade of gray. The thing is you cannot really see that until you start hiding... Yeah, in part of the...
[1:41] Once you hide everything on the page then all of a sudden it kind of pops out that they're actually that same color. But we couldn't tell that. The thing is once you start bringing them back all of a sudden - although you were just convinced that it's the exact same color, you cannot see it anymore. Again, A is black and B is white.
[2:08] Another kind of similar thing is this one here, two gradients, one on top of the other. Until you hide the background, in which case you see the reflection is all color. If that was too monochromatic... So there is this brown cube here and this orange one are actually the same color.
[2:37] The reason for those was just to show that we are flawed human beings. We have a distorted idea of reality and even when it comes to seeing, which we have a big portion of the brain dedicated to seeing because that's the thing we do the most in our lives, we look at and are processing information all the time. So if you can't tell black from white chances are we aren't very good at telling anything really.
[3:06] So when it comes to performance we care about perception of time and durations. So something takes some amount of time, but people come to your page with certain expectations, maybe they visited a competitor or they have some idea of how long this page should take. So if their expectation doesn't match - usually it won't match - then they might perceive that as actually being slower than it is.
[3:37] Then there's additional error related to the retrieval, storage and retrieval of information. Once you want to tell about this page to your friends and family and if you perceive it as slower than it is, chances are you'll remember it was even slower. And we don't want that.
So, long story short, people perceive something as slower when it's unpleasant or unknown, when it's too boring or they're just too busy and have too many things to keep track of. So we want to make sure that the first page experience is really very fast because when it's unfamiliar, when the user comes to the page for the first time, they will perceive it as slower than it actually is because there's all this new stuff. You don't know where anything is really. So they have to process more information.
[4:41] The idea is that we should optimize, really, for the empty cache experience. Otherwise there will be no full cache experience. There will be no repeat visits.
So the basics are that you have to list the best practices for those two. Perfplanet is a project of my own. It's just a collection of feeds about performance, feeds and blogs.
[5:43] We have progressive enhancement, which is more like a development process or philosophy, it doesn't have to do much with the way the page is downloaded. So I'm not going to talk about it.
[5:59] What I want to talk about is progressive downloads or in other words making sure that once you have the waterfall of components that a page requires, they're downloaded as quickly as possible, they don't block each other, they're not in each other's way, they're loaded asynchronously if possible. So that those waterfalls are as free flowing as possible.
[6:23] Here's one example, just a test page. This is the exact same page in both waterfalls, but the second one takes half the time of the first one. Just by making sure that there's not much blocking going on. It's really the same page, same features, same images, so everything's the same. This is how important it is.
[6:48] When we talk about progressive rendering, we want to make sure that we render something very quickly at the beginning, ideally within 200 milliseconds. That will be perceived as instantaneous by the user. Even though we're not ready with a complete page, we want to flush something out and just give the idea that something is going on.
If you have scripts, they should be combined into a single file or loaded asynchronously. So this way it kind of makes the waterfall ridiculous because you have all this blocking, blocking, blocking, which is kind of sad. Kitten.
[8:22] ...or maybe this one.
[9:24] This is in Safari. They have, now, load events so that you can call a function once it arrives.
[9:34] How that looks on a timeline is the deferred scripts will be downloaded and executed in order before the DOMContentLoaded event happens and all the asynchronous stuff will be executed whenever it arrives, but before the load event.
[10:15] So about flushing. Usually when you visit a page, first you download the HTML for it, then once the browser is done, it looks at the HTML and figures out, "Oh, I need all those components, those images, CSS, and so on. So let me go ahead and fetch them." But if you flush early, you have a chance to start downloading stuff even before the whole HTML is completed.
[10:44] I think it's a really well known technique, but it's not nearly as widely used as it should be.
[10:52] So whenever you flush something small, which refers to external components, then the browser can start fetching those instead of just sitting idle. It's really easy to do. In PHP you have a flush function. It pretty much exists in every scripting language.
[11:12] The theory behind it is that HTTP 1.1 added chunked encoding. So you send the HTTP headers, then send the different chunks, each time first saying, "OK. This is the size of the chunk," and then the chunk itself. So there are two approaches to implementing this.
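To make the "size first, then the chunk" framing concrete, here is a minimal Python sketch of how an HTTP/1.1 chunked body is laid out on the wire. It is only an illustration: the function names and the sample HTML are made up for this example, and in practice the web server or framework does this framing for you when you flush.

```python
def encode_chunk(data: bytes) -> bytes:
    """Frame one chunk: length in hexadecimal, CRLF, the payload, CRLF."""
    return f"{len(data):x}".encode("ascii") + b"\r\n" + data + b"\r\n"


# A zero-length chunk marks the end of the body.
LAST_CHUNK = b"0\r\n\r\n"


def encode_chunked_body(parts) -> bytes:
    """Frame every non-empty part as a chunk, then terminate the body.

    Empty parts are skipped because a zero-length chunk would be read
    as the end of the response.
    """
    return b"".join(encode_chunk(p) for p in parts if p) + LAST_CHUNK


# Hypothetical page sent in two flushes: the head first, the rest later.
body = encode_chunked_body([b"<html><head>", b"<body>Hello</body></html>"])
```

The browser can start parsing the first chunk (and fetching whatever it references) as soon as it arrives, without waiting for the terminating chunk.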
One is called semantic application chunks. So when your application knows about its different parts like headers, footers, sidebars, you can flush there so that you send only the header, for example.
The other approach is just not bothering with it in the application, but have your server flush it. For example, on Google search if you don't request gzip, if your browser doesn't support gzip they'll send every 4K in different chunks. But if you do they'll do a more semantic flush.
[12:17] So as soon as you type something and you hit enter in normal Google instant interface, you get the headers flushed immediately. So it usually arrives in something like 100-200 milliseconds. So this is great. This is just awesome because as soon as you type something, something appears and then you're not done with retrieving all the search results and as you can see there's no counter here or anything. The only dynamic part is the query. It's pretty much a static header. They can send it right away and you get a feeling that, "Oh, this page is responding so fast," while they're actually working on the results.
So even if something bad happens to the connection and that thing never arrives then it's fine because you have a usable page. In terms of code it looks something like this. Where you have the header, flush by chunk, the body of the page, flush with chunk, and then just include the script at the very end. Let me give this script a chance to run, it might take a while.
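The slide code is not captured in the transcript, so the following is only a rough Python sketch of the same shape: header first, then the body, then the script at the very end. It is written as a WSGI application that yields the page in three pieces; `HEADER`, `build_results`, `FOOTER` and the `/app.js` path are hypothetical names for this example. Whether each yielded piece goes out as its own HTTP chunk depends on the server in front of the application, so treat that part as an assumption.

```python
# Static part of the page: cheap to produce, so it can go out immediately.
HEADER = (
    b"<!DOCTYPE html><html><head><title>Search</title>"
    b"<style>#hd{height:40px}</style></head>"
    b"<body><div id='hd'>My site</div>"
)

# The script is referenced last so it does not hold up the content above it.
FOOTER = b"<script src='/app.js'></script></body></html>"


def build_results() -> bytes:
    """Stand-in for the slow part of the page (queries, API calls)."""
    return b"<div id='bd'><p>Result one</p><p>Result two</p></div>"


def app(environ, start_response):
    """WSGI app that hands the page to the server in three pieces."""
    start_response("200 OK", [("Content-Type", "text/html; charset=utf-8")])
    yield HEADER           # first flush: the header
    yield build_results()  # second flush: the body of the page
    yield FOOTER           # last flush: the script at the very end
```

If the connection dies after the second piece, the user still has a readable page; only the script is missing.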
[14:01] At Amazon they have a combination of using the chunks with the source order. So they'll flush the header and then they'll flush - because when you think about what's the most important thing on that page, it's the buy button. The reason for this page to exist is to sell books, right? Or anything. So they want to make sure that it's available immediately and they put it higher in the source order. So after that comes the image, the title, and then all of the other stuff may arrive whenever it arrives. Like all the reviews and that stuff.
[14:40] And so this is Google Instant, right? I just wanted to make the point that the HTTP chunking is not necessarily only for HTML. In this case in Google Instant, when you type they send a request to get the search suggestions and to update the results. So the thing that I've noticed is that the response is not in any particular format. It's sort of delimited chunks of JSON data. And they have them actually sent in two chunks.
[15:15] The first one is the suggestions and the second one is the result. So the reason being sometimes you only get suggestions, right? If they're not confident that this is what you searched for, they will not return all the results. But when you have both suggestions and results in the page, you're probably much more interested in looking at, as you type, looking at the suggestions. So they'll flush that first and then whenever the results are ready they'll flush that second.
[15:50] Yeah, so unfortunately there's not any good tools to figure that stuff out. So there's a URL here, some script that I did in PHP. You can type in a URL. It's very clumsy and horrendously slow, but it's just something if you're curious to look at what other people are doing with chunking.
[16:14] So CSS and rendering. So after the previous talk I know this slide is going to look really bad. So what is the worst enemy in progressive rendering? And it is the CSS. [laughter] So, I mean nothing bad of course. But just the thing is that CSS will block the rendering. That's why I coined the words component type. Because images, they can arrive whenever. But until the last piece of CSS arrives on the page, the browser will not render anything.
[16:52] And that includes even @media print or other @media types that are not necessary at all to display the page. But it's just the way most browsers work, except Opera. Opera will not wait for the very last piece of CSS, but most of the other browsers will. And that's really bad.
[17:17] So if you have several CSS files, until the very last one is downloaded, even if it's @media print, then the user just stares at a blank page. Which is not something that we want to do. So you can inline all the @media rules. Because most of the time for printing probably you just want to disable some stuff. Maybe hide some navigation, something like this. So they're usually small, it doesn't make sense to put it in a separate file.
[17:48] And so for example, these are two pages. So this is a screenshot from WebPagetest. And this green line here is the moment that something is rendered on the page. And the blue line is the onload event. So two different pages. So in this one, the initial rendering took about 0.3 seconds, which is great. And in this one here, the next page, it happened about 200 milliseconds later. And that's only because one of the components is a CSS file. Because nothing will get rendered until it arrives. Whereas the previous page did not have any CSS.
[19:13] But if you're really, really worried about performance and making things as fast as possible, there's always the option of inlining it. And that's what Google search does. There's no external CSS. Although it's a high traffic website. It would probably save quite a bit if they put it in an external file and cached it. But they've chosen to put everything inline in the first chunk so that it's rendered immediately.
[20:16] So it's also a good idea if you don't use CDN. If you don't have money for CDN, but you still split components across different domains, it makes sense to put the CSS on the same domain as HTML. So that there's no extra DNS look up. So you save a bit of time with the DNS look up, so the browser renders something quicker.
[20:40] So conditional comments and blocking. So this is a normal page with a normal CSS right here. And this is the same page but the style sheet is included with conditional comment. And curiously enough it blocks the rest of the components, and that's really bad. So it's strange, right? So you do something as common as something like this, where you have IE specific stuff. And as soon as you do that then in IE you will have, your CSS will block the rest of the stuff.
Another common scenario is this one. I think Paul Irish came up with this thing? Where you just include class names depending on the browser. But in this case this will also, although it's not an external instruction, but it also blocks the rest of the downloads. So the solutions are kind of strange, but they work. So you can put an empty conditional comment way at the top before there's any component.
[21:52] So I guess IE will parse once through the document, look for those. And if it sees a conditional comment way at the top then it will no longer block on the rest. So you just include some dummy conditional comment there and then it doesn't block anymore. For the other case where you have conditional class names you can add them to the HTML tag instead of the body tag. So that they're executed early on in the document and they don't block anymore.
[22:31] Preloading. Preloading is a good way to cheat. Cheat your way into faster pages, right? So when you're on one page and you have a high confidence that the user will end up on the next page. Then your first page, page A, can start prefetching the components needed by page B. For example you're typing your user name in a log in screen, but you can anticipate what the next page will be.
[24:27] But it turns out that in Firefox and WebKit there's a cache for images separate from the cache for scripts and styles. So for those you can use an object tag. Basically create an invisible object tag with zero width and height. And then set its data attribute to the location of the file. So that's a way to preload without executing.
[24:56] So blocking can happen in all kinds of unexpected places like the favicon, and it's kind of ridiculous. So in all browsers when you have this little favicon it will be downloaded when the browser is just sitting idle. When the page is kind of settled, then the browser says OK, let me fetch the favicon. But not in IE, where it's the first component that gets downloaded right after onload.
Data URIs. I think you're all familiar with that. But that's also a great way to reduce the number of HTTP requests. Normally, with images, we put them in sprites, right? Which are kind of painful to create and maintain, although there are some tools. But the other option is to use the data URI. Which is pretty simple. You just use base64 encoding to encode the binary content of an image or anything really. And then include it in the CSS using a background-image:url(), and instead of http-whatever you actually put in the data for the image.
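As a small illustration of that encoding step, here is a Python sketch that turns raw bytes into a data URI and drops it into a background-image rule. The helper names and the `#logo` selector are made up for this example, and the six bytes used below are only the GIF signature, not a complete image; in a real build step you would read the actual file.

```python
import base64


def to_data_uri(data: bytes, mime_type: str) -> str:
    """Base64-encode raw bytes and wrap them in a data: URI."""
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def css_background_rule(selector: str, data: bytes, mime_type: str) -> str:
    """Build a CSS rule whose background image is inlined as a data URI."""
    uri = to_data_uri(data, mime_type)
    return f'{selector} {{ background-image: url("{uri}"); }}'


# Placeholder bytes (just the "GIF89a" signature) standing in for a real image.
rule = css_background_rule("#logo", b"GIF89a", "image/gif")
```

Base64 makes the payload roughly a third larger than the binary file, which is part of the tradeoff against saving the extra request.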
[26:57] And you can do the same with image tags, inline that there. And this is not really something theoretical, it's used by the most popular websites out there. So in Yahoo search for example, this example shows that there's actually a gradient here. And it's in a data URI. And this is Google. So all those images are actually inline in the HTML and not separate components.
[27:32] Which may sound kind of strange, right? Because when you inline all this stuff into the page you made the page bigger and there's no caching. But it turns out that it's that important to have as few requests as possible, just to reduce the number of requests. And in this particular application in Google search, chances are you don't often search for the exact same thing more than once. So they don't really benefit much from the caching so they decided to inline that.
[29:34] The data URIs, the problem is that in IE they only work in IE8 and above, but the thing is that for the older versions, there's always the thing called MHTML. So this is like email. In emails, you have multiple parts in the email. For example, you have the HTML version of the email, the text version of the email, and then all the attachments are inline in a single document. So this is the same idea for that MHTML. It works in IE6 and 7. For the longest time, there was kind of a misunderstanding with...
[30:27] There was a problem with IE7 and Vista and Windows 7. It doesn't work properly, so developers have come up with all kinds of crazy hacks to work around this, but it turns out it's actually really simple. It's kind of just using the correct syntax because the incorrect syntax without... Let me see if I can show you.
[30:53] So this is the MHTML, it's a multipart document. One that has different parts in it. So, one part looks like this, you have a bunch of headers, then a blank line, and then the actual content. Then, the parts are separated by some sort of string; you decide what that string will be and put it right here in the boundary. So you just have to use the syntax.
[31:20] You have two dashes, then the separator, and for the very last one you also have two dashes at the end. I call them the double dashes of doom because I spent so much time fighting with IE7 on Vista. The thing is it works fine in all other versions of IE; it's this particular combination that was giving errors. It was working fine the first time, but then when you refresh and this thing is cached, then it just wouldn't work.
[31:50] So, it came to me, the whole thing really is just an example of a CSS file. So it has a comment here, it has that multipart document with the different parts here. At the end, where you have the actual CSS, you just refer to a part; unfortunately, you'll have to use an absolute URL to refer to it. So, this is an identifier of the image. It matches what we declare here. Then you can see the image that I wanted you to.
[32:27] This is really unreadable, but the idea is that you can do this for inline CSS, too. So you can have comments, here we built a multipart document, then you have the actual CSS and everything is inline. So that means that you can actually build cross-browser single-request applications where you just do one page and that's it, no more requests. All the script is inline, all the CSS is inline, and all of the images are inline, and it works across all browsers. So, it's kind of... If you're going to be doing this, it's kind of extreme, because you have to...
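Since the slide itself is not in the transcript, here is a hedged Python sketch that generates a CSS file of the shape described: a comment holding the multipart document, a closing boundary that ends with two dashes, and rules that point at each part through an absolute `mhtml:` URL plus the part's identifier. The function name, the boundary string, the `http://example.org/styles.css` URL and the placeholder image bytes are all assumptions for illustration, and the exact header spelling older IE versions accept should be checked against a working example before relying on it.

```python
import base64


def build_mhtml_css(images, css_url, boundary="_ANY_STRING_AS_SEPARATOR"):
    """Return CSS text with images embedded as an MHTML multipart comment.

    images maps a part identifier to a (selector, raw_bytes) pair.
    css_url must be the absolute URL this CSS file will be served from.
    """
    lines = [
        "/*",
        f'Content-Type: multipart/related; boundary="{boundary}"',
        "",
    ]
    rules = []
    for name, (selector, data) in images.items():
        lines += [
            f"--{boundary}",                     # two dashes, then the separator
            f"Content-Location:{name}",          # identifier of this part
            "Content-Transfer-Encoding:base64",
            "",                                  # blank line before the content
            base64.b64encode(data).decode("ascii"),
        ]
        # The rule refers to the part as: mhtml:<absolute URL>!<identifier>
        rules.append(
            f"{selector} {{ background-image: url(mhtml:{css_url}!{name}); }}"
        )
    lines += [f"--{boundary}--", "*/"]           # last boundary ends with two dashes
    return "\n".join(lines + rules) + "\n"


# Placeholder bytes standing in for a real image file.
css = build_mhtml_css(
    {"logo": ("#logo", b"GIF89a")},
    "http://example.org/styles.css",
)
```

The closing `--` after the last boundary is the detail the talk calls the double dashes of doom, so the sketch always emits it.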
[microphone cuts out]
[33:03] Was that me?
[33:13] All right. It's kind of, you have to worry about separating the content from the presentation. So, many times, we need aspects in performance or in relation, it's a tradeoff. If you really, really want to do it, you are able to.
So, one drawback of having the cross-browser solution, when you have both MHTML and data URIs in the same file, so if you don't have any sort of server-side sniffing, then the problem is that you have to repeat the same image twice. So the solutions are to serve browser-specific CSS and do the server-side sniffing or, there was somebody who commented on my blog, if you keep the two copies close together, because of the better compression when you gzip them, the duplication is not so bad.
[34:20] Or there's one crazy hack from this Russian website. Unfortunately, this is only in Russian. I've been meaning to sort of...
[34:30] ...and bring it to a wider audience, but it's just some crazy stuff. In the same document, you have the MHTML and the data URI and the way it works is you put here the header of the JPEG file, this is base64 encoded, then you put the normal CSS declaration, and then you start, again, you put just the header part, and then the rest of the file.
[35:00] So, in reality, you have something that looks like this. You have JPEG headers, you have some CSS, JPEG headers again, and then all the data of the image. So what IE sees is the JPEG header, then some garbage in the JPEG file that it ignores, and then the rest of the file. What other browsers will see is some invalid CSS, and it's like whatever, and then it will move on to the normal image. It works for JPEGs and it gets a little more complicated for PNGs and GIFs, but it really does work. It's a bit crazy and really inventive, but it works.
[35:45] One last point that I wanted to make is when you tweak the perception of loading the page to use animation, to kind of pretend that you're actually ready with something, but you actually are not. So, for example, animations are usually not really good for performance reasons because it's obviously more code, not in CSS, we don't have to write that much code, but it's usually abused by developers, "OK, let me animate this thing only because I can, and I'll make it really slow and painful because I like to do animations." But they can be used for good, not only for evil.
[36:30] These are two examples of... I don't know if the WiFi is... Thanks! That seems to be better. I don't know, I don't speak that language.
[37:28] So, as you see here, anything in Google search, you have those extra options, which is very nice, which means you can click and use this animation. It looks as if all the content is there, but actually, it has to go and fetch it from the server. But they're confident in their response that they can actually start animating it and pretend that it's there, it looks like it's there, but it's not.
They just do this nice animation just to keep the page responsive and the user will see that something is happening, and it doesn't have to see that loading indicator, which is not something nice to see. Because the first reflex, I guess, when you make an XHR request or anything to get more data,
[37:53] you just put that loading image indication to tell you that something's happening. But, if these are really short, then you don't have to do it because then, the whole page appears as if it's less responsive. You click something, and then you wait, click and wait, and, obviously, we don't like to wait.
[38:38] So let me try now, if I reload this page, and then you see it's here, but it wasn't last time I tested, maybe it's just the Dutch site. When you go here, you are actually seeing animation but no content. So, it's a nice approach if you have something that takes a while or something that shouldn't take a while, don't start with the loading indication right away, but if, for whatever reason, because it's networked, everything happens... So for example, you can set up loading indication after 200 milliseconds or something that if there's any unexpected delays, just to keep the user updated.
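The "only show the indicator if it takes longer than about 200 milliseconds" idea can be sketched as follows. This is a Python asyncio illustration of the timing logic only, not browser code: `fetch`, `show_indicator` and `hide_indicator` are hypothetical callbacks that stand in for the request and for whatever the page does to show or hide its spinner.

```python
import asyncio


async def fetch_with_delayed_indicator(fetch, show_indicator, hide_indicator,
                                       delay=0.2):
    """Run fetch(); show the loading indicator only if it exceeds `delay` seconds."""
    task = asyncio.create_task(fetch())
    # Give the request a short grace period before bothering the user.
    done, _pending = await asyncio.wait({task}, timeout=delay)
    if done:
        return task.result()   # fast response: no indicator was ever shown
    show_indicator()           # unexpected delay: keep the user updated
    try:
        return await task
    finally:
        hide_indicator()
```

A fast response never flashes a spinner, which avoids the click-and-wait feeling described above, while a slow one still tells the user that something is happening.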
[39:36] Another example, this happened yesterday, [mumbles] ... So here, you have those kind of images here on some popular articles, and you can click through them and see the next one. So, the same thing, you get a nice animation that tells you, OK, this thing is here, but it's actually not. So if you turn off... If you get only the animation without the other thing, so it's just pretending that all of the content is there while fetching. It's much quicker than as soon as you click, it will just slap a loading indicator into your page, not very friendly.
[40:39] I think that's pretty much all I have to say. Parting words, yeah. So, yeah, there's two important things that I want to say is that something you should never say is that everyone is on high-speed, everybody's on broadband these days. That's not true, but even if it was, the way our protocols work and the way that you do an application, you really want things that much from the good connection.
And another thing not to say is, "It's all in the cache, oh, I don't care about this stuff. I'll just put it in. It's always cached anyway, right?" Well, most of the time, it's not. Actually, this was confirmed by other companies that I talked to: pretty much everybody had knowledge that about 50% of all the users come to a page with an empty cache for various reasons. So it's not all in the cache.
So, if you want to make high performance and responsive applications, it's great to consider non-blocking, asynchronous downloads and just look at the waterfalls. It's so easy now with WebPagetest. You don't have to install anything. You just type in the URL and you can see from different locations in the world, in different browsers, and just look at the waterfall and see what happens. There might be surprises along the way.
[42:00] Progressive rendering, if you can, do a quick flush and just display something in the first 200 milliseconds. That will be incredible, and users will love it. They'll feel that the applications are very responsive.
[42:14] That's all I have. Any questions?
[42:18] Questions? We can talk about it in the pub later, right? Thanks again very much.