Save the Day with HTTP2 Image Loading by Tobias Baldauf
This is the second talk in a set of three talks on Technical Performance delivered on April 1, 2016 at Fronteers Spring Conference in Amsterdam
Images form 64% of all website data & have a high correlation to page load time. Optimizing image delivery through compression alone is a daunting task. Using HTTP2's superpowers, we can optimize images to ship faster, increasing the perceived performance and initiating users' emotional responses to visuals earlier. HTTP2-powered image delivery leads to lower bounce rates and higher conversions.
Whoa, there's some bass.
Take a seat, and we'll come back to you at the Q&A session in a few minutes.
So on to the next talk, to do with the technical aspects of performance.
The next person we have-- again, this dovetails so nicely, because Akamai was mentioned right there and then.
And we're going to build on that right now with a talk from someone who works at Akamai.
Tobias is based at Akamai, where he's a Solutions Architect.
So many people will be familiar with Akamai already as the CDN.
But really understanding how Akamai could be used to improve your performance and the kind of products that they can build to tune your performance and understand that better is the kind of things that Tobias has been involved in.
We heard a little bit earlier on from Estelle about images and optimizing images.
And I think that might be something that we build on in this talk, which is beautifully titled Your Hero Images Need You.
So I'm already excited about that.
So please make him very welcome.
It's Tobias Baldauf.
Just like Jan, I'm probably going to need a second to see if everything's working.
So there's a cursor.
That's a good thing.
And now if the clicker is working, that's going to be even better.
Here we go.
Yeah, like Phil already mentioned, my name is Tobias.
I've been with Akamai for two years, but I've been dabbling with image optimization for the better part of four years now.
Friends of mine have called me the guy who knows most about images, or another friend called me the guy who fixed JPEG compression.
Let me assure you none of this is true.
It's all a grand lie and scheme to get me a better pay raise.
But what is true is that I invented block level JPEG compression and a tool that lets you automatically find the best JPEG compression quality for an input image.
That has been what I did in the last two years.
Today I am very, very happy and, honestly, super nervous to present something completely new.
So today here at Fronteers I'm going to give a talk I've never given before.
And I'm going to do it on a tool I've never used before.
And I'm going to talk about something I've never talked about before.
So if anything goes wrong, all right.
Yeah, like this clicker not working again.
This is not going off well.
Here we go.
So what I am going to talk about today is, I'm going to show you a completely new technique to optimize images, to get better speed index of plus 6%, building on top of all existing best practices.
So on top of everything you know about image optimization or web performance optimization, this is something that gives you an extra 6% boost for the speed index.
And I'm going to show you that it's possible to give a beautiful visual impression to your users with only 15% of image data sent over the wire.
So I hope you-- so before you boo me off the stage, because that all sounds too good to be true and like a snake oil salesman, let me show you some data.
So like we heard already today before, images are a big concern of ours.
Our way of treating images actually sucks.
If you look at the data from HTTP Archive-- again, HTTP Archive, thank you Steve Souders-- images are the biggest part of each website, currently about 64% of all the data we send over the wire to render a web page.
JPEGs make up almost 50% of that.
And because of that, images have a high correlation to page load time and a very high correlation to speed index, obviously, because without an image, the page doesn't look complete.
So what can we do to fix this? And as Jan has already mentioned, HTTP2 is an awesome tool we can look into to optimize the delivery of our bigger and bigger websites.
Images, for example, are currently growing 200 kilobytes per year.
So we need to find a new way to deliver all that massive data, and HTTP2 is here to help us.
Jan already showed a very, very beautiful and very, very complex technological graph.
So I think I can't top this, so I went for licorice.
This is the difference between what HTTP1 can do for your website and what HTTP2 can do for your website.
HTTP1 is like, yeah, the bland little sweety thing.
It's kind of nice, it's licorice.
Who doesn't like it? But HTTP2 is like all the gorgeous tastiness you could ever want in one big package.
And you can just chew it all in.
I love HTTP2.
If this is maybe too much focused on sweets for your tummy, here's a little more technical view of things.
On the left-hand side you'll see what HTTP1 would be doing with a traditional waterfall graph we've all come to know.
You see Chrome in action with six simultaneous connections and that nice little ladder of assets being downloaded.
The demo site you're seeing is set up so that it has a minimal HTML and CSS footprint.
The CSS is inlined, and it just contains 20 images.
But the 20 images add up to 1.6 megs, so it should roughly resemble what the average website currently is holding in terms of image data.
And on the right-hand side, you see what HTTP2 is able to do for your website.
So all the requests start simultaneously.
There is no connection limit or anything.
And everything gets delivered instantaneously.
That is pretty awesome.
That is basically what Jan mentioned as multiplexing.
This is multiplexing in action.
It's very, very, very cool.
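A toy model of the two waterfalls just described, as a minimal Python sketch. It only counts how many back-to-back batches of requests are needed before every image has at least started downloading. The function name `request_waves` is made up for this illustration, and treating HTTP2 as having no cap at all is a simplification (real servers advertise a maximum number of concurrent streams, which is usually far above 20).

```python
import math
from typing import Optional


def request_waves(num_assets: int, max_parallel: Optional[int]) -> int:
    """Return how many batches are needed to start every request.

    max_parallel is the number of simultaneous connections (HTTP1 style).
    None is a toy stand-in for multiplexing: everything starts at once.
    """
    if num_assets <= 0:
        return 0
    if max_parallel is None:
        return 1
    if max_parallel <= 0:
        raise ValueError("max_parallel must be positive")
    return math.ceil(num_assets / max_parallel)


# The demo site from the talk: 20 images, Chrome with six connections.
http1_waves = request_waves(20, 6)
# The same 20 images over a single multiplexed connection.
http2_waves = request_waves(20, None)
print(http1_waves, http2_waves)
```

With six connections the last images have to wait for several earlier batches to finish, which is the ladder shape in the HTTP1 waterfall.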
By the way, a little footnote.
I just found out like two weeks ago, if you intend to use HTTP2, and I hope you all will be or are already, consider checking your initial congestion window, because you need a different one for HTTP2 than you need for HTTP1.
Just a little hint here.
There's a blog post going to follow on that.
So ingredient number one for the hopefully snake-oil-free solution to deliver images faster by 6% and to get a better visual impression is HTTP2's multiplexing.
But inside that diagram we just saw, there is another ingredient hiding.
And that was in the lowest graph.
That was the browser main thread rendering.
And here you can see 850 milliseconds spent on paint in HTTP1 and only 400 milliseconds spent on paint times in HTTP2.
And you might be wondering, hmm, why is that? And the answer is, if you consider what I'm shipping here, I'm shipping images.
So why is HTTP2 suddenly far more efficient in rendering the website too, although it's just a protocol? Oh, "just a protocol."
It's because it's progressive images.
I'm using progressive JPEGs.
So what are progressive JPEGs? On the left-hand side, you see the traditional baseline sequential JPEG rendering, the window blind effect.
Everybody hates it.
And on the right-hand side, you see gorgeous or, in parts, gorgeous progressive JPEGs.
They start with that little preview that basically gives you the blocky color thingy, and then it gets better and better and better and suddenly it's all there.
This is very, very nice, because it gives your users an indication of how the image is going to look soon, and it gives your browser the ability to lay out the website sooner.
Because with sequential, it doesn't know the height and width of the image until it's fully downloaded, while with progressive, it knows it after the first scan layer.
And this is a term I'm going to be using a lot now-- scan layer.
So scan layers.
You can see different scan layers in this image.
It's an image of the Rhine River that's passing the beautiful town of Dusseldorf.
On the left-hand side, you see scan layer 1.
In the middle you see scan layer 5.
On the right-hand side, you see scan layer 10 where the image is complete.
So how does that work? In progressive JPEGs, we have 10 scan layers that make up the image by default.
The first one you can see is grayscale, and in that resolution it looks OK, but it's actually quite blocky.
I'm going to show you that in a second.
Then in the second layer, you can see that the Rhine is suddenly purple, which is weird, right? But the image has at least color now.
Then it gets better and better and better and better until all the data has been sent, and the image renders completely in scan level 10.
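To check how many scan layers a given file actually contains, one can walk the JPEG marker structure. The following is a minimal Python sketch under a few assumptions: the function name `jpeg_scan_info` is ours, only the common SOF2 frame marker (0xFFC2) is treated as progressive (rarer progressive variants are ignored), and each Start Of Scan marker (0xFFDA) is counted as one scan layer.

```python
from typing import Tuple


def jpeg_scan_info(data: bytes) -> Tuple[bool, int]:
    """Return (is_progressive, number_of_scans) for raw JPEG bytes."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing SOI marker")
    progressive = False
    scans = 0
    pos = 2
    n = len(data)
    while pos + 1 < n:
        if data[pos] != 0xFF:
            pos += 1  # ordinary byte inside entropy-coded image data
            continue
        marker = data[pos + 1]
        if marker == 0xFF:
            pos += 1  # fill byte before a marker
            continue
        if marker in (0x00, 0x01, 0xD8) or 0xD0 <= marker <= 0xD7:
            pos += 2  # stuffed 0xFF, TEM, SOI or restart marker: no payload
            continue
        if marker == 0xD9:
            break  # EOI: end of image
        if pos + 3 >= n:
            break  # truncated file
        length = int.from_bytes(data[pos + 2:pos + 4], "big")
        if marker == 0xC2:
            progressive = True  # SOF2: progressive, Huffman-coded frame
        elif marker == 0xDA:
            scans += 1  # SOS: one scan layer starts here
        # Skip the whole segment so embedded data (for example a thumbnail
        # inside an APPn segment) is never mistaken for a real marker.
        pos += 2 + length
    return progressive, scans
```

Usage would be something like `jpeg_scan_info(open("photo.jpg", "rb").read())`, where `photo.jpg` is a placeholder file name.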
This is the default, how this is supposed to work.
And there is code for that hidden inside cjpeg, inside its switches, on how it's going to do this.
This is called the scan file.
It is aptly hidden under the section called "For Wizards."
So I didn't bring my Wizards hat, but I'm kind of proud that I touched this, because not many people have touched this kind of stuff in the last couple of years.
So I'm really happy that I got to play with it.
And this is what the scan file is supposed to do.
JPEG encodes its image information in a brightness channel and two color channels, that is, red and blue.
So that's YCbCr.
The first, the yellow circles, tell the JPEG encoder which color information channel to send from.
So 0 is brightness and 1 is blue channel and 2 is red channel.
And the things I circled in green tell the JPEG encoder which pixels from each 8 x 8 block to send.
Because JPEG iterates over an image by 8 x 8 blocks.
So that means we have 64 pixels to send information on.
And JPEG is doing that, if you don't tell it to do differently, in zig zag order.
So not [SOUND EFFECT] like sequential but more like this.
That's the zig zag order.
So 0 to 63 are the positions of the pixels we want to send information from.
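The zig zag traversal of an 8 x 8 block can be written down in a few lines. This is a minimal Python sketch; the function name `zigzag_order` is ours. It returns, for each zig zag position 0 to 63, the row-major index (row * 8 + column) of the block entry that sits at that position.

```python
from typing import List


def zigzag_order(n: int = 8) -> List[int]:
    """Row-major indices of an n x n block, listed in zig zag order."""
    order = []
    # Walk the anti-diagonals (row + col == s), alternating direction.
    for s in range(2 * n - 1):
        rows = range(max(0, s - n + 1), min(s, n - 1) + 1)
        if s % 2 == 0:
            rows = reversed(rows)  # even diagonals run from bottom-left up
        for r in rows:
            order.append(r * n + (s - r))
    return order


order = zigzag_order()
print(order[:10])  # [0, 1, 8, 16, 9, 2, 3, 10, 17, 24]
```

Position 0 is the top-left corner of the block, and position 63 is the bottom-right corner.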
So this is how the 10 scan layers get created, by this command.
This is what the CJPEG does natively if you don't tell it to do anything else.
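To make the scan file syntax concrete, here is a minimal Python sketch of a reader for progressive scan entries of the form `components: first-last, ah, al;`. The two sample entries are illustrative and are not the contents of the slide. The two trailing numbers are successive approximation bit positions, which the talk does not go into. The names `Scan` and `parse_scan_script` are ours, and entries without a colon are not handled.

```python
import re
from typing import List, NamedTuple, Tuple


class Scan(NamedTuple):
    components: Tuple[int, ...]  # 0 = brightness, 1 = blue, 2 = red channel
    first: int                   # first zig zag position sent (0..63)
    last: int                    # last zig zag position sent (0..63)
    ah: int                      # successive approximation, high bit
    al: int                      # successive approximation, low bit


def parse_scan_script(text: str) -> List[Scan]:
    """Parse progressive scan entries; '#' starts a comment."""
    body = "\n".join(line.split("#", 1)[0] for line in text.splitlines())
    scans = []
    for entry in body.split(";"):
        if not entry.strip():
            continue
        comps, colon, params = entry.partition(":")
        numbers = [int(x) for x in re.findall(r"\d+", params)]
        if not colon or len(numbers) != 4:
            raise ValueError(f"unsupported scan entry: {entry.strip()!r}")
        components = tuple(int(x) for x in re.findall(r"\d+", comps))
        scans.append(Scan(components, *numbers))
    return scans


# Illustrative entries only, not the encoder's default script.
sample = """
# position 0 for all three channels
0,1,2: 0-0, 0, 0;
# positions 1 to 5 of the brightness channel
0: 1-5, 0, 0;
"""
for scan in parse_scan_script(sample):
    print(scan)
```

The numbers before the colon correspond to the yellow circles on the slide, and the range after it corresponds to the green ones.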
And this is what the first scan layer of that output would look like.
I showed you the preview before.
This is it in its full HD glory, completely pixellated, grayscale.
But the good news is, over HTTP2, pretty quickly to render, which means the browser can at least show something, and it can lay out the page.
So that's the benefit of progressive JPEGs here.
So the first scan layer, it ships fast, and it shows soon.
That's good for the browser.
That's really, really good.
And this is the crucial element here.
This initial scan layer shipping benefits immensely from HTTP2.
While we saw that with HTTP1 at most five images could get their first scan layer out in our test site, with HTTP2 we could get, theoretically, an infinite number of first scan layers out initially, which makes the browser far more efficient in laying out the page and showing something to the users, which is good.
So this is something to keep in mind.
So progressive JPEGs are another important part of that, hopefully snake-oil-free solution.
So you might ask yourself, OK, that's all nice and well, but how fast is that actually? How fast can I go with HTTP2 and progressive JPEGs? Because at the end of the day, we're trying to make the website faster for everybody, so we want to have some seconds-- milliseconds data to go on.
So this is WebPagetest results for the test site.
By the way, before anybody throws something heavy at me, this is all median of medians run over days.
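For readers unfamiliar with the term, a median of medians can be computed as below. This is a minimal Python sketch; the numbers are invented placeholders, not measurements from the talk, and the grouping by day is an assumption about how the runs were batched.

```python
from statistics import median
from typing import Dict, List


def median_of_medians(runs_by_day: Dict[str, List[float]]) -> float:
    """Take the median of each day's runs, then the median of those."""
    return median(median(runs) for runs in runs_by_day.values())


# Invented speed index values in milliseconds, for illustration only.
example_runs = {
    "day 1": [1200, 1100, 1300],
    "day 2": [1000, 1500, 1250],
    "day 3": [1400, 1350, 1380],
}
print(median_of_medians(example_runs))  # 1250
```

Taking the median twice keeps a single slow run, or a single bad day, from dominating the reported number.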
So this is quantifiable.
Thank you very much.
So, again, I've circled in the good parts and the bad parts.
HTTP1 basically loses out all across the board.
Load time is worse, start render is worse.
The speed index, which is the crucial element for my talk here, is worse.
And, of course, then the fully loaded is worse.
HTTP2 wins everywhere.
The important bit being that the speed index, as you can clearly see, is more than a second faster.
That's really, really, really cool.
So just shaving off an entire second of load time just because you've switched protocols.
So the work HTTP2 is doing for us is fantastic.
It's really, really good.
But it's also great for the user experience because, as you can see in this visual progression graph from WebPagetest, the progressive JPEGs ship much faster over HTTP2 because of their separate scan layers that are shipping.
And thus gives the users a faster visual impression of how the entire site is going to work.
HTTP2 is the red graph that clearly wins everything versus HTTP1.
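Speed index is the area above a visual progress curve like this one: for every interval, the share of the page that is still visually incomplete is multiplied by the interval's length. A minimal Python sketch, with invented frame data and a function name of our own:

```python
from typing import List, Tuple


def speed_index(frames: List[Tuple[int, int]]) -> float:
    """Area above the visual progress curve, in milliseconds.

    frames holds (time_ms, visually_complete_percent) samples sorted by
    time; the last sample is expected to be 100 percent complete.
    """
    total = 0.0
    for (t0, complete0), (t1, _) in zip(frames, frames[1:]):
        # Until the next frame arrives, the page stays at complete0 percent.
        total += (t1 - t0) * (100 - complete0) / 100
    return total


# Invented curves: both finish at 2000 ms, one shows a preview earlier.
late_paint = [(0, 0), (1000, 50), (2000, 100)]
early_preview = [(0, 0), (500, 80), (2000, 100)]
print(speed_index(late_paint), speed_index(early_preview))  # 1500.0 800.0
```

Both invented pages are complete at the same moment, but the one that shows most of its content early gets a much lower (better) speed index, which is why early scan layers matter for this metric.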
That's pretty, pretty neat.
I also brought you a video of that, and I'm hoping with my flimsy excuse of a presentation framework that this is going to play.
Oh, no, it's not.
Of course not.
[SOUND EFFECT] Ah, you cursor.
There you go.
On the green is HTTP2, obviously.
So this is with 0.1 seconds resolution in the video.
And as you can see, HTTP2 clearly wins out.
HTTP1 is still going while I'm talking.
So that's really, really cool.
HTTP2 renders the website much, much faster.
But now, you might say, well Tobias, that's all nice and well, but HTTP2 has so many changes.
Like Jan already mentioned, there are so many cool things happening under the hood, we can't really see the benefit of progressive JPEGs here.
Maybe sequential JPEGs render faster on HTTP2 too.
So, yeah, so how about we just check out how sequential JPEGs, these window blind JPEGs, are doing, because then we get an actual number as to how much benefit we have from progressive JPEGs.
So, again, I circled it.
The speed index is more than 350 milliseconds better for progressive JPEG, because the browser can show more stuff sooner.
The user gets an impression of lots of the images much, much sooner than with sequential baseline.
So that's really, really good.
Again, you can see it in the visual graph.
I mean, now the gap is a little closer, but you can still clearly see that progressive JPEG wins out against sequential JPEG even if we used HTTP2 for both instances.
And, again, for the effect, I also provided a video for it.
Here we go.
Sequential on the left, progressive on the right.
And here we go.
That's pretty neat.
You can see that in some parts, the competition is closer, like we saw with the two visual graphs being closer together.
But the user experience of the images showing up like this and lots of them showing up like this is much better than the window blind effect we have.
So that's all nice and well.
And that's something we should have been doing for ages.
Progressive JPEGs aren't very well used right now.
I think about 12% of the JPEGs I sampled currently have progressive encoding, which should be much, much better, especially with HTTP2.
But now you might ask yourself, well, that's all nice and well, but I could have done that without listening to you, Tobias.
And you're right, you could have.
But maybe I can show you something to make things even faster.
And, yeah, we can go even faster.
So to go even faster, we have to look at the guts of JPEG encoding, at the viscera.
And here's my version of progressive JPEG encoding viscera.
This is the scan script that I wrote instead of relying on the default level.
So what am I doing here? You can clearly see that I'm not shipping 10 scan levels.
I'm only shipping five.
In the first scan level, I ship information which is called the direct current, the first entry of a 64-entry matrix, all at once for all three channels.
That guarantees me that my initial preview has color, unlike the preview we just saw with grayscale.
In the second one, I ship half of all the brightness values.
In the third and fourth, I ship all the remaining color information.
In the fifth, I just provide the left-over brightness information that it probably hopefully won't need.
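A five-scan file along the lines just described could look like the output of the following Python sketch. This is a reconstruction under stated assumptions, not the script from the slide: the talk does not say at which zig zag position the brightness data is split (the default of 31 here is a guess), it does not say which color channel goes first, and successive approximation is left unused. The function names are ours.

```python
import re
from typing import Dict, List


def build_scan_script(luma_split: int = 31) -> str:
    """Build a five-scan progressive script; luma_split is an assumption."""
    if not 1 <= luma_split <= 62:
        raise ValueError("luma_split must be between 1 and 62")
    lines = [
        "# 1: position 0 (direct current) for all three channels",
        "0,1,2: 0-0, 0, 0;",
        "# 2: first part of the brightness channel",
        f"0: 1-{luma_split}, 0, 0;",
        "# 3 and 4: all remaining color information",
        "1: 1-63, 0, 0;",
        "2: 1-63, 0, 0;",
        "# 5: left-over brightness information",
        f"0: {luma_split + 1}-63, 0, 0;",
    ]
    return "\n".join(lines) + "\n"


def coverage(script: str) -> Dict[int, List[int]]:
    """List which zig zag positions each channel receives, in order."""
    seen: Dict[int, List[int]] = {0: [], 1: [], 2: []}
    for line in script.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        comps, params = line.rstrip(";").split(":")
        first, last, _ah, _al = (int(x) for x in re.findall(r"\d+", params))
        for comp in re.findall(r"\d+", comps):
            seen[int(comp)].extend(range(first, last + 1))
    return seen


script = build_scan_script()
# Every channel must receive each of its 64 positions exactly once.
assert all(sorted(v) == list(range(64)) for v in coverage(script).values())
print(script)
```

The text could be saved as a scan file and handed to an encoder that accepts one, for example through cjpeg's `-scans` switch.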
So this is what I came up with.
And here's what it looks like.
That's the initial scan.
Pretty good, eh? I mean, compared to the black and white blocky thing, that's pretty good, because that's just the initial scan of the whole thing.
And here is the kicker.
This is scan level two out of five instead of scan level 10.
And this is already a very, very close approximation of the final image.
Now, with the third and fourth scan layer, we just get more color information.
But let's go back to two.
You can still see that in scan level two, the image already looks OK.
So here's a little more color and then finally-- now pay close attention.
You see nothing.
The fifth scan layer really doesn't matter at all anymore.
After the second scan, we've provided so much good visual information already, that the image looks pretty OK to most of the users.
So that's really, really nice.
So let's see how much faster that technique renders in comparison to standard progressive JPEG.
I have aptly called that optimized progressive JPEG, so OPJPEG instead of PJPEG.
So here, things get a little more heterogeneous.
It's not such a clear-cut yes-and-no, black-and-white answer anymore.
You can clearly see that I am shipping a little more data, because now my scan layer is a little heavier at the beginning.
So the progressive JPEGs with 10 scan layers are a little smaller, therefore the visual complete is a little faster.
But the speed index is another 110 milliseconds faster, which averages out in median of medians to 6%, the famous 6% I mentioned in the beginning.
Woohoo! Here they are.
So with my technique, I can shave off another 6% of the speed index of the site.
Again, this is quantifiable.
And here's the visual graph for that.
And again, you can clearly see it's not as black and white as it was before.
But we see after the initial hit that we take, because my initial scan layer is now a little heavier, which is why the red bar is a little to the back, it overtakes progressive JPEG visual progression after about 1/3 of the image data.
And then it is superior all the way out.
So with this technique of manipulating the scan layer creation process, we can successfully improve the speed index by another whopping 6% on top of everything we've known before.
So this is something I'm really, really proud of.
And this is something your users will really, really be happy about, because they can see their beloved cat images much, much sooner in better resolution.
So time for takeaways.
Ingredient number one, please, please start using HTTP2.
It just works.
It's out of beta at Akamai.
It's free for everybody to use, thanks to Let's Encrypt.
It's just a brilliant piece of technology.
There's really no reason why you shouldn't be using HTTP2.
Second, please, please, please again start using progressive JPEGs people.
I mean, it's really easy.
Every JPEG encoder can do it now.
By the way, the best JPEG encoder is MozJPEG.
So maybe you want to switch to that.
But every JPEG encoder can do progressive JPEG encoding now.
And when they can do it and, like I said, most of them can, you can supply them with a separate scan.txt file containing your own wizard hackery optimizing the scan level creation process, which means you need only 15% of the image data to start rendering something meaningful to your users.
And it will improve the speed index of your sites by 57%, with 6% of that just being because you optimized the scan layers.
So that's it.
Thank you very much.