Towards Faster, Safer Websites by Yan Zhu
This is the first talk in a set of three talks on Technical Performance delivered on April 1, 2016 at Fronteers Spring Conference in Amsterdam
Everyone in 2016 knows that websites should use HTTPS. However, there is a common misconception that TLS and other security measures slow down both web developers and page load times. This talk will show you some easy tricks to make your site more secure without sacrificing performance.
And our first speaker, over from the States, is Yan Zhu, who works at Brave, which is a company that builds a browser which is very much optimized for performance and for security.
And Yan is really prolific.
A member of the Electronic Frontier Foundation, worked on SecureDrop, Let's Encrypt, bringing digital certificates to the masses, former member of the W3C TAG, just someone who's very, very busy and prolific.
So here today to talk about towards faster, safer websites, please welcome Yan Zhu everyone.
So I have to unlock like four different things before my presentation starts.
So hopefully less than half of this presentation will be me typing in various passwords.
Sorry about that.
I have to type in-- OK, great.
Now we can finally start, hopefully.
So my presentation is called How to Make Websites Slow and Unsafe.
It's April Fools'.
Happy April Fools', everyone.
Yeah, do they have that in the Netherlands? Probably had it before the US, actually.
So is the web fast, yet? Who thinks it's fast? Who doesn't? Well, that's kind of an open question.
And that's not actually one I'm going to address in this talk.
But let's actually just look at some data about how fast the web is.
This, again, is from the HTTP Archive, which Estelle pointed out earlier.
So from 2014 to 2016, you see that the size of requests has gone up pretty steadily, especially CSS.
Like people are using bigger and bigger CSS files.
The number of TCP connections per site has actually remained around the same.
Sites with Flash have gone down, which is great from a security perspective.
And sites with HTTPS have gone up from 9% to 27%.
That's a factor of three.
So we should be very happy about that.
So is TLS fast yet? That's a really great question because even three years ago, or even one year ago, a lot of people said, well, we can't deploy HTTPS.
It's just too slow.
We can't do it.
The performance isn't good enough.
But I think they're wrong by now.
As an example, even Netflix now is moving to HTTPS.
They just announced this about a year ago.
Netflix, I don't know if it's that big in the Netherlands, but in the US, it accounts for the plurality of internet traffic.
So in 2015, you could say you would Netflix and chill.
In 2016, you can Netflix over HTTPS and chill.
Just to illustrate how dramatic this will be, on the before side, 65% of web traffic in North America at peak hours was unencrypted.
This is data from an ISP called Sandvine.
And in 2016, the red and green parts have switched, so instead of 65% unencrypted, it's about 65% encrypted traffic.
And that's solely due to Netflix.
If you need more convincing that TLS is fast now, there is a great site by Ilya Grigorik from Google called istlsfastyet.com.
And basically he makes the point that, yes, TLS is fast because we have good CPUs, better than we did in the early '90s, at least.
And we have all these features that can speed up HTTPS deployments.
I'll go into two of those that I think are really interesting.
So how many people here are familiar with the TLS handshake? Cool, about half the room.
So just to reiterate-- your client does a SYN-ACK with the server.
And then it says, hello, I would like to start TLS.
Server says, OK, I support TLS.
Here's my cert.
Client says, OK, here's the key we're going to use.
It's a little more complicated if you use forward secrecy, but the point is, there's three round trips that you need before you can even start sending encrypted data.
And those three round trips are what people usually think of when they say TLS is expensive or CPU intensive.
But note that if you use session resumption, which is a feature where the server stores data from the last connection the client made so it can resume using the keys from the last session, then you can actually get this down to two round trips.
So if you want to learn more about this, just google session resumption.
But most servers will do this by default.
And you can say, like, rotate session tickets every 5 minutes or something.
And so that actually makes these handshakes less expensive.
In TLS 1.3, there is actually going to be another optimization to get down to zero round trips, although I think SYN-ACK might still be there.
But anyway, there's ways you can make there be less round trips.
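To put rough numbers on those round trips, here is a minimal sketch that multiplies a round-trip time by the counts mentioned in the talk. The 100 ms round-trip time is a made-up figure, and the TLS 1.3 count of one assumes the TCP handshake still costs a round trip, which the talk only guesses at.

```python
# Round-trip counts before the first encrypted byte, as described in the talk.
FULL_HANDSHAKE_ROUND_TRIPS = 3      # TCP handshake plus a full TLS handshake
RESUMED_HANDSHAKE_ROUND_TRIPS = 2   # with session resumption
# Assumption: zero TLS round trips, but the TCP handshake still costs one.
TLS13_ZERO_RTT_ROUND_TRIPS = 1


def setup_delay_ms(rtt_ms, round_trips):
    """Time spent on connection setup, ignoring CPU and bandwidth costs."""
    return rtt_ms * round_trips


if __name__ == "__main__":
    rtt_ms = 100  # hypothetical round-trip time between client and server
    for label, trips in [
        ("full handshake", FULL_HANDSHAKE_ROUND_TRIPS),
        ("session resumption", RESUMED_HANDSHAKE_ROUND_TRIPS),
        ("TLS 1.3 zero round trip (assumed)", TLS13_ZERO_RTT_ROUND_TRIPS),
    ]:
        print(label, setup_delay_ms(rtt_ms, trips), "ms")
```

The model is deliberately crude: it only shows why cutting round trips matters more as latency grows.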
Another TLS optimization is HTTP/2.
So you might say, wait, how is that TLS?
This is HTTP/2.
It's just like HTTP/1, the next version of it-- 2.
And what is HTTP/2, anyway? Well, so HTTP/1 was created around 1995, back when sites looked like this.
You would have this header and you would have content.
And that would be great.
You could be accessible.
It was nice, it was fast.
And now sites look more like this.
You have an ad, another ad, a content somewhere like a Buzzfeed or iframe, another ad, some tracking pixels, some Flash ads, et cetera.
So sites are not this anymore.
They're slow and they make a lot of requests.
So why would we even use the same protocol?
It's actually kind of amazing we're still using HTTP/1 in a world where websites are so different.
And so that's part of why HTTP/2 exists.
I think of it as HTTP/1 on Adderall.
So it uses binary encoding instead of text, and header compression.
It has multiple performance improvements.
But the one I want to point out is that it makes multiple requests per TCP connection.
The next speaker will get more into this.
But essentially, instead of making a request and waiting for a response and then making another request, the browser can just make a request and then use that same TCP connection and then get the responses back in this multiplexed way, which is quite nice.
And so this largely gets rid of weird hacks that people have to do like inlining and spriting and concatenating various files so that you can minimize the number of TCP connections.
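The difference between request-then-wait and multiplexing can be sketched as a toy latency model. It assumes a single connection, no pipelining, and ignores bandwidth and server processing time; the function names and the numbers are illustrative only.

```python
def sequential_ms(n_requests, rtt_ms):
    """One connection, each request waits for the previous response."""
    return n_requests * rtt_ms


def multiplexed_ms(n_requests, rtt_ms):
    """All requests are sent at once and share one round trip."""
    return rtt_ms if n_requests > 0 else 0


if __name__ == "__main__":
    # Hypothetical page with 30 small resources and a 100 ms round trip.
    print("sequential:", sequential_ms(30, 100), "ms")
    print("multiplexed:", multiplexed_ms(30, 100), "ms")
```

Real browsers soften the sequential case by opening several connections per host, which is exactly the cost that inlining, spriting and concatenation try to avoid.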
Does HTTP/2 require encryption?
So in the standardization process, this was a huge question.
Should HTTP/2 be available only over HTTPS? And the answer is no according to the spec.
But in practice, actually, yes, because there's two ways you can upgrade to HTTP/2.
Method 1 is the client just sends an upgrade header and the server says, OK, that's cool, let's do HTTP/2.
Method 2 is that the client, in the TLS hello from the TLS handshake, says, I support HTTP/2.
And then the server upgrades.
So method 1 is great if you want to support clients like cURL.
But the major browsers like Chrome and Firefox, and I think now Edge, have announced that they'll only support method 2 for HTTP/2.
So if you want this nice multiplexing and these nice performance improvements, you have to deploy HTTPS, unless you only want to talk to people using cURL, which you might.
And so that's what it looks like in a Wireshark dump.
My browser, Brave, connected to Akamai.
And if you look at the client hello, you see this ALPN Next protocol field, and that says H2.
And that's me saying I support HTTP/2, please upgrade me.
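For readers who want to see what sits behind that ALPN field, the sketch below encodes a protocol list the way the ALPN extension lays it out (a two-byte total length, then each name with a one-byte length prefix) and sets up a Python client context that offers `h2`. The helper names are hypothetical; `ssl.SSLContext.set_alpn_protocols` is the standard-library call.

```python
import ssl


def encode_alpn_protocol_list(protocols):
    """Encode protocol names as an ALPN protocol name list.

    Layout: 2-byte big-endian total length, then for each protocol
    a 1-byte length followed by the protocol name bytes.
    """
    body = b""
    for name in protocols:
        raw = name.encode("ascii")
        if not 1 <= len(raw) <= 255:
            raise ValueError("protocol name must be 1 to 255 bytes")
        body += bytes([len(raw)]) + raw
    return len(body).to_bytes(2, "big") + body


def make_h2_client_context():
    """Client TLS context that advertises HTTP/2, then HTTP/1.1, via ALPN."""
    ctx = ssl.create_default_context()
    ctx.set_alpn_protocols(["h2", "http/1.1"])
    return ctx


if __name__ == "__main__":
    print(encode_alpn_protocol_list(["h2", "http/1.1"]))
```

After a real connection is wrapped with this context, `selected_alpn_protocol()` on the resulting socket reports which protocol the server picked.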
But you might say, well, that's great, but HTTPS is really annoying to set up and maintain.
And you would be pretty correct, at least in 2015, you would have.
Because to get a certificate, you have to do this tedious-- so this is from DreamHost, and it's this long 12-step process.
And at the end, there is a red warning bar that says, don't accidentally copy and paste your private key into our website, et cetera.
And let's say you got through this.
Usually this takes people an hour or two when they do it for the first time.
But say you get through that and you have a certificate.
Now you want to set up HTTPS.
Then you might find that you don't know how to pick the correct encryption algorithms.
So RC4 was a popular cipher that people used up until 2014.
But then as cryptanalytic attacks against it got better and better, experts started saying, actually don't use RC4 anymore.
So unless you're keeping up to date with these latest encryption attacks, then you might not realize that this is now a vulnerable algorithm.
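One way to catch this kind of drift is to audit the cipher list a TLS configuration actually enables. The sketch below does that for a Python `ssl` context; the marker list is illustrative and far from exhaustive, and the helper names are hypothetical.

```python
import ssl

# Illustrative substrings of cipher suite names considered weak.
# This is an assumption for the example, not a complete or authoritative list.
WEAK_MARKERS = ("RC4", "DES", "NULL", "MD5")


def find_weak_ciphers(cipher_names):
    """Return the names that contain any of the weak markers."""
    return [
        name for name in cipher_names
        if any(marker in name.upper() for marker in WEAK_MARKERS)
    ]


def audit_context(ctx):
    """List weak-looking ciphers enabled on an ssl.SSLContext."""
    names = [cipher["name"] for cipher in ctx.get_ciphers()]
    return find_weak_ciphers(names)


if __name__ == "__main__":
    print(audit_context(ssl.create_default_context()))
```

The same idea applies to a web server: list what is enabled, compare it against current guidance, and repeat whenever that guidance changes.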
So let's say you get past that and now your TLS is correctly configured.
You go to your site and you say, https://lenovo.com.
But then your site is broken because of something called mixed content blocking.
Essentially, if you switched your top level site to HTTPS but you're including resources from HTTP origins, the browser says, no, we don't want that because that's not secure, we're just going to block it, actually.
And it's a huge mess.
So I have good news, which is that when I was at the W3C, we actually wrote a spec to solve this exact problem.
It's called Upgrade Insecure Requests.
So essentially, if you find yourself in this situation, which many of you might, you can just send an HTTP header that says Upgrade Insecure Requests, and modern browsers will treat the page as if the subresources are HTTPS.
I actually had to use this a few weeks ago on my WordPress blog because I updated WordPress and somehow it downgraded all my HTTPS pictures to HTTP.
So I was lazy and instead of manually fixing all the HTTP links, I just said Upgrade Insecure Requests, and then voila, Chrome upgraded everything and my blog works again, which is great.
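In practice the header is sent as `Content-Security-Policy: upgrade-insecure-requests`. A minimal sketch of adding it from a Python WSGI application follows; the middleware name is hypothetical, and most sites would set the header in their web server configuration instead.

```python
CSP_HEADER = ("Content-Security-Policy", "upgrade-insecure-requests")


def with_upgrade_insecure_requests(app):
    """Wrap a WSGI app so every response carries the upgrade directive."""
    def wrapped(environ, start_response):
        def patched_start_response(status, headers, exc_info=None):
            # Append rather than replace, so any existing policy is kept.
            new_headers = list(headers) + [CSP_HEADER]
            return start_response(status, new_headers, exc_info)
        return app(environ, patched_start_response)
    return wrapped


def demo_app(environ, start_response):
    """Tiny page that still references an image over plain HTTP."""
    start_response("200 OK", [("Content-Type", "text/html")])
    return [b'<img src="http://example.com/cat.png">']


application = with_upgrade_insecure_requests(demo_app)
```

With the header present, a supporting browser requests the image over HTTPS instead of blocking it as mixed content. The image still has to be available over HTTPS for that to work.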
But TLS, arguably, is still kind of difficult.
So at EFF, the Electronic Frontier Foundation, we said, what if we started a certificate authority?
Sounds like a good idea, right?
Sounds kind of hard.
And what if we used our certificate authority to give everyone free certificates in five seconds or less?
And what if we also gave people packages so that they could automatically deploy TLS in an easier way?
And so that's what Let's Encrypt is.
How many people here have heard of it? Yeah, well, that's almost everyone, so good.
Let's Encrypt is a new CA.
It was created by EFF, Mozilla, University of Michigan.
Thank you very much to Cisco and Akamai for being top level sponsors.
We actually did not create a new root CA.
We're currently an intermediate CA that's been signed by a CA named IdenTrust.
So thanks to them for the cross signature.
That means essentially browsers already trust Let's Encrypt.
We are going to get into the root certificate programs for Mozilla Firefox, Chrome, and IE.
But that takes about a year.
So in the mean time, we're an intermediate CA.
And a new nonprofit called the ISRG is managing all of this work.
Thank you, again, to our sponsors.
So our current status: we entered private beta last fall.
We are now in public beta.
In our first eight hours, we issued about 10,000 certificates.
That's one certificate every three seconds.
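The "every three seconds" figure is a rounding of simple arithmetic on the numbers just quoted, as this short check shows (the function name is hypothetical):

```python
def seconds_per_certificate(hours, certificates):
    """Average time between issuances over the given period."""
    return hours * 3600 / certificates


if __name__ == "__main__":
    # About 10,000 certificates in the first eight hours.
    print(seconds_per_certificate(8, 10_000))
```

That works out to about 2.88 seconds per certificate, which rounds to three.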
As of yesterday, we had issued about 1.4 million certificates, which makes us by volume one of the largest CAs, already.
And that's our activity graph.
The top line is the number of total issued certificates over time.
So here's some cool data.
What's the most popular TLD that's been using Let's Encrypt? Well, it's actually dot-com and other, but the biggest country TLD is France.
So thank you, France, followed by Germany.
And like I said earlier, we have these different clients that are using Let's Encrypt.
So who are these clients? Well, there's various types of servers that could be using HTTPS.
So I call this the client layer cake.
I last showed this slide around Halloween, so it's Halloween themed.
But we can pretend it's Easter themed now.
So at the top is people like Yahoo and Google and Facebook.
They basically have their own fancy TLS infrastructure.
Some of them have their own server software that no one else uses in the open source world.
Right below that is sites that run multiple servers that are load balanced.
A lot of startups do that and a lot of smaller companies.
And then there's people like me who use Digital Ocean and AWS.
But basically we have a single server that's self-hosted or managed-hosted.
And then at the bottom layer, there is this large number of people who are just using DreamHost or WordPress where they don't have a terminal.
They just have a web interface that they go on and then do stuff.
So we want to eventually cover at least the bottom three layers.
But right now, Let's Encrypt is mostly good for the middle two.
The bottom one will require partnering with DreamHost and WordPress, et cetera.
So here is a nice pie chart of some of our clients.
I won't get into it very much, except to note that the majority is using our default client, which we ship as a Let's Encrypt package.
And it's written in Python.
This one is kind of harder to read, but it's the client operating systems.
One thing to note is that while a lot of people are using the latest Debian or Ubuntu, there's people using Debian 7 and Ubuntu 12 and stuff.
And there's a long tail of old operating systems and people who will never update.
So it's an open question of how to keep those people secure without holding back the entire internet.
So if anyone is interested in building their own Let's Encrypt client, this is just a quick picture of how it works.
So basically, the client has to contact the Let's Encrypt CA and say, I want to prove to you that this domain is authentic so you can issue a certificate to it.
So there's various types of challenges that the client can do to prove ownership of the domain.
So get challenges, perform challenges, and then cleanup challenges.
Man, this is hard to read.
Not going to go into detail, but once the client proves ownership of the domain and gets a certificate, it has to do a lot of work to deploy the certificate and automatically configure the server to use HTTPS.
So this is the interface where it gets the cert, installs it, and does things.
It has features where if this somehow messes up your server configurations so your site no longer works, you can say revert and roll back, et cetera.
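The get, perform and clean up steps, plus the install and revert behaviour, can be sketched with an in-memory toy. Everything here is hypothetical: the class names, the challenge path and the "certificate" string are stand-ins, and this is not the real ACME protocol or the actual Let's Encrypt client code.

```python
import secrets


class ToyCA:
    """Hypothetical CA that validates a domain with a token challenge."""

    def __init__(self):
        self._pending = {}

    def get_challenge(self, domain):
        token = secrets.token_urlsafe(16)
        self._pending[domain] = token
        return token

    def validate(self, domain, fetch):
        """fetch(domain, path) models the CA requesting a URL on the domain."""
        token = self._pending.pop(domain, None)
        if token is not None and fetch(domain, challenge_path(token)) == token:
            return "CERT-FOR-" + domain  # placeholder, not a real certificate
        return None


class ToyWebServer:
    """Hypothetical server with served files and a revertible config."""

    def __init__(self):
        self.files = {}
        self.config = {"https": False, "cert": None}
        self._backup = None

    def fetch(self, domain, path):
        return self.files.get(path)

    def install(self, cert):
        self._backup = dict(self.config)  # remember the old config
        self.config = {"https": True, "cert": cert}

    def rollback(self):
        if self._backup is not None:
            self.config, self._backup = self._backup, None


def challenge_path(token):
    # Made-up path for the example.
    return "/.well-known/challenge/" + token


def obtain_and_install(ca, server, domain):
    token = ca.get_challenge(domain)               # 1. get challenge
    server.files[challenge_path(token)] = token    # 2. perform challenge
    try:
        cert = ca.validate(domain, server.fetch)
    finally:
        server.files.pop(challenge_path(token), None)  # 3. clean up
    if cert is None:
        raise RuntimeError("domain validation failed")
    server.install(cert)
    return cert
```

Calling `obtain_and_install(ToyCA(), ToyWebServer(), "example.com")` walks through all three challenge steps and then switches the toy server to HTTPS. `rollback()` restores the previous configuration.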
But anyway, so the result of all this is that to set up SSL with Let's Encrypt, you just git clone letsencrypt, or download it from your package manager, if it supports it.
And then you run letsencrypt-auto to install.
And to install TLS on Apache, all you have to do is say, letsencrypt --apache.
And it'll show you like a curses interface where you can pick the domains you want.
You can also specify command line flags to say, set up SSL for these domains, if you know which domains you want SSL for.
And renew or revoke will be just as easy.
You can just say letsencrypt renew, letsencrypt revoke.
The Python client does all of that work.
So that's it.
Make the internet great again.
You can help out on GitHub, Let's Encrypt.
letsencrypt.org is the site.
Hopefully, that was under 20 minutes.