Real-time recompilation of running JavaScript by Peter van der Zee

So yeah.

That is indeed me.

So as Paul said, the first time I was at Fronteers, I had actually learned about Fronteers only about a month before the conference.

This was the second conference.

And so I won a ticket from Microsoft.

Because otherwise I wouldn't even have made it.

I was just too late because the conference was sold out.

So I'm actually in this audience, and, well I'm like right there.

That's about the only picture I can find.

I'm pretty good at dodging pictures apparently.

And so the next year, I figured I'd step up.

And so I was part of the crew.

Again, one of the few pictures I can find of myself.

At least it's on stage.

The year after that as well.

And then yeah.

So I'm Peter.

As Paul introduced me.

Indeed, I curate JS1k. I also wrote a static analysis tool called Zeon.js, as well as a parser and a heatmap profiler.

At least, those are the recent projects.

I work for a company called Surfly.

We do remote desktop in the browser.

Basically just sandboxing the entire browser.

We use JavaScript, as well as HTML, CSS, and a back end.

So that's pretty cool, but I'm not going to demo this to you today.

Because today I'm going to talk about real time recompilation.

So let's start with a demonstration.

And demonstrations are always scary.

So this is a simple editor.

Let's see how it works.

It's a text editor I wired it up, if I edit it I can.

Let's just restart it for the sake of it.

You'll see that it flashes.

Right? It will restart the entire code.

And it will just well restart basically.

So you can see in the console it will log out whenever I restart it.

So for instance if I make this red thing blue, there are flashes, and it doesn't continue the code.

So I have a second version here, which is rewired to something that is actually capable of recompiling the running code.

So when I do this, you'll see that it re-compiles, but it doesn't actually restart.

Of course, no visual changes, so it's not that impressive.

So let's try with this black rectangle.

So if we make it say green, now it's green.

It didn't restart the app.

The rectangle kept on turning.

There's no state magic going on there in the background.

So don't worry about that.

And I can change it to anything I want.

If you've seen the talk by Bret Victor, Inventing on Principle, you'll see that this is very similar to what he did.

It kind of inspired me to do this in the first place.

I can make more rectangles.

The screen is a bit small here.

So they don't really fit.

But I can make them orange, green, whatever.

So the point is, it doesn't actually restart the code.

Everything kept spinning.

It didn't flash.

It didn't change much.

Of course, this can put your document in a state you don't really want.

So sometimes you do have to refresh.

But that's fine.

Just to wrap up.

It's recompilation of JavaScript on the fly.

You maintain access to closures.

So my approach is safe for that.

Same for arguments, same for everything.

It does lazy evaluation.

So it's fairly optimized.

And it's a generic approach, meaning that it should be capable of recompiling any JavaScript.

There might always be edge cases.

But as far as I know, there's only one big problem.

But I'll cover that later.

So here comes the dangerous part.

The main gist of this talk is to start with a simple line of code, a console.log of something, and end up with something that we can actually recompile from within the browser.

And then later I'll show you the abstraction.

All right.

So I did a simple HTML page.

Right.

There is nothing special going on here.

And this is the output in the browser.

So we start with a setTimeout.

And we do a console.log of "FronteersConf!".

And then let's just say every five milliseconds.

We'll see this only once, because it needs to be a setInterval.

Yeah.

All right.

So the first step is we need to create a function that returns another function.

Because we're going to be recompiling stuff, and this needs a function.

So let's create a function, getFunction, and it's going to return this function.

So still nothing special, just a refactoring of what was already there in the works.

So the next step is to actually start compiling the code.

So when I say compiling, what does it actually mean? Compiling itself means translating source code from one language to another.

Right.

In JavaScript we tend to call these transpilers, say, Python to JavaScript or whatever.

But in the end, they are really just compilers.

But when I say compiling right now, I actually mean preparing code to be executed.

And you do this by just simply creating a function.

So whenever you create a function, it's actually preparing code to be executed.

And under the hood, the browser will actually compile this body of code.

You can't really inspect it.

So you can't really see what the bytecode is.

But it's there, trust me.

So in JavaScript, how do we actually create functions dynamically? There is an easy one.

We can do eval.

But there is a semantically more appropriate one, and that's called Function.

So we have a built-in function called Function.

I know that's confusing.

And it accepts a string.

Actually it accepts multiple strings.

So what you have is this is the function.

Let's do this in a new tab.

So we have function, right? It's built in.

It's given to you by the language.

And it accepts parameters.

And then whatever body you have.

This is probably going to result in a syntax error.

So this returns a function that has parameters a and b and then whatever body I supply to it.
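As a rough illustration of the built-in Function constructor described here (a sketch, not the code typed on stage; the names `add`, `greet` and `failed` are made up for the example):

```javascript
// The leading strings are parameter names, the last string is the body.
var add = new Function('a', 'b', 'return a + b;');
console.log(add(2, 3)); // 5

// Calling Function without `new` behaves the same way.
var greet = Function('name', 'return "Hello, " + name + "!";');
console.log(greet('Fronteers')); // Hello, Fronteers!

// A body that does not parse throws a SyntaxError when the function is created.
var failed = false;
try {
  Function('a', 'b', 'return a +;');
} catch (e) {
  failed = e instanceof SyntaxError;
}
console.log(failed); // true
```

The body is only parsed when the function is created, which is the "preparing code to be executed" step mentioned above.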

So that's what we're going to use because that's what we need.

So we're going to change this into-- I've already changed it.

So this should do it, right? Because it's going to return a function and well, let's see what it does.

Did I do everything properly? No, there's a syntax error right there.

So let's remove that tab.

All right, so still works.

Still really just no magic yet.

So the next step is a bit of abstraction.

We're going to expose a global variable called, whatever, data.

It doesn't really matter what the name is as long as it's globally accessible.

I'm going to put this string right in there.

I'm making an array because we will want to support multiple functions.

And so now this will create a new function by taking the first element in the data array and generating a function based on whatever string is in there.

And it still works.

All right so the next step is to try and recompile.

Because I think we're at a stage where we might be able to recompile.

All right, I mean we have global access to this data structure.

There is a function that recompiles or compiles the function.

So I should be able to recompile.

So let's try that.

So data[0] equals a string that's going to invoke a console.log.

Oh, fail.

The reason it failed is because this only executes once.

So let's log out "compiling", and we'll see that it just logs out "compiling" once.

So if I do this, it obviously isn't going to change the output.

Because the function hasn't changed because it hasn't even been recompiled to change this.
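The failed attempt can be reproduced with a sketch like the following. It assumes the stored body returns a string instead of logging, so the result is easy to inspect, and it calls the function by hand where the talk uses an interval; `compileCount`, `tick`, `first` and `second` are illustrative names.

```javascript
// Globally reachable store of function bodies; the name `data` follows the talk.
// Assumption: the body returns a string instead of calling console.log.
var data = ['return "FronteersConf!";'];
var compileCount = 0;

function getFunction() {
  compileCount++;               // runs once, when getFunction is called
  return Function(data[0]);     // compile whatever string is stored right now
}

var tick = getFunction();       // in the talk this result is handed to the interval
var first = tick();

data[0] = 'return "changed";';  // edit the stored source afterwards
var second = tick();            // still the old function: nothing recompiled it

console.log(first, second, compileCount); // FronteersConf! FronteersConf! 1
```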

We can return a function.

So this is going to be a bit of inception.

But we're going to return a function that returns a function basically.

So now take note: we have a setInterval.

This interval is going to call getFunction to get the function to run on each interval.

Right? So it's going to repeatedly run the function that getFunction returns.

It will return this function.

This function in turn creates a new function, and we're going to invoke that function immediately.

So that way, it's continuously going to compile the function and then immediately invoke it and return whatever its result is.

I hope.

So now we can see that it is repetitively compiling.

So let's see if I can change it now.

Voila.

This is recompiling in its most basic form.
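A minimal sketch of this wrapper idea, under the same assumptions as before (bodies return strings, and the function is called by hand instead of from an interval; `compileCount`, `before` and `after` are illustrative names):

```javascript
var data = ['return "FronteersConf!";'];
var compileCount = 0;

function getFunction() {
  // The wrapper is what the interval would hold on to.
  return function () {
    compileCount++;
    // Compile the current string and invoke the result immediately.
    return Function(data[0])();
  };
}

var tick = getFunction();            // e.g. setInterval(tick, 500) in a browser
var before = tick();
data[0] = 'return "recompiled!";';   // change the source while the program "runs"
var after = tick();

console.log(before, after, compileCount); // FronteersConf! recompiled! 2
```

Because the wrapper itself never changes, whoever holds a reference to it picks up the new body on the next call.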

All right.

So we have running code, and we change the code without actually restarting the application.

At its most basic.

Let me remove that.

So now that we have this, how can we improve on it? Because this isn't generic yet.

This isn't the full solution.

You're going to run into certain problems.

The first one is going to be functions with arguments.

Because if we have an argument here, say, text, doesn't really matter.

Let's say, let's change the example.

So we have speak, equal to a function that accepts some text to speak.

And then a console.log of text.

All right, and then we have a setInterval again that basically calls a function that does speak.

So this is basically an example of a function that accepts an argument.

Yes.

And it will "Woof!", and this works, of course, because it's not part of the system yet.

But it won't work in our system right now, because the argument won't be passed on.

So that's at least one problem.

There are more problems right now.

For instance, closures.

Whenever you create a new function, you will lose whatever access that function had to the scopes.

And a closure is nothing else than a function that still has access to scopes that are not its own.

So if you return a new function, it will have access to different scopes, to different variables.

And since most programs, especially ones that could use this approach, will use some kind of closure, this will break.

So we need to support closures.

Apart from that there are function declarations.

Function declarations have the problem that they don't actually return a value.

So I can't just go and replace them with get function because that will just break the code.

Function hoisting will screw me over before I can actually get something done.
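To make the hoisting problem concrete, here is a small sketch (the names `speak`, `bark`, `early` and `earlyExpr` are only for illustration): a declaration is callable before the line it appears on, while a variable assigned from a call such as getFunction would not be.

```javascript
// A function declaration is hoisted together with its body...
var early = typeof speak;        // 'function', even though speak is declared below
function speak(text) { return text; }

// ...but for a variable, only the name is hoisted, not the assigned value.
var earlyExpr = typeof bark;     // 'undefined' at this point
var bark = function (text) { return text; };

console.log(early, earlyExpr);   // function undefined
```

So rewriting `function speak() {...}` into `var speak = getFunction(...)` changes when `speak` becomes usable, which is why declarations need special care.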

Named function expressions are slightly different in that the name of the function expression is actually special.

It's only accessible from within the function.

And when you're inside a function, you can't even write to it.

So even if I would be able to change it, I can't say get function and assign it to that variable.

Because it will be read only.
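The read-only name of a named function expression can be seen in a sketch like this (the name `selfRef` is hypothetical; the try/catch is there because the assignment is silently ignored in sloppy mode but throws a TypeError in strict mode):

```javascript
var result = (function selfRef() {
  try {
    selfRef = null;          // attempt to overwrite the function's own name
  } catch (e) {
    // strict mode: TypeError; sloppy mode: never reached
  }
  return typeof selfRef;     // the binding still points at the function
})();

// Outside the function, the name does not exist at all.
var visibleOutside = typeof selfRef;

console.log(result, visibleOutside); // function undefined
```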

And then, of course, there's performance.

Because if you do this for every call, every time, you are going to drag down the browser.

So let's start, let's do this step by step.

We'll start with function arguments.

So for function arguments, this is the example that I'm going to work with.

There's not a lot of screen space here.

So anyways, it's up there in case I need it.

Let's first do the translation back into this data structure.

So I'm going to just convert the functions one by one, basically top down, inner to outer, and put them in this data structure.

And I'm going to replace this with a get function.

I missed a step.

All right, and then the inner of this function.

I need to remove this, of course.

And then I need to add a function ID because I need to know which function to compile.

And I replace my zero with fid.

So I have now in my data structure two function bodies.

One does the console.log of text.

And the other one calls speak.

I have a speak, which gets assigned the console.log of text, and I get a function for the setInterval that actually calls speak.

All right, so let's see if this works.

Let's go to the right browser.

Uh oh.

No text.

So the reason is that there is, in fact, no text.

Because if I do Function on that, let's try that in the browser.

New tab.

Don't do this.

All right, so if I do this, basically that's what's happening right now, right? I get this function back.

And this text, there is no parameter anymore.

Because as I showed you before, Function accepts multiple parameters.

And its leading parameters are all argument names.

So if I just call function without any leading arguments, I'm not going to compile the function with arguments.

So I need to change that.

Let's try that.

We need to change the data structure a little bit to accommodate for the arguments.

I'm just going to put them in a single string which will be fine for this demo.

And then the body, of course.

I'm closing off the object literal and adding another one, because we have two functions.

But this function doesn't have any arguments.

So we can use the empty string.

And then body for the “Woof!”.

All right.

Now we need to change it here because we need to add these args.

So that will be the first parameter.

And then the second will be body.

Let me put them on separate lines so you can see them.

All right, and so then they will be immediately invoked.

So now I am creating a new function that accepts a parameter text.

So this should go fine now.

And I'm still calling the function with woof.

So let's see if that works.

No.

It doesn't do anything.

That shouldn't actually have happened.

Am I missing something? I'm in the wrong tab.

Thank you.

There you go.

Unfortunately, that didn't fix it though.

And the reason is that I said it's going to actually invoke it immediately.

Right? But it doesn't actually pass on the argument.

So this function is the function that is actually going to be called by set interval.

All right.

So I do get function.

And it's the same for speak, by the way.

So whenever I do speak, I actually call that function.

And that's going to return whatever this compiles and executes.

But this is going to accept, this function is going to accept all the arguments, the original arguments anyways.

So I can try to do like A,B, C, D here.

Of course, that's not going to work.

Because the function might have an arbitrary number of parameters, and we need to support it generically.

We need to support any number of arguments that are going to be passed on.

So what we can do is use the JavaScript language, because we can do apply.

And then we can pass on the context.

Now if you were at my workshop yesterday, you probably now know what this actually does.

And then I'm going to pass on the arguments.

And I'm going to just use the pretty straightforward Array.prototype.slice.call(arguments, 0).

It's fairly straightforward.

Come on.

I can also just do this, but you can't really pass on arguments itself, because it's not an actual array, and apply requires an actual array.

So that's why I have to actually slice it up.

And since arguments isn't an array, I can't just go do arguments.slice.

It just won't work.

So that's why I have to do this elaborate yada yada.

Luckily, I think they will fix this in the next version of JavaScript.

Anyway, that's not very useful right now.

So I'm now passing on the context which is not relevant to this demo, but it's going to be important for generic code.

As well as all the arguments as they were originally intended to be passed on.
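Put together, the argument-forwarding step might look like the sketch below. The `{ args, body }` shape and the `fid` parameter follow the talk; the second entry and the name `add` are made up to show a function with more than one parameter, and the bodies return values instead of logging so they can be checked.

```javascript
// Each entry keeps the parameter list (as one string) and the body.
var data = [
  { args: 'text', body: 'return "said: " + text;' },
  { args: 'a, b', body: 'return a + b;' }
];

function getFunction(fid) {
  return function () {
    var entry = data[fid];
    // Leading string: parameter names. Last string: the body.
    var compiled = Function(entry.args, entry.body);
    // Forward the context and every argument the wrapper received.
    return compiled.apply(this, Array.prototype.slice.call(arguments, 0));
  };
}

var speak = getFunction(0);
var add = getFunction(1);

console.log(speak('Woof!'));  // said: Woof!
console.log(add(2, 3));       // 5

data[0].body = 'return text.toUpperCase();';  // recompile with the same parameter
console.log(speak('Woof!'));  // WOOF!
```

In ES5 engines `apply` also accepts the `arguments` object directly, so the slice is a defensive habit rather than a strict requirement.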

So now it really should work, yeah.

So OK, arguments are working.

Can I recompile them? Let's see if I can.

So what I need to change is data[0].body now.

This is my text.

So I'm going to change that to a console.log that just uses this parameter.

I'll save it.

Yeah, it works.

OK.

So now I'm able to support arguments properly and I can recompile and do anything with those arguments.

That was, I wouldn't say simple, but I mean it's not solving my other problem, namely closures.

Because an argument is going to be freshly created whenever the function is executed, and it's going to be destroyed when it closes or when it ends.

So the next step is actually to start supporting closures.

And we can do that.

But first we need to create a new example.

Because this example doesn't use any closures.

And for the demo, we need one.

So we're going to do speak as a function.

And this function accepts the text again.

And then rather than starting from the set interval, this is going to start a set interval.

So I'm going to do a set interval.

And then inside of the function, I'm going to do console.log text.

And, of course, we need to start it by saying the Dutch word for woof.

So this argument or variable is going to be in a closure.

Because as I said before, this function will live on and will continue to have access to its outer scopes.

So as long as this function lives, and it will live because an interval maintains or retains it.

As long as the function lives, this variable will be accessible for it, and so there will be a closure.

It also means that if I just start replacing this function under the hood, I'll lose that connection.

So there will be a new function, and it won't have access to this variable.

So this is going to work because we haven't translated it yet.

But as soon as I start translating it, let me do that right now, I will lose that.

It will not work, definitely not work.

So there is the first-- of course, it didn't change.

But there is no longer a parameter to the first function.

Because that goes to the second function, actually the third function, I think.

All right.

So I do this.

And then I do get function zero.

And then I wrap the outer function.

So I'm going to wrap the get function call in another get function call.

This and that function will have a text argument.

And so this is the translation, and now I can do it with just replace this by get function.

And it's going to be get function one.

All right.

So now we still have the original code, sort of.

And as long as we don't recompile it, it will be fine, I think anyways.

No.

And the reason is I see.

I did do set interval.

I did do get function.

I pass on all the arguments.

I do pass on all the assignments here.

And this is get function zero.

I'm sorry.

That's supposed to happen because this is the closure that we created, right? Well, that sucks.

All right, let me just briefly try this again.

Text.

Set interval.

Function.

Oh, wait, was I in the wrong tab again? No.

Oh.

That would have been nice though.

That would have been a nice save.

And then console.log text.

And then speak text.

Oh, well, whatever.

It doesn't really matter.

This is the function.

So this works again.

Yes.

And then I'm translating it, so I create my outer again which is the same thing.

And then the function is replaced by a get function call.

Actually, I remember now.

Do I? Yeah, I think I do.

Whatever.

I hate it when I realize too late what the actual problem is.

OK.

I realize what the problem was.

So the reason is simply because this approach is not complete yet for closures.

So it won't even-- I know this is the wrong one.

Yeah.

OK.

Glitch in the matrix, unfortunately.

Right.

So the reason is that it doesn't actually have the link to the text anymore.

Because we are still compiling it through Function.

And well, that's just not going to work anymore.

Because we need a closure.

We need a closure over text.

And that's not going to give it to us.
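The lost closure can be demonstrated in isolation. In this sketch (function names are illustrative, and it assumes no global variable called `text` exists), the ordinary inner function keeps `text`, while the one built with Function only sees the global scope:

```javascript
// An ordinary inner function closes over `text`.
function speakWithClosure(text) {
  return function () { return text; };
}

// A function created by Function is compiled in the global scope,
// so the surrounding `text` parameter is invisible to it.
function speakWithFunction(text) {
  return Function('return typeof text === "undefined" ? "no text" : text;');
}

var kept = speakWithClosure('Woof!')();
var lost = speakWithFunction('Woof!')();

console.log(kept, '/', lost); // Woof! / no text
```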

So we need to add some magic.

The magic comes from eval.

So JavaScript has an eval function.

And eval basically allows you to send arbitrary JavaScript and execute it immediately.

So you can dynamically start running code.

It will drastically deoptimize your browser.

But for certain purposes like this one, it's going to be ideal.

JavaScript has like two different ways of doing this eval.

Right? You have direct eval and indirect eval.

And I normally don't really care about this, but in this specific case, I really do.

Because the main difference, for this context at least, is whether it is able to access variables in the scopes that surround it.

So if you do a direct eval, it's able to access variables that are in the outer scopes of the current scope.

Whereas if you do an indirect eval, you only have access to the global scope.

In JavaScript, the rules are fairly straightforward for direct eval.

Namely, the function that you call must be named eval.

So I can do var foo = eval and then foo(whatever).

And that works.

But it won't be a direct eval.

No, I won't even bother trying to prove that.

However, if I do something like this, I have an argument named eval.

And then I do eval code whatever.

And I pass on eval.

You'll notice that it's now a local variable called eval.

So it's not the same variable.

But since it still contains the original eval function, it will still be a direct eval.

So it will still have access to all its closures.

Now that's important for this case.

Any other way, like using Function, or doing a setInterval with a string, or doing window.eval, or the case above where I just assign it to a temporary variable, will all result in an indirect eval.

So those are all approaches that we can't use.
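The difference can be checked with a short sketch (variable names are illustrative, and it assumes there is no global called `secret`). The variant with a parameter that is itself named `eval` is left out because strict mode does not allow that name as a parameter.

```javascript
function demo() {
  var secret = 'local';

  var direct = eval('typeof secret');          // direct eval: sees `secret`
  var indirect = (0, eval)('typeof secret');   // indirect eval: global scope only

  var alias = eval;
  var viaAlias = alias('typeof secret');       // called under another name: indirect

  var viaFunction = Function('return typeof secret;')();  // also global scope only

  return [direct, indirect, viaAlias, viaFunction];
}

var results = demo();
console.log(results.join(' ')); // string undefined undefined undefined
```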

So back to the demo.

So I need to change this to eval, because right now I'm doing an indirect eval and I lose access to all my scopes.

And I need those because the original function had access to those scopes.

So let's change it.

Now I'll need to add the sort of boilerplate for function creation again.

I will add the arguments here.

No longer a comma.

Then close the parentheses, opening body.

And then we move that.

And then closing this function again, and since we're in eval, and you can't really start a statement with a function keyword, I have to wrap it in parentheses.

Because I want to return this function rather than a function declaration.
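Why the parentheses matter can be seen in a sketch like this one (names are illustrative): the parenthesised source evaluates to a function value, a named declaration evaluates to nothing, and an anonymous function at statement position does not parse at all.

```javascript
// Parenthesised: parsed as an expression, so eval returns the function.
var asExpression = eval('(function (a, b) { return a + b; })');

// A named declaration is a statement with no completion value: eval gives undefined.
var asDeclaration = eval('function named(a, b) { return a + b; }');

// An anonymous function at statement position is a syntax error.
var threw = false;
try {
  eval('function (a, b) { return a + b; }');
} catch (e) {
  threw = e instanceof SyntaxError;
}

console.log(typeof asExpression, asDeclaration, threw); // function undefined true
```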

All right, so now I should have a function that does an eval which returns a new function which has my arguments.

And this is why I just make one string because if I have multiple arguments, I can just do A, B, C.

So that would be fine.

So now the new function will have this body.

And then I'll do the same apply and context passing on and whatever that I did before because that's still relevant and valid.

So I'm doing eval now, right? I'm doing direct eval so my function should have access to all scopes.

Great.

This is still correct.

And that's still correct.

So let's see if that works.

Right tab.

No.

Wah, wah, wah.

Luckily I do know why.

The reason is that we're doing eval in this function.

That means it has access to only variables in these scopes.

This function scope, this function scope, and global scope.

It doesn't actually have access to this text argument because it's a completely different scope.

That's a bit tricky.

Because we need to solve that, of course.

Let me prove it first that we have access to this scope by saying var s is "foo".

And then logging out s.

All right, so in my data structure there is no variable s anywhere, right? So this should result in an error.

But it doesn't.

It's logging out "foo".

Because we're doing this direct eval that has access to this scope.

In fact, if I remove it, now it will throw "s is not defined" and completely crash the code.

So I need to do eval somewhere else.

But if I pass on eval from within my code, like if I do, for instance, something like this, I will have an indirect eval.

Or I can make it direct again, but it will still be local to the scopes that I'm currently calling it.

This is the big trick, because I can actually work around this by creating a compiler function in each function.

So the name itself, of course, is arbitrary.

It doesn't really matter as long as it's unique.

It doesn't really clash with the existing code, and you'll know what it is.

I'll create a function that returns eval of s, where s is what I'll call the parameter.

It's more legible if it's one line.

Maybe I can actually fit it in one line on screen.

Oh, I can.

That's good.

So this is my compiler function.

Right.

So every scope, every function I generate now will have a variable called $compiler, and it's able to compile code with direct eval in the scope that it originally occurred in.

Meaning, I can actually replace the function now.

So I need to do some more boilerplate, because I need to pass on this compiling function, compiler function, whatever you want to call it.

So get function will receive a compile function now which is equivalent to what I'm creating here.

And I need to pass it on here as well.

And, again, every function will have access to this variable name.

So I won't have to worry about it not existing.
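Since the live demo did not get this far, here is one way the per-scope compiler could be wired up. This is a sketch under assumptions, not the code from the speaker's repository: the data layout, the string-returning bodies and the manual calls in place of an interval are all illustrative.

```javascript
var data = [
  // speak(text): defines its own $compiler, then hands out the inner function.
  { args: 'text',
    body: 'var $compiler = function (s) { return eval(s); };' +
          'return getFunction(1, $compiler);' },
  // The inner function only works if it can still see `text`.
  { args: '', body: 'return text;' }
];

// One compiler per scope; this one belongs to the top-level scope.
var $compiler = function (s) { return eval(s); };

function getFunction(fid, compile) {
  return function () {
    var entry = data[fid];
    // Parenthesised so eval yields a function expression, not a declaration.
    var fresh = compile('(function (' + entry.args + ') {' + entry.body + '})');
    return fresh.apply(this, Array.prototype.slice.call(arguments, 0));
  };
}

var speak = getFunction(0, $compiler);
var woof = speak('Woof!');   // in the talk this would be given to the interval
var first = woof();

data[1].body = 'return text + " (recompiled)";';
var second = woof();         // new body, same closure over `text`

console.log(first, '/', second); // Woof! / Woof! (recompiled)
```

Each `$compiler` runs a direct eval inside the scope where it was created, so a freshly compiled inner function lands in the same scope chain as the one it replaces.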

Oh, it was right there.

Oh, I see.

No.

Well, you know, live coding, ain't it? Luckily this was like the end of the live coding part anyway.

You'll just have to take my word for it or just check out the Git repo.

And so let's continue with the slides.

Basically, what this approach does is wrap the original code in getFunction calls.

And as you saw in that part, it will cache the function to be invoked.

And then every time that you call the function, it will check whether there's a new version.

If there's a new version, it recompiles it, then runs that code.

Otherwise, it runs the compiled version that we already had.
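The caching step could be sketched as follows. The `version` field and the `update` helper are hypothetical; the talk only says that each call checks whether a new version exists.

```javascript
// Hypothetical layout: each entry carries a version number that edits bump.
var data = [{ args: 'text', body: 'return text;', version: 1 }];
var $compiler = function (s) { return eval(s); };
var compileCount = 0;

function getFunction(fid, compile) {
  var cached = null;
  var cachedVersion = -1;
  return function () {
    var entry = data[fid];
    if (cachedVersion !== entry.version) {
      // Only recompile when the stored source has changed.
      cached = compile('(function (' + entry.args + ') {' + entry.body + '})');
      cachedVersion = entry.version;
      compileCount++;
    }
    return cached.apply(this, Array.prototype.slice.call(arguments, 0));
  };
}

// Hypothetical helper an editor would call when the user changes a function.
function update(fid, body) {
  data[fid].body = body;
  data[fid].version++;
}

var speak = getFunction(0, $compiler);
speak('a');
speak('b');                        // second call reuses the cached function
var countBefore = compileCount;    // 1

update(0, 'return text + "!";');
var changed = speak('c');          // triggers exactly one recompile

console.log(countBefore, changed, compileCount); // 1 c! 2
```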

The approach is safe for closures.

It's safe for arguments.

It's safe for otherwise weird syntactical problems.

And well, I don't know, it works.

So there are two open issues.

One is minor: you have to insert a new variable.

I think technically it's possible to get around this problem.

But that would entail cluttering the scope with evals all over the place.

So I think it's fairly simple to introduce new variables that you can check that are not used already.

So that I think is a minor problem.

A big problem is inserting new functions because as long as your function order is the same, you can statically determine which function maps to the previous function.

And as soon as you introduce new functions, you will change this order.

And so you will have to figure out where this new function fits in, and which new functions match which old functions. You have a problem there.

And I think it's solvable.

But it will be computationally expensive as well as difficult to get right, because if you have a function that has a closure, you need to make sure that you recompile the proper function back into that function with a closure.

Because otherwise you might screw things up.

So inserting new functions is a big problem for this approach.

Or at least it is right now for me.

I don't have a solution for it.

And I haven't for a while.

So I'm not sure if I can solve this.

The other problem is this approach is just very hard to explain.

And not just like the talk I did today, but also conceptually, it's difficult for people to see the fact that, for instance, if you recompile, your static variables will not change.

So if you change anything in global scope, it will not be reflected in your compiled code.

If you change anything in a function that's not invoked again, your change will not be visible.

So while it looks like it's changing, for instance, you updated your speed parameter, it won't actually be live until the function is invoked again.

That's why I use the set interval.

Because that will repeatedly call the function, which will make the change apparent.

If I had just done console.log, you wouldn't have seen the changes.

It wouldn't have been as exciting as it was now.

Well, you know, console.log.

Anyways.

All the code, and this talk, and slides, and everything is on GitHub.

The slides aren't yet, but they will be soon.

You can check them out.

You can play with the code.

There's an abstraction library right there.

And well, you can knock yourself out there.

Yeah.

So that was my talk.

Join me over here, Peter.

Yeah.

First up, is a question from me.

Next time, do you think you could talk on a subject that's more complex maybe? I just felt like it was about here.

Was it? I was like, duh.

Yeah, oh, sorry.

I do my best , you know.

One of the questions that came up a little bit was around how we can take these techniques and apply them in the real world. So what are some of the use cases that live recompilation in JS unlocks?

Yeah, I have to concede there that there aren't that many.

Unfortunately, it's a nice gimmick, it's a nice hack.

It works.

It's a proof of concept.

Especially as long as those function replacements aren't fixed, you can't really use it.

Because as soon as you start typing a new function, it crashes, of course.

Apart from that, it will really just be usable for stuff like visual applications that continuously repaint the canvas.

Because if you repaint the canvas, you'll see the change.

If you have a game, you can see the change.

But if you just have say a web page with a callback, you will only see the change when you actually press a button.

Or maybe not even at all if it is like one-time initialization code or whatever.

So it's really just for certain tooling, and even then you have to know what you're doing, because it's very hard to relate to the user how you can and cannot use this approach.

I think Sebastian Golasch also pointed out one use case that makes sense, which is like JS Bin, sorry.

Exactly.

JS Bin, JSFiddle, those sorts of things would--

Yes, but even for those applications, it will be very difficult to relate to the user what they can and cannot do.

Even after like function is fixed, if you have a fixed variable in global scope, you can't change it this way.

So even for JS Bin or whatever users, it's difficult.

Yeah, so this is like a--

It would be like a mode that you turn on so that you actually know what's going on.

Yeah, so similarly there is similar live recompilation inside of Chrome and the dev tools--

Exactly.

But I think they hook into the debugger, right?

Yeah.

It's not the same technique as yours.

It actually goes straight to V8.

But it is cool.

But it's the same sort of instance where you have an object and you change like a property on the object, and that will not change.

Like it has to basically be at a method level.

Yeah, that's awesome.

Actually, I think the approach is even better than what I can do.

Simply because they can hook into the debugger, and they can immediately see which function belongs where, et cetera.

But it works really good.

I've used it before.

Is eval evil? Douglas would say yes.

I think as long as you know what you're using it for, it's fine.

But don't just go using it because you can.

Because you have to have a reason for it.

Is the function constructor any less evil?

It's the same as eval.

So why would it be less-- It's the same, right? It's the same.

It's an indirect eval.

Because I remember this time when jQuery swapped.

jQuery, oh, it's like parseJSON.

Inside parseJSON it was using eval, and then they switched to the function constructor.

And I was like, oh, I feel better about this now.

It just looks better semantically because it says function.

So it's not eval, right? Under the hood, really, it just is eval.

That's the same for setTimeout with a string and certain other ways of doing it.

OK, cool.

Thank you very much, Peter.

This was awesome.

Thank you.
