Using Google Caja to run user-supplied Javascript

It appears that the official examples use a caja.js file that just wraps an iframe to load a URL from a server hosting a Caja compilation service, which in turn gets its input from some URL. The relevant API for that is available here.
However, what I really want is to just safely (and repeatedly) run a user-supplied piece of Javascript, like so:
for (var i = 0; i < N; ++i) {
  var x = getUserResult(currentState);
  updateState(currentState, x);
}
Is there any way to do this directly? The code here has the compiler. Why can't I just use that to compile the code and then run that within an emulated context? Is it because the only way to get a safe context in a browser is an iframe? And, if so, is there any way I can use an iframe to directly run given source code, without having to fetch it from an external URL?

Caja needs an iframe no matter what. Both modes of execution require a fresh set of JavaScript globals (obtained by creating the frame) which can then be radically modified to enable safe execution.
Modern Caja (ES5 mode) does not require any server-side compilation step; provided the browser is compatible you can use Caja in the standard way and the server will never be contacted. To force this, specify es5Mode: true in the options to caja.initialize.
You can load guest code once and repeatedly execute it; just provide an API which lets the guest pass a function out when it's loaded, then call that function whenever you like.
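For example, a rough sketch against the caja.js client API (userSource, N, currentState and updateState are from your question; the taming calls follow the documented caja.tame/caja.markFunction pattern, though details can vary by Caja version):

var step; // will hold the function the guest passes out

caja.initialize({ es5Mode: true }); // no server round-trip
caja.load(undefined, undefined, function (frame) {
  frame.code('http://fake.url/', 'application/javascript', userSource)
       .api({
         // the guest calls setStep(fn) once while loading
         setStep: caja.tame(caja.markFunction(function (fn) { step = fn; }))
       })
       .run(function () {
         // now invoke the guest's function as often as you like
         for (var i = 0; i < N; ++i) {
           updateState(currentState, step(currentState));
         }
       });
});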
For your use case, it would also be possible to use SES, the modern safe-eval subsystem of Caja, without using Caja itself at all; this would allow you to skip having any iframes, but would require you to write your code in a SES-compatible way; that is,
refraining from modifying global objects such as Object.prototype, and
protecting all objects directly or indirectly exposed to the user-supplied code using Object.freeze().
If you're up for it, I do recommend using SES directly, as it removes a lot of indirections and total complexity, but it does require understanding the concepts to succeed at safety.
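A minimal sketch of that route, assuming Caja's initSES.js has been loaded so that the cajaVM global is available (entry points differ between SES versions):

// userSource must be an expression, e.g. '(function (state) { return state + 1; })'
var endowments = Object.freeze({
  log: Object.freeze(function (msg) { console.log('guest:', msg); })
});
var getUserResult = cajaVM.confine(userSource, endowments);

for (var i = 0; i < N; ++i) {
  updateState(currentState, getUserResult(currentState));
}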

Related

How do the Render Engine and JavaScript Engine Communicate in a browser?

I'm looking for a detailed answer on this.
What I already know
So I have some understanding of the call stack and callbacks, and that the browser adds functionality through Web APIs, which add callbacks through the event loop. I also read somewhere about the JS engine having an API.
What I want to understand
How are the Web APIs exposed to the JS engine? (If this is about the JS engine having an API, some description of how that API works would do.)
How can the behavior of the render engine be manipulated through JavaScript, like manipulating the DOM, CSSOM, etc.? (If I understood correctly, this is the equivalent of asking how Web APIs work.)
Thanks!
From a (C++ etc.) application development perspective, JavaScript engines are embeddable libraries; and a browser is one such embedder. Any library defines a public interface through which it can be used -- its Application Programming Interface (or API for short). There is no standard for what a JS engine's API should look like; each engine defines its own, and evolves it as necessary over time. V8's is here.
The core functionality of a JS engine's API is to allow the embedder to provide objects and functions to the JavaScript environment that are backed by the embedder's own C++ implementations. Essentially, this defines a mapping, sometimes also called "bindings". For example, the embedder can say "I want there to be a document object, and it should appear to have a property .location that's backed by my getter function DocumentLocationGetter() {...}, and it should (appear to) have a method .createElement() that's backed by my other function DocumentCreateElement(...) {...}", and so on.
And that's the answer to both of your questions: the browser exposes certain functions to JavaScript that can then be called from there. The browser decides what to do when such a function is called (e.g.: add or remove a DOM node, change a CSS property, store an event handler in some element's event handlers list, ...). Of course the browser/embedder can also call into the JS engine, for example when invoking an event handler, it can tell the engine "please execute function button1_clicked now".
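Seen from the script's side, both directions look like ordinary JavaScript (button1 here is just an assumed element):

// JS -> embedder: createElement is routed to the browser's C++ binding
var el = document.createElement('div');

// embedder -> JS: the browser itself invokes this function later,
// when the button is clicked
button1.onclick = function () { el.textContent = 'clicked'; };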
For more details, see e.g. v8.dev/docs/embed.

How can I sandbox code in an application with dynamically loaded, untrusted modules?

I'm making a game in Electron and I want it to support mods. Sometimes those mods will need to use custom logic, which leaves me with the issue of dynamically loading their code and I'm finding it hard to come up with a way to do that securely.
What I've considered
Ideally, I'd like to execute the mod scripts while passing just the few safe game objects they need as parameters, but that seems to be impossible (no matter the solution, that code can still access the global scope).
Most importantly, I have to prevent the untrusted code from accessing the preload global scope (with Node APIs), so require or anything else done in the preload is out the window.
Therefore that code has to be executed in the renderer.
My solution so far
I can either read the files in preload using fs or directly in the renderer using fetch. I have nodeIntegration set to false and contextIsolation set to true, and trusted code loaded by the preload script is selectively passed to the renderer through a contextBridge. The code which accesses Node APIs is properly encapsulated.
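A minimal sketch of that preload setup (readModSource is a made-up name; a real version should validate the mod name before touching the filesystem):

// preload.js
const { contextBridge } = require('electron');
const fs = require('fs');

contextBridge.exposeInMainWorld('gameApi', {
  // only a narrow, serializable surface crosses into the renderer;
  // fs and the rest of Node stay on this side
  readModSource: (name) => fs.readFileSync(`mods/${name}.js`, 'utf8')
});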
Unfortunately, that still leaves me with having to execute the unsafe code somehow, and I don't think there's any other way than to use eval or Function. Even though malicious code could not access Node APIs, it would still have full access to the renderer global scope, leaving the application vulnerable to, for example, a prototype pollution attack.
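For instance, even shadowing the obvious globals as parameters does not contain the code (modSource and safeApi are placeholders):

// shadow window/document inside the function body...
const run = new Function('api', 'window', 'document', modSource);
run(safeApi, undefined, undefined);

// ...but sloppy-mode code inside modSource can still recover the real
// global object:
//   var g = (function () { return this; })(); // g === window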
To sum up:
The safer place to execute untrusted code is clearly in the renderer
There is no alternative to using eval or Function
This leaves the renderer global scope vulnerable to attacks which I can try to mitigate but can never make it completely safe
My first question: are these assumptions true, or is there a better way to do it?
The risks and how to mitigate them
So the potentially malicious code has access to the renderer global scope. What's the risk?
Well, any sensitive user data will be safely stored in the preload; the same goes for access to the user's computer with Node APIs. The attacker can break the game (as in, the current 'session'), but I can catch any errors caused by that and reload the game with the malicious mod turned off. The global scope will only hold the necessary constructors and no actual instances of the game's classes. It seems somewhat safe; the worst thing that could happen is a reload of the game.
My second question: am I missing anything here regarding the risks?
My third question: are there any risks of using eval or Function that I'm not thinking of? I've sort of been bombarded with "eval bad" ever since I started getting into JS, and now I feel really dirty for even considering using it. To be exact, I'd probably be using new Function instead.
Thank you for reading this long thing!
There is no general solution for this, as this heavily depends on the structure of the project itself.
What you could try is to use espree to parse the unsafe code, and only execute it if there is no access to any global variable.
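A sketch of that check, pairing espree with eslint-scope for the scope analysis (the option values here are assumptions):

const espree = require('espree');
const eslintScope = require('eslint-scope');

function touchesGlobals(source) {
  const ast = espree.parse(source, { ecmaVersion: 2020, range: true });
  const scopes = eslintScope.analyze(ast, { ecmaVersion: 2020, sourceType: 'script' });
  // references left unresolved at the global scope point at globals
  return scopes.globalScope.through.length > 0;
}

// touchesGlobals('var x = 1; x++;')        -> false
// touchesGlobals('window.location = url;') -> true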
But that most likely will not prevent all attacks: you might not think of other attacks that are possible due to the way the program is structured, and require (or any other way to include/load other scripts) in that unsafe code could also open side channels that allow further attacks.
eval and new Function are not bad in general; at least not as bad as loading/including unsafe code in any other way. Many libraries use code evaluation for generated code, and that's the purpose of those functions. But they are often misused in situations where there is no need for them, and that is something that should be avoided.
The safest way is most likely to run the code in a Web Worker and define an API for the mods to communicate between the mod and the application. But that requires serializing and deserializing the data when passing it from the app to the mod and back, which can be expensive (this is what is done, e.g., with WebAssembly). So I would read up a bit on how communication is solved with WebAssembly.
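A sketch of that approach (handleModMessage and computeCustomLogic are placeholders):

// renderer side: spawn the mod in its own worker built from a Blob, so it
// sees neither Node APIs nor the renderer's global scope
function spawnMod(modSource) {
  const blob = new Blob([modSource], { type: 'application/javascript' });
  const worker = new Worker(URL.createObjectURL(blob));
  worker.onmessage = (event) => {
    // everything coming back is untrusted input: validate it
    handleModMessage(event.data);
  };
  return worker;
}

// inside the mod (worker) side:
self.onmessage = (event) => {
  self.postMessage(computeCustomLogic(event.data));
};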

Javascript: Where getter/setter values are stored? [duplicate]

I was thinking about this today and I realized I don't have a clear picture here.
Here are some statements I think to be true (please correct me if I'm wrong):
the DOM is a collection of interfaces specified by W3C.
when parsing HTML source code, the browser creates a DOM tree which has nodes that implement DOM interfaces.
the ECMAScript spec has no reference of browser host objects (DOM, BOM, HTML5 APIs etc.).
how the DOM is actually implemented depends on browser internals and is probably different among most of them.
modern JS interpreters use JIT to improve the code performance and translate it to bytecode
I am curious about what happens behind the scenes when I call document.getElementById('foo'). Does the call get delegated to browser native code by the interpreter or does the browser have JS implementations of all host objects? Do you know about any optimizations they do in regard to this?
I read this overview of browser internals but it didn't mention anything about this. I will look through the Chrome and FF source when I have time, but I thought about asking here first. :)
All of your bullet points are correct, except:
modern JS interpreters use JIT to improve the code performance and translate it to bytecode
should be "...and translate it to native code". SpiderMonkey (the JS engine in Firefox) worked as a bytecode interpreter for a long time before the current JS speed arms race.
On Mozilla's JS-to-DOM bridge:
The host objects are typically implemented in C++, though there is an experiment underway to implement DOM in JS. So when a web page calls document.getElementById('foo'), the actual work of retrieving the element by its ID is done in a C++ method, as hsivonen noted.
The specific way the underlying C++ implementation gets called depends on the API and has also changed over time (note that I'm not involved in the development, so I might be wrong about some details; here's a blog post by jst, who was actually involved in creating much of this code):
At the lowest level every JS engine provides APIs to define host objects. For example, the browser can call JS_DefineFunctions (as demonstrated in the SpiderMonkey User Guide) to let the engine know that whenever script calls a function with the specified name, a provided C callback should be called. Same for other aspects of the host objects (e.g. enumeration, property getters/setters, etc.)
For the core ECMAScript functionality and in some tricky DOM cases the JS engine/the browser uses these APIs directly to define host objects and their behaviors, but it requires a lot of common boilerplate code for e.g. checking parameter types, converting them to the appropriate C++ types, error handling etc.
For reasons I won't go into, let's say historically, Mozilla made heavy use of XPCOM for many of its objects, including much of the DOM. One feature of XPCOM is its binding to JS called XPConnect. Among other things, XPConnect can take an interface definition in IDL (such as nsIDOMDocument; or more precisely its compiled representation), expose an object with the specified properties to the script, and later, when a script calls getElementById, perform the necessary parameter checks/conversions and route the call directly to a C++ method (nsDocument::GetElementById(const nsAString& aId, nsIDOMElement** aReturn))
The way XPConnect worked was quite inefficient: it registered generic functions as callbacks to be executed when a script accesses a host object, and these generic functions figured out what they needed to do in every particular case dynamically. This post about quickstubs walks you through one example.
"Quick stubs" mentioned in the previous link is a way to optimize JS->C++ calls time by trading some code size for it: instead of always using generic C++ functions that know how to make any kind of call, the specialized code is automatically generated at the Firefox build time for a pre-defined list of "hot" calls.
Later on, the JIT (TraceMonkey at that time) was taught to generate code calling C++ methods as part of the native code generated for "hot" paths in JS. I'm not sure how the newer JITs (JägerMonkey) work in this regard.
With "paris bindings" the objects are exposed to webpage JS without any reliance on XPConnect, instead generating all the necessary glue JSClass code based on WebIDL (instead of XPCOM-era IDL). See also posts by developers who worked on this: jst and khuey. Also see How is the web-exposed DOM implemented?
I'm fuzzy on details of the three last points in particular, so take it with a grain of salt.
The most recent improvements are listed as dependencies of bug 622298, but I don't follow them closely.
JS calls to DOM methods like getElementById cause the JS engine to call into the C++ code that implements the DOM. For example, in Firefox, the call ends up in nsDocument::GetElementById(const nsAString& aId, nsIDOMElement** aReturn).
As you can see, Firefox maintains a hashtable that maps ids to elements in C++ as an optimization in this case, so it doesn't walk the whole DOM tree looking for the id.
The DOM is implemented as a language-independent library in pretty much all major browser implementations, which means it's in a different library from the JavaScript engine. For example, in IE the JS engine is implemented in jscript.dll while the DOM is implemented in mshtml.dll. Safari has Nitro (JS) and WebCore (DOM). Chrome has V8 (JS) and WebCore (DOM), and Firefox has SpiderMonkey/TraceMonkey (JS) and Gecko (DOM).
What this means is that any time your JS has to access the DOM, it has to reach over to the DOM library, which is inherently slow because of all the marshaling that has to take place. An analogy that has been used is two pieces of land connected by a toll bridge: any time you touch the DOM, you must cross over the bridge and back, paying a performance toll.
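A classic consequence, in the spirit of the book chapter cited below: hoist DOM reads out of loops so you cross the bridge as rarely as possible.

var items = document.getElementsByTagName('li'); // live collection
for (var i = 0, n = items.length; i < n; i++) {  // cache length: one crossing
  items[i].className = 'highlight';
}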
References
Video: Building High Performance Web Applications and Sites
Book: High Performance Javascript (Chapter 3 on the DOM)

intercepting javascript alert()..? is it acceptable?

I just found that we can intercept the native JavaScript alert() call and hook in user code before the actual execution. Check out the sample code:
function Test() {
  var alertHook = function (aa) {
    this.alert(aa);
  };
  this.alert("aa");
  this.alert = alertHook;
  alert("aa");
}
So every time I call alert("aa"), it is intercepted by my local alertHook function. But the implementation below, with a small change, does not work:
function Test() {
  var alertHook = function (aa) {
    alert(aa);
  };
  alert("aa");
  alert = alertHook; // throws Microsoft JScript runtime error: Object doesn't support this action
  alert("aa");
}
It throws "Microsoft JScript runtime error: Object doesn't support this action".
I don't know why this.alert = alertHook; lets me intercept the call, but alert = alertHook; does not.
So I assume one should use this to intercept any native JS method. Is that right?
And is that acceptable? Because this way I could completely replace any native JS call with my own methods.
UPDATE:
I asked "is that acceptable?" because how is it a good approach to have eval() and to let users replace native function calls?
And isn't it the responsibility of a language to protect developers from misleading features? Replacing native JS calls at the window level (or in a common framework JS file) would crash the whole system, wouldn't it?
I may be wrong in my opinion, because I don't understand the reason behind this feature. I have never seen a language that lets developers replace its own implementation.
Depending on how Test() is being called, this should be the window object.
I believe Microsoft allows overwriting native JS functions only when the window object is specified explicitly.
So window.alert = alertHook; should work anywhere.
is it acceptable?
Yes it is. This is a major strength of the language's flexibility, although I'm sure there are better alternatives to overwriting native behavior.
Overwriting native JavaScript functions isn't really a security issue. It could be one if you're running someone else's code that does it; but if you're running someone else's code, there are a lot of other security issues you should be concerned about.
In my opinion, it never is good practice to redefine the native functions. It's rather better to use wrappers (for instance, create a debug function that directs its output to alert or console.log or ignores the calls or whatever suits your needs).
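For example, a small sketch of such a wrapper:

// debug() leaves window.alert untouched and keeps the output
// destination in one place
function debug(message) {
  if (window.console && console.log) {
    console.log(message);
  } else {
    alert(message);
  }
}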
As for why JScript throws an exception with your second example and not the first one, it's easy. In the first example, you create a property called alert in your local scope, so when you refer to alert you'll be referring to this.alert rather than window.alert. In the second example, the alert you're referencing is the one from window, so assigning a different function to it will fail.
And isn't it the responsibility of a language to protect developers from misleading features? Replacing native JS calls at the window level (or in a common framework JS file) would crash the whole system, wouldn't it?
Not true: replacing the native call only hooks into it; it does not rewrite the native implementation at all. As for crashing the "whole" system: JavaScript runs in a virtual machine and is interpreted, so the chance of crashing the whole system (i.e. a Blue Screen of Death) is very, very small. If that happens, it's not the programmer's fault, but the JavaScript implementation causing the error.
You can consider it a feature: for instance, if you load a JavaScript file from someone else, you can reimplement some functions to extend them.
Protection for the programmer is like keeping a dog on a leash: only unleash it when you trust the dog! Since JavaScript runs in a virtual machine, any programmer can be unleashed, provided the implementation is secure enough, which it is (most of the time?).

What does XPCSafeJSObjectWrapper do?

What does Mozilla Firefox's XPCSafeJSObject wrapper actually do?
MDC's documentation is as follows:
This wrapper was created to address some problems with XPCNativeWrapper. In particular, some extensions want to be able to safely access non-natively-implemented content defined objects (and to access the underlying JavaScript object under an XPCNativeWrapper without its strong behavior guarantees). XPCSJOW act as a buffer between the chrome code.
This doesn't tell me a lot. In particular, I can't tell how accessing objects via XPCSafeObject is any different to accessing them directly.
Edit: I understand that the purpose of the wrappers in general is to protect privileged code from unprivileged code. What I don't understand (and doesn't seem to be documented) is how exactly XPCSafeJSObject does this.
Does it just drop privileges before accessing a property?
Actually XPCSafeJSObjectWrapper is used for all content objects, including windows and documents (which is in fact where it's most usually needed.) I believe it was invented mainly to stop XSS attacks automatically turning into privilege escalation attacks (by doing XSS against the browser itself). At least now if an XSS attack is found (and people will unfortunately keep looking) it doesn't compromise the whole browser. It's a natural development from the XPCNativeWrapper which was originally a manual (and therefore prone to accidental misuse by extensions) way for the browser to defend itself from XSS attacks.
The wrapper just ensures that any code that gets evaluated gets evaluated without chrome privileges. Accessing objects directly without this wrapper can allow for code to run with chrome privileges, which then lets that code do just about anything.
The purpose of the wrappers in general is to protect Privileged code when interacting with unprivileged code. The author of the unprivileged code might redefine a JavaScript object to do something malicious, like redefine the getter of a property to execute something bad as a side effect. When the privileged code tries to access the property it would execute the bad code as privileged code. The wrapper prevents this. This page describes the idea.
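Sketched, the kind of trap meant here (someContentObject and the payload are hypothetical):

// content code redefines a harmless-looking property...
Object.defineProperty(someContentObject, 'name', {
  get: function () {
    doSomethingEvil();      // runs with the caller's privileges
    return 'innocent name'; // looks normal to the caller
  }
});
// ...so privileged code merely reading someContentObject.name would
// execute the payload, unless a wrapper intervenes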
XPCSafeJSObject provides a wrapper for non-natively-implemented JavaScript objects (i.e. not window, document, etc., but user-defined objects).
Edit: For how it's implemented, check out the source code (it's not loading completely for me at the moment.) Also search for XPCSafeJSObject on DXR for other relevant source files.
