Node streams, from what i could find1, delay all flowing of data until the end of the tick (typically process.nextTick, but which queue isn't important), and don't start pumping data synchronously on method calls.
Sadly, i could not find a guarantee for this behavior in the docs (hopefully just missed it), and can therefore hardly depend on it. Is this behavior in the public API? Does this extend to all streams using the API for stream implementers?
To elaborate, except for explicitly reading sync with stream.Readable.prototype.read, all mechanisms appear to not synchronously start pumping data:
multiple event handlers in a row:
import * as stream from 'node:stream';
const r = stream.Readable.from(['my\ndata', 'more\nstuff']);
r.on('data', data => { /* do something */ });
// oops, attaching the first event handler could have already
// sent out all data synchronously
r.on('data', data => { /* do more */ });
starting a pipe, before the event handlers exist:
import * as stream from 'node:stream';
import * as readline from 'node:readline';
const r = stream.Readable.from(['my\ndata', 'more\nstuff']);
const p = new stream.PassThrough();
const rl = readline.createInterface({
input: p,
output: null,
});
r.pipe(p);
// oops, starting the pipe could have already sent all the data,
// giving "line" events without any listener being attached yet
rl.on('line', line => { /* do something */ });
My hope is i missed a line in the docs, and that's that. Otherwise, it will be hard to tell, how much i can depend on this behavior. E.g. in the examples above, one could explicitly pause the stream first, and resume it in nextTick (or similar), but it would be considerably cleaner, if this behavior could be depended upon.
[1]: e.g. the entire added complexity and construct of keeping track of sync, or old posts asking for the behavior at the time
My first time writing a js library. The library is intended to execute, at specific times, functions in the file that required the library. Kind of like Angular executes user implemented hooks such as $onInit, except that, in my case, user can define an arbitrary number of functions to be called by my library. How can I implement that?
One way I have in mind is to define a registerFunction(name, function) method, which maps function names to implementations. But can user just give me an array of names and I automatically register the corresponding functions for them?
Unless you have a specific requirement that it do so, your module does not need to know the names of the functions it is provided. When your module invokes those functions, it will do so by acting on direct references to them rather than by using their names.
For example:
// my-module.js
module.exports = function callMyFunctions( functionList ) {
functionList.forEach( fn => fn() )
}
// main application
const myFunc1 = () => console.log('Function 1 executing')
const myFunc2 = () => console.log('Function 2 executing')
const moduleThatInvokesMyFunctions = require('./my-module.js')
// instruct the module to invoke my 2 cool functions
moduleThatInvokesMyFunctions([ myFunc1, myFunc2 ])
//> Function 1 executing
//> Function 2 executing
See that the caller provides direct function references to the module, which the module then uses -- without caring or even knowing what those functions are called. (Yes, you can obtain their names by inspecting the function references, but why bother?)
If you want a more in-depth answer or explanation, it would help to know more about your situation. What environment does your library target: browsers? nodejs? Electron? react-native?
The library is intended to execute, at specific times, functions in the file that required the library
The "at specific times" suggests to me something that is loosely event-based. So, depending on what platform you're targeting, you could actually use a real EventEmitter. In that case, you'd invent unique names for each of the times that a function should be invoked, and your module would then export a singleton emitter. Callers would then assign event handlers for each of the events they care about. For callers, that might look like this:
const lifecycleManager = require('./your-module.js')
lifecycleManager.on( 'boot', myBootHandler )
lifecycleManager.on( 'config-available', myConfigHandler )
// etc.
A cruder way to handle this would be for callers to provide a dictionary of functions:
const orchestrateJobs = require('./your-module.js')
orchestrateJobs({
'boot': myBootHandler,
'config-available': myConfigHandler
})
If you're not comfortable working with EventEmitters, this may be appealing. But going this route requires that you consider how to support other scenarios like callers wanting to remove a function, and late registration.
Quick sketch showing how to use apply with each function:
// my-module.js
module.exports = function callMyFunctions( functionList ) {
functionList.forEach( fn => fn.apply( thisValue, arrayOfArguments ) )
}
Note that this module still has no idea what names the caller has assigned to these functions. Within this scope, each routine bears the moniker "fn."
I get the sense you have some misconceptions about how execution works, and that's led you to believe that the parts of the program need to know the names of other parts of the program. But that's not how continuation-passing style works.
Since you're firing caller functions based on specific times, it's possible the event model might be a good fit. Here's a sketch of what that might look like:
// caller
const AlarmClock = require('./your-module.js')
function doRoosterCall( exactTime ) {
console.log('I am a rooster! Cock-a-doodle-doo!')
}
function soundCarHorn( exactTime ) {
console.log('Honk! Honk!')
}
AlarmClock.on('sunrise', doRoosterCall)
AlarmClock.on('leave-for-work', soundCarHorn)
// etc
To accomplish that, you might do something like...
// your-module.js
const EventEmitter = require('events')
const singletonClock = new EventEmitter()
function checkForEvents() {
const currentTime = new Date()
// check for sunrise, which we'll define as 6:00am +/- 10 seconds
if(nowIs('6:00am', 10 * 1000)) {
singletonClock.emit('sunrise', currentTime)
}
// check for "leave-for-work": 8:30am +/- 1 minute
if(nowIs('8:30am', 60 * 1000)) {
singletonClock.emit('leave-for-work', currentTime)
}
}
setInterval( checkForEvents, 1000 )
module.exports = singletonClock
(nowIs is some handwaving for time-comparisons. When doing cron-like work, you should assume your heartbeat function will almost never be fired when the time value is an exact match, and so you'll need something to provide "close enough" comparisons. I didn't provide an impl because (1) it seems like a peripheral concern here, and (2) I'm sure Momentjs, date-fns, or some other package provides something great so you won't need to implement it yourself.
I have an app and it uses webworkers to help do some computation with other workers. The error it throws is essentially complaining that the data it's expecting isn't there since the computations weren't computed.
The computations are performed locally with the webworkers running, but in prod I am pretty much out of ideas why it's not working.
One thing to note is that because I need the context of certain functions (defined outside of the worker context) imported in, I've followed the advice suggested here: ( Creating a web worker inside React ) and am stringing it in first and turning it into a blob first.
`
export const buildBlobUrlFromCode = (code) => {
// returns a DOMString that links context to the code within the blob
code = code.substring(code.indexOf("{") + 1, code.lastIndexOf("}"));
const blob = new Blob([code], { type: "application/javascript" });
return URL.createObjectURL(blob);
}
export class WebWorker {
constructor(worker) {
let code = worker.toString();
// const blobUrl = strDecoratedBuildBlobUrlFromCode(code);
const blobUrl = buildBlobUrlFromCode(code);
return new Worker(blobUrl);
}
}
and how I use the above code is:
const workerInstance = new WebWorker(MyWorker);
where MyWorker is a fn exported with onmessage defined and all the operations defined inside.
I added some logs, and essentially the code break when I start to kick the computation off with: workerInstance.postMessage([data]);
Also to note is that I'm using the react-create-app to build my app and I haven't ejected yet so the webpack configs are untouched.
Also -- the source files for these workers are completely blank when deployed, but are filled on local ( as in I can see the actual code in the worker.js files )
Heroku -- Broken:
Local -- Working:
We have been struggling with a similar issue and we found the following...
We are also using Webpack and importantly a production configuration which uglified the code. During this process we found that the onmessage function variable was being discarded as it was not being called anywhere else. This explained why no code was visible in the production build worker instance.
We also found another potential pitfall in that the uglification process was changing all variable names to single characters. This process would also render the onmessage function useless.
To fix it, instead of declaring the variable onmessage, assign onmessage to the global object (this)
this.onmessage = (event) => {}
This way the uglification did not discard the code and the function name was not altered.
I know this is late but hope it helps.
I want to do the following:
execute a Javascript file with v8
open a REPL which evaluates code in the exact same context as the code
Any variables or functions defined in the code file, for instance, should be available in the REPL. (I'll note this used to work many v8 versions ago, but I can't figure out how to get it working in current v8 (node 0.12 == v8 3.28.73).)
I use a simple class JSInterpreter which has an isolate and a persistent context object as member vars. They gets set up when the class is initialized, and bindings also happen at that time.
When it's time to interpret some code, I call this method:
Str JSInterpreter::InterpretJS (const Str &js)
{ v8::Isolate::Scope isolate_scope (isolate_);
v8::HandleScope handle_scope (isolate_);
// Restore the context established at init time;
// Have to make local version of persistent handle
v8::Local <v8::Context> context =
v8::Local <v8::Context>::New (isolate_, context_);
Context::Scope context_scope (context);
Handle <String> source = String::NewFromUtf8 (isolate_, js . utf8 ());
Handle <Script> script = Script::Compile (source);
Handle <Value> result = script -> Run ();
I want to call this method over & over again, and each time, I want the context to contain any accumulated state from earlier calls. So if the code file contains (only) var x = 5; at the REPL I should be able to type > x and see the result 5.
But the actual result is x is not defined.
It turns out that this code actually does work as expected. The problem was that I was using browserify before running the code, and the code (e.g. var x = 5;) was getting wrapped into a function scope.
I'm wondering if it's possible to sandbox JavaScript running in the browser to prevent access to features that are normally available to JavaScript code running in an HTML page.
For example, let's say I want to provide a JavaScript API for end users to let them define event handlers to be run when "interesting events" happen, but I don't want those users to access the properties and functions of the window object. Am I able to do this?
In the simplest case, let's say I want to prevent users calling alert. A couple of approaches I can think of are:
Redefine window.alert globally. I don't think this would be a valid approach because other code running in the page (i.e., stuff not authored by users in their event handlers) might want to use alert.
Send the event handler code to the server to process. I'm not sure that sending the code to the server to process is the right approach, because the event handlers need to run in the context of the page.
Perhaps a solution where the server processes the user defined function and then generates a callback to be executed on the client would work? Even if that approach works, are there better ways to solve this problem?
Google Caja is a source-to-source translator that "allows you to put untrusted third-party HTML and JavaScript inline in your page and still be secure."
Have a look at Douglas Crockford's ADsafe:
ADsafe makes it safe to put guest code (such as third party scripted advertising or widgets) on any web page. ADsafe defines a subset of JavaScript that is powerful enough to allow guest code to perform valuable interactions, while at the same time preventing malicious or accidental damage or intrusion. The ADsafe subset can be verified mechanically by tools like JSLint so that no human inspection is necessary to review guest code for safety. The ADsafe subset also enforces good coding practices, increasing the likelihood that guest code will run correctly.
You can see an example of how to use ADsafe by looking at the template.html and template.js files in the project's GitHub repository.
I created a sandboxing library called jsandbox that uses web workers to sandbox evaluated code. It also has an input method for explicitly giving sandboxed code data it wouldn't otherwise be able to get.
The following is an example of the API:
jsandbox
.eval({
code : "x=1;Math.round(Math.pow(input, ++x))",
input : 36.565010597564445,
callback: function(n) {
console.log("number: ", n); // number: 1337
}
}).eval({
code : "][];.]\\ (*# ($(! ~",
onerror: function(ex) {
console.log("syntax error: ", ex); // syntax error: [error object]
}
}).eval({
code : '"foo"+input',
input : "bar",
callback: function(str) {
console.log("string: ", str); // string: foobar
}
}).eval({
code : "({q:1, w:2})",
callback: function(obj) {
console.log("object: ", obj); // object: object q=1 w=2
}
}).eval({
code : "[1, 2, 3].concat(input)",
input : [4, 5, 6],
callback: function(arr) {
console.log("array: ", arr); // array: [1, 2, 3, 4, 5, 6]
}
}).eval({
code : "function x(z){this.y=z;};new x(input)",
input : 4,
callback: function(x) {
console.log("new x: ", x); // new x: object y=4
}
});
An improved version of RyanOHara's web workers sandbox code, in a single file (no extra eval.js file is necessary).
function safeEval(untrustedCode)
{
return new Promise(function (resolve, reject)
{
var blobURL = URL.createObjectURL(new Blob([
"(",
function ()
{
var _postMessage = postMessage;
var _addEventListener = addEventListener;
(function (obj)
{
"use strict";
var current = obj;
var keepProperties =
[
// Required
'Object', 'Function', 'Infinity', 'NaN', 'undefined', 'caches', 'TEMPORARY', 'PERSISTENT',
// Optional, but trivial to get back
'Array', 'Boolean', 'Number', 'String', 'Symbol',
// Optional
'Map', 'Math', 'Set',
];
do
{
Object.getOwnPropertyNames(current).forEach(function (name)
{
if (keepProperties.indexOf(name) === -1)
{
delete current[name];
}
});
current = Object.getPrototypeOf(current);
}
while (current !== Object.prototype)
;
})(this);
_addEventListener("message", function (e)
{
var f = new Function("", "return (" + e.data + "\n);");
_postMessage(f());
});
}.toString(),
")()"],
{type: "application/javascript"}));
var worker = new Worker(blobURL);
URL.revokeObjectURL(blobURL);
worker.onmessage = function (evt)
{
worker.terminate();
resolve(evt.data);
};
worker.onerror = function (evt)
{
reject(new Error(evt.message));
};
worker.postMessage(untrustedCode);
setTimeout(function ()
{
worker.terminate();
reject(new Error('The worker timed out.'));
}, 1000);
});
}
Test it:
https://jsfiddle.net/kp0cq6yw/
var promise = safeEval("1+2+3");
promise.then(function (result) {
alert(result);
});
It should output 6 (tested in Chrome and Firefox).
As mentioned in other responces, it's enough to jail the code in a sandboxed iframe (without sending it to the server-side) and communicate with messages.
I would suggest to take a look at a small library I created mostly because of the need to providing some API to the untrusted code, just like as described in the question: there's an opportunity to export the particular set of functions right into the sandbox where the untrusted code runs. And there's also a demo which executes the code submitted by a user in a sandbox:
http://asvd.github.io/jailed/demos/web/console/
I think that js.js is worth mentioning here. It's a JavaScript interpreter written in JavaScript.
It's about 200 times slower than native JavaScript, but its nature makes it a perfect sandbox environment. Another drawback is its size – almost 600 KB, which may be acceptable for desktops in some cases, but not for mobile devices.
All the browser vendors and the HTML5 specification are working towards an actual sandbox property to allow sandboxed iframes -- but it's still limited to iframe granularity.
In general, no degree of regular expressions, etc. can safely sanitise arbitrary user provided JavaScript as it degenerates to the halting problem :-/
An ugly way, but maybe this works for you:
I took all the globals and redefined them in the sandbox scope, as well I added the strict mode so they can't get the global object using an anonymous function.
function construct(constructor, args) {
function F() {
return constructor.apply(this, args);
}
F.prototype = constructor.prototype;
return new F();
}
// Sanboxer
function sandboxcode(string, inject) {
"use strict";
var globals = [];
for (var i in window) {
// <--REMOVE THIS CONDITION
if (i != "console")
// REMOVE THIS CONDITION -->
globals.push(i);
}
globals.push('"use strict";\n'+string);
return construct(Function, globals).apply(inject ? inject : {});
}
sandboxcode('console.log( this, window, top , self, parent, this["jQuery"], (function(){return this;}()));');
// => Object {} undefined undefined undefined undefined undefined undefined
console.log("return of this", sandboxcode('return this;', {window:"sanboxed code"}));
// => Object {window: "sanboxed code"}
https://gist.github.com/alejandrolechuga/9381781
An independent JavaScript interpreter is more likely to yield a robust sandbox than a caged version of the built-in browser implementation.
Ryan has already mentioned js.js, but a more up-to-date project is JS-Interpreter. The documentation covers how to expose various functions to the interpreter, but its scope is otherwise very limited.
As of 2019, vm2 looks like the most popular and most regularly-updated solution to running JavaScript in Node.js. I'm not aware of a front-end solution.
With NISP you'll be able to do sandboxed evaluation.
Though the expression you write is not exactly JavaScript code, instead you'll write S-expressions. It is ideal for simple DSLs that doesn't demand extensive programming.
Suppose you have code to execute:
var sCode = "alert(document)";
Now, suppose you want to execute it in a sandbox:
new Function("window", "with(window){" + sCode + "}")({});
These two lines when executed will fail, because "alert" function is not available from the "sandbox"
And now you want to expose a member of window object with your functionality:
new Function("window", "with(window){" + sCode + "}")({
'alert':function(sString){document.title = sString}
});
Indeed you can add quotes escaping and make other polishing, but I guess the idea is clear.
Where is this user JavaScript code coming from?
There is not much you can do about a user embedding code into your page and then calling it from their browser (see Greasemonkey). It's just something browsers do.
However, if you store the script in a database, then retrieve it and eval() it, then you can clean up the script before it is run.
Examples of code that removes all window. and document. references:
eval(
unsafeUserScript
.replace(/\/\/.+\n|\/\*.*\*\/, '') // Clear all comments
.replace(/\s(window|document)\s*[\;\)\.]/, '') // Removes window. Or window; or window)
)
This tries to prevent the following from being executed (not tested):
window.location = 'http://example.com';
var w = window;
There are a lot of limitations you would have to apply to the unsafe user script. Unfortunately, there isn't any 'sandbox container' available for JavaScript.
I've been working on a simplistic JavaScript sandbox for letting users build applets for my site. Although I still face some challenges with allowing DOM access (parentNode just won't let me keep things secure =/), my approach was just to redefine the window object with some of its useful/harmless members, and then eval() the user code with this redefined window as the default scope.
My "core" code goes like this... (I'm not showing it entirely ;)
function Sandbox(parent){
this.scope = {
window: {
alert: function(str){
alert("Overriden Alert: " + str);
},
prompt: function(message, defaultValue){
return prompt("Overriden Prompt:" + message, defaultValue);
},
document: null,
.
.
.
.
}
};
this.execute = function(codestring){
// Here some code sanitizing, please
with (this.scope) {
with (window) {
eval(codestring);
}
}
};
}
So, I can instantiate a Sandbox and use its execute() function to get code running. Also, all new declared variables within eval'd code will ultimately bound to the execute() scope, so there will not be clashing names or messing with existing code.
Although global objects will still be accessible, those which should remain unknown to the sandboxed code must be defined as proxies in the Sandbox::scope object.
You can wrap the user's code in a function that redefines forbidden objects as parameters -- these would then be undefined when called:
(function (alert) {
alert ("uh oh!"); // User code
}) ();
Of course, clever attackers can get around this by inspecting the JavaScript DOM and finding a non-overridden object that contains a reference to the window.
Another idea is scanning the user's code using a tool like JSLint. Make sure it's set to have no preset variables (or: only variables you want), and then if any globals are set or accessed do not let the user's script be used. Again, it might be vulnerable to walking the DOM -- objects that the user can construct using literals might have implicit references to the window object that could be accessed to escape the sandbox.