javascript V8 optimisation and "leaking arguments"

javascript V8 optimisation and "leaking arguments" - javascript

I read in various places that it's advisable to be careful with the arguments object and that this is ok...
var i = arguments.length, args = new Array(i);
while (i--) args[i] = arguments[i];
But, is this ok or not?...
function makeArray (l) {
var i = l.length, array = new Array(i);
while (i--) array[i] = l[i];
return array;
};
//...
//EDIT: put the function on an object to better represent the actual case
var o = {};
o.f = function (callback) {
var args = makeArray (arguments);
callback.apply(args[0] = this, args);
};
What is meant by "leaking arguments" and how does it affect optimisation?
I am focused on V8, but I assume it applies to other compilers as well?
to prove it works...
function makeArray (l) {
var i = l.length, array = new Array(i);
while (i--) array[i] = l[i];
return array;
};
//...
//EDIT: put the function on an object to better represent the actual case
var o = {};
o.f = function (callback) {
var args = makeArray (arguments);
callback.apply(args[0] = this, args);
};
o.m = "Hello, ";
function test(f, n) {
alert(this.m + " " + n)
}
o.f(test, "it works...")

The problem with arguments is same as with local eval and with: they cause aliasing. Aliasing defeats all sorts of optimizations so even if you enabled optimization of these kind of functions you would probably end up just wasting time which is a problem with JITs because the time spent in the compiler is time not spent running code (although some steps in the optimization pipeline can be run in parallel).
Aliasing due to arguments leaking:
function bar(array) {
array[0] = 2;
}
function foo(a) {
a = 1;
bar(arguments);
// logs 2 even though a is local variable assigned to 1
console.log(a);
}
foo(1);
Note that strict mode eliminates this:
function bar(array) {
array[0] = 2;
}
function foo(a) {
"use strict";
a = 1;
bar(arguments);
// logs 1 as it should
console.log(a);
}
foo(1);
Strict mode however isn't optimized either and I don't know any reasonable explanation except that benchmarks don't use strict mode and strict mode is rarely used. That might change since many es6 features require strict mode, otoh arguments is not needed in es6 so...

This is a 2-fold problem:
You lose any and all possible optimizations and branch predictions
The arguments object is unpredictible.
You leak memory. Really bad!
Consider the following code (run it at your own risk!):
function a(){return arguments;}
x=a(document.getElementsByTagName('*'));
window._interval=setInterval(function(){
for(var i=0;i<1e6;i++)
{
x=a(x);
}
},5000);
*{font-family:sans-serif;}
<p>Warning! This may overheat your cpu and crash your browser really badly!</p>
<p>I'm not responsible for any software and hardware damages!</p>
<p><b>Run at your own risk!!!</b></p>
<p>If you ran this code, press F5 to stop or close the browser.</p>
<p>If you want, you can try to <button onclick="window.clearInterval(window._interval);window.x='';">stop it</button>. (the RAM may remain unchanged)</p>
And now watch your RAM go up, when you run that (it goes up at a rate of around 300-500MB every 5 seconds).
A naive implementation that simply passes the arguments around may cause these problems.
Not to mention that your code will (generally) be a tad slower.
Note that this:
function a(){return arguments;}
function b(arg){return arg;}
x=a(document.getElementsByTagName('*'));
window._interval=setInterval(function(){
for(var i=0;i<1e6;i++)
{
x=b(x);
}
},5000);
*{font-family:sans-serif;}
<p>This may be safe, but be careful!</p>
<p>I'm not responsible for any software and hardware damages!</p>
<p><b>Run at your own risk!!!</b></p>
<p>If you ran this code, press F5 to stop or close the browser.</p>
<p>If you want, you can try to <button onclick="window.clearInterval(window._interval);window.x='';">stop it</button>.</p>
Won't have the same effect as the code before.
This is because b() returns the same variable, instead of a new reference to a new arguments object.
This is a very important difference.

Related

transpiler battle: breaking out of nested function, with vs without throw

I have just finished writing "version 0" of my first (toy) transpiler. It works. It turns a string of "pseudo JavaScript" (JavaScript with an additional feature) into a string of runnable JavaScript. Now, I want to improve it.
The work area possibly most interesting for other SO users is this: The compiled code (i.e., output of my transpiler) does not heed a coding style recommendation as given in an accepted answer to some earlier SO question. If I would have at my hands a second transpiler where that coding style recommendation is heeded, I could make an informed decision regarding which branch is more promising to continue to develop on - I'd like to compare the 2 braches regarding performance, development time needed, amount of bugs, and so on, and decide based on that.
Let me tell you about the "additional JS feature" my transpiler deals with: "nested return". Consider closures / nested functions like so
function myOuterFunc(){
... code ...
function innerFunc(){
... code ...
}
... code ...
}
(note that above '...code...' is supposed to include every possible JS code including more nested function declarations, so myOuterFunc is not necessarily the direct parent of innerFunc)
In above situation, suppose you desire to return a result from myOuterFunc from somewhere inside - not necessarily directly inside - innerFunc
With "nested return" implemented, you could then write simply
return.myOuterFunc result
Here is an exmalpe of a (not-runnable) function using this feature and doing something meaningful
function multiDimensionalFind(isNeedle, haystack) {
// haystack is an array of arrays
// loop (recursively) through all ways of picking one element from each array in haystack
// feed the picked elements as array to isNeedle and return immediately when isNeedle gives true
// with those picked elements being the result, i.e. the 'found needle'
var LEVEL = haystack.length;
function inner(stack) {
var level = stack.length;
if (level >= LEVEL) {
if (isNeedle(stack)) return.multiDimensionalFind stack;
} else {
var arr = haystack[level];
for (var i = 0; i < arr.length; i++) {
inner(stack.concat([arr[i]]));
}
}
}
inner([]);
return 'not found'
}
And here is the (runnable) code my transpiler automatically produces from that (comments were added/remove later, obviously), followed by some code testing if that function does what it claims to do (and it does, as you can convince yourself.)
///////////// the function /////////////////
function multiDimensionalFind(isNeedle, haystack) {
try {
var LEVEL = haystack.length;
function inner(stack) {
var level = stack.length;
if (level >= LEVEL) {
if (isNeedle(stack)) throw stack;
} else {
var arr = haystack[level];
for (var i = 0; i < arr.length; i++) {
inner(stack.concat([arr[i]]));
}
}
}
inner([]);
return 'not found'
} catch(e){
// make sure "genuine" errors don't get destroyed or mishandled
if (e instanceof Error) throw e; else return e;
}
}
////////////////// test it //////////////////
content = document.getElementById('content');
function log2console(){
var digits = [0,1];
var haystack = [digits,digits,digits,digits,digits];
var str = '';
function isNeedle(stack){
str = str + ', ' + stack.join('')
return false;
}
multiDimensionalFind(isNeedle, haystack);
content.textContent = str;
}
function find71529(){ // second button
var digits = [0,1,2,3,4,5,6,7,8,9]
var haystack = [digits,digits,digits,digits,digits]
function isNeedle(stack){
return stack.reduce(function(b,i){ return 10*b+i; }, 0) === 71529
// returns true iff the stack contains [7,1,5,2,9]
}
content.textContent = multiDimensionalFind(
isNeedle, haystack
).join('_')
}
<button onclick='log2console()'>print binary numbers with 5 digits</button>
<br>
<button onclick='find71529()'>find something is 5d space</button>
<div id='content'></div>
You can play around with my transpiler in this fiddle here. I'm using the esprima library, the escodegen.js library on top of esprima, a teeny tiny work in progress abstract syntax tree generation library of my own (see the script tags in the fiddle). The code which is not library, and not UI code, i.e. the "real meat" of the transpiler has less than 100 lines (see function transpile). So this might be a lot less complicated than you thought.
I can't remember where I had seen the style recommendation, but I am certain that it was actually in multiple places. If you know or come across one such question, I invite you to be so kind to put the link into a comment under the question, I will mark useful. So far, there is one link, thank you Barmar.
You might ask why I even bothered to write a "non-compliant" transpiler first, and did not go for the "compliant" version straight away. That has to do with the estimated amount of work. I estimate the amount of work to be much larger for the "compliant" version. So much so, that it didn't really seem worthwhile to embark on such an endeavor. I would very much like to know whether this assessment of the amount of work is correct or incorrect. Thus the question. Please do not insinuate a rhetorical, or even a dishonest motive for the question; however weird it might sound to some, it really is the case that I'd like to be proven wrong, so please be so kind not to assume I'm "just saying that" for whatever reason, you'd be doing me an injustice. This is, of all the questions I asked on SO so far, by far the one I've put the most work into. And, if you ask me, it is by far the best question I have ever asked here.
Apart from someone assisting me in writing the "compliant" version of the transpiler, I'm also interested (albeit to a lesser degree) in anything objectively demonstrable which stands a chance to convince me that the "non-compliant" way is the wrong way. Speed tests (with links to jsperf), reproducible bug reports, things of that sort.
I should mention the speed tests I have done myself so far:
first test, second test
loosely related question

The better way really is to use return (if you can't completely refactor the tower). It makes it clear what you're doing and when you're returning a value from the intermediate functions; you can tell by looking at those functions where the result may be provided. In contrast, using throw is invisible in those intermediate layers; it magically bypassing them in a way designed for error conditions.
If you don't want to do that, I don't think you have a reasonable alternative other than throw. I wondered about going down the route of generator functions or similar, but I think that would just complicate things more, not less.
Using return doesn't markedly complicate the example code, and does (to my eye) make it clearer what's going on and when results are potentially expected.
function wrapper(){
function first(){
function second(){
function third(){
doStuff4();
some loop {
var result = ...
if (something) return result;
}
}
doStuff2();
let result = third();
if (result) {
return result;
}
doStuff3();
return third();
}
doStuff1();
return second();
}
return first() || "not found";
}
(In the above, result is tested for truthiness; substitute something else if appropriate.)

Ok, here is another approach with use of all async power of JavaScript... So basically I've recreated your nested functions but with use of Promise/await technique. You will get result only once, the first time you'll resolve it from any level of nested functions. Everything else will be GC. Check it out:
// Create async function
(async () => {
const fnStack = (val) => {
return new Promise((resolve, reject) => {
// Worker functions.
// Could be async!
const doStuff1 = (val) => val + 1;
const doStuff2 = (val) => val * 2;
const doStuff3 = (val) => val * -1; // This will not affect result
const doStuff4 = (val) => val + 1000;
// Nested hell
function first() {
function second() {
function third() {
val = doStuff4(val);
// Some loop
for(let i = 0; i < 1000; i++) {
if(i === 500) {
// Here we got our result
// Resolve it!
resolve(val);
}
}
}
val = doStuff2(val);
third();
// Below code will not affect
// resolved result in third() above
val = doStuff3(val);
third();
}
val = doStuff1(val);
second();
}
//
first();
});
}
// Run and get value
const val = await fnStack(5);
// We get our result ones
console.log(val);
})();

I think you should use arrays;
const funcs = [first, second, third];
for(let i = 0; i < funcs.length; ++i){
const result = funcs[i]();
if (result) break;
}
You should return only from function that has the result.

Sometime later, when I get around to it, I'll add instructions here on how to use my abstract syntax tree generation library which I used for my transpiler, maybe even start a little towards the other version, maybe explaining in more detail what the things are which make me think that it is more work.
old version follows (will be deleted soon)
If the feature is used multiple times we'll need something like this:
function NestedThrowee(funcName, value){
this.funcName = funcName;
this.value = value;
}
return.someFunctionName someReturnValue (before compilation) will give (after compliation) something like
var toBeThrown = new NestedThrowee("someFunctionName", someReturnValue);
And inside the generated catch block
if (e instanceof Error){
throw e; // re-throw "genuine" Error
} else {
if (e instance of NestedThrowee){
if (e.funcName === ... the name of the function to which
this catch block here belongs ...) return e.value;
throw new Error('something happened which mathheadinclouds deemed impossible');
}
In the general case, it is necessary to wrap the result (or 'throwee') like shown above, because there could be multiple, possibly nested "nested returns", and we have to take care that the catch phrase and caught object of type NestedThrowee match (by function name).

Performance implication in declaring variables JavaScript

Are there any performance(or otherwise) implications in doing this;
// ...
const greeting = `Hello, ${name}!`;
return greeting;
in comparison to doing just this;
// ...
return `Hello, ${name}!`;

Yes, assigning a value to a variable name and then returning that variable takes slightly more effort than simply returning the value. For a mini performance test, see:
(warning: the following will block your browser for a bit, depending on your specs)
// references to "name" removed to provide a more minimal test:
const p0 = performance.now();
for (let i = 0; i < 1e9; i++) {
(() => {
const greeting = `Hello!`;
return greeting;
})();
}
const p1 = performance.now();
for (let i = 0; i < 1e9; i++) {
(() => {
return `Hello!`;
})();
}
const p2 = performance.now();
console.log(p1 - p0);
console.log(p2 - p1);
The difference is quite small, but it's consistently there, at least in V8 - the overhead of the function call mostly overshadows it.
That said, this really sounds like premature optimization - it's probably better to aim for code readability and maintainability, and then fix performance issues if and when they pop up. If declaring a name for the returned value makes the code easier to read, it's probably a good idea to do so, even if it makes the script take a (completely insignificant) longer time to run.

Javascript log attribute access and function calls [duplicate]

Can I override the behavior of the Function object so that I can inject behavior prior t every function call, and then carry on as normal? Specifically, (though the general idea is intriguing in itself) can I log to the console every function call without having to insert console.log statements everywhere? And then the normal behavior goes on?
I do recognize that this will likely have significant performance problems; I have no intention of having this run typically, even in my development environment. But if it works it seems an elegant solution to get a 1000 meter view on the running code. And I suspect that the answer will show me something deeper about javascript.

The obvious answer is something like the following:
var origCall = Function.prototype.call;
Function.prototype.call = function (thisArg) {
console.log("calling a function");
var args = Array.prototype.slice.call(arguments, 1);
origCall.apply(thisArg, args);
};
But this actually immediately enters an infinite loop, because the very act of calling console.log executes a function call, which calls console.log, which executes a function call, which calls console.log, which...
Point being, I'm not sure this is possible.

Intercepting function calls
Many here have tried to override .call. Some have failed, some have succeeded.
I'm responding to this old question, as it has been brought up at my workplace, with this post being used as reference.
There are only two function-call related functions available for us to modify: .call and .apply. I will demonstrate a successful override of both.
TL;DR: What OP is asking is not possible. Some of the success-reports in the answers are due to the console calling .call internally right before evaluation, not because of the call we want to intercept.
Overriding Function.prototype.call
This appears to be the first idea people come up with. Some have been more successful than others, but here is an implementation that works:
// Store the original
var origCall = Function.prototype.call;
Function.prototype.call = function () {
// If console.log is allowed to stringify by itself, it will
// call .call 9 gajillion times. Therefore, lets do it by ourselves.
console.log("Calling",
Function.prototype.toString.apply(this, []),
"with:",
Array.prototype.slice.apply(arguments, [1]).toString()
);
// A trace, for fun
console.trace.apply(console, []);
// The call. Apply is the only way we can pass all arguments, so don't touch that!
origCall.apply(this, arguments);
};
This successfully intercepts Function.prototype.call
Lets take it for a spin, shall we?
// Some tests
console.log("1"); // Does not show up
console.log.apply(console,["2"]); // Does not show up
console.log.call(console, "3"); // BINGO!
It is important that this is not run from a console. The various browsers have all sorts of console tools that call .call themselves a lot, including once for every input, which might confuse a user in the moment. Another mistake is to just console.log arguments, which goes through the console api for stringification, which in turn cause an infinite loop.
Overriding Function.prototype.apply as well
Well, what about apply then? They're the only magic calling functions we have, so lets try that as well. Here goes a version that catches both:
// Store apply and call
var origApply = Function.prototype.apply;
var origCall = Function.prototype.call;
// We need to be able to apply the original functions, so we need
// to restore the apply locally on both, including the apply itself.
origApply.apply = origApply;
origCall.apply = origApply;
// Some utility functions we want to work
Function.prototype.toString.apply = origApply;
Array.prototype.slice.apply = origApply;
console.trace.apply = origApply;
function logCall(t, a) {
// If console.log is allowed to stringify by itself, it will
// call .call 9 gajillion times. Therefore, do it ourselves.
console.log("Calling",
Function.prototype.toString.apply(t, []),
"with:",
Array.prototype.slice.apply(a, [1]).toString()
);
console.trace.apply(console, []);
}
Function.prototype.call = function () {
logCall(this, arguments);
origCall.apply(this, arguments);
};
Function.prototype.apply = function () {
logCall(this, arguments);
origApply.apply(this, arguments);
}
... And lets try it out!
// Some tests
console.log("1"); // Passes by unseen
console.log.apply(console,["2"]); // Caught
console.log.call(console, "3"); // Caught
As you can see, the calling parenthesis go unnoticed.
Conclusion
Fortunately, calling parenthesis cannot be intercepted from JavaScript. But even if .call would intercept the parenthesis operator on function objects, how would we call the original without causing an infinite loop?
The only thing overriding .call/.apply does is to intercept explicit calls to those prototype functions. If the console is used with that hack in place, there will be lots and lots of spam. One must furthermore be very careful if it is used, as using the console API can quickly cause an infinite loop (console.log will use .call internally if one gives it an non-string).

I am getting SOME results and no page crashes with the following :
(function () {
var
origCall = Function.prototype.call,
log = document.getElementById ('call_log');
// Override call only if call_log element is present
log && (Function.prototype.call = function (self) {
var r = (typeof self === 'string' ? '"' + self + '"' : self) + '.' + this + ' (';
for (var i = 1; i < arguments.length; i++) r += (i > 1 ? ', ' : '') + arguments[i];
log.innerHTML += r + ')<br/>';
this.apply (self, Array.prototype.slice.apply (arguments, [1]));
});
}) ();
Only tested in Chrome version 9.xxx.
It is certainly not logging all function calls, but it is logging some!
I suspect only actual calls to 'call' intself are being processed

Only a quick test, but it seems to work for me.
It may not be useful this way, but I'm basically restoring the prototype whilst in my replacement's body and then "unrestoring" it before exiting.
This example simply logs all function calls - though there may be some fatal flaw I've yet to detect; doing this over a coffee break
implementation
callLog = [];
/* set up an override for the Function call prototype
* #param func the new function wrapper
*/
function registerOverride(func) {
oldCall = Function.prototype.call;
Function.prototype.call = func;
}
/* restore you to your regular programming
*/
function removeOverride() {
Function.prototype.call = oldCall;
}
/* a simple example override
* nb: if you use this from the node.js REPL you'll get a lot of buffer spam
* as every keypress is processed through a function
* Any useful logging would ideally compact these calls
*/
function myCall() {
// first restore the normal call functionality
Function.prototype.call = oldCall;
// gather the data we wish to log
var entry = {this:this, name:this.name, args:{}};
for (var key in arguments) {
if (arguments.hasOwnProperty(key)) {
entry.args[key] = arguments[key];
}
}
callLog.push(entry);
// call the original (I may be doing this part naughtily, not a js guru)
this(arguments);
// put our override back in power
Function.prototype.call = myCall;
}
usage
I've had some issues including calls to this in one big paste, so here's what I was typing into the REPL in order to test the above functions:
/* example usage
* (only tested through the node.js REPL)
*/
registerOverride(myCall);
console.log("hello, world!");
removeOverride(myCall);
console.log(callLog);

You can override Function.prototype.call, just make sure to only apply functions within your override.
window.callLog = [];
Function.prototype.call = function() {
Array.prototype.push.apply(window.callLog, [[this, arguments]]);
return this.apply(arguments[0], Array.prototype.slice.apply(arguments,[1]));
};

I found it easiest to instrument the file, using an automatic process. I built this little tool to make it easier for myself. Perhaps somebody else will find it useful. It's basically awk, but easier for a Javascript programmer to use.
// This tool reads a file and builds a buffer of say ten lines.
// When a line falls off the end of the buffer, it gets written to the output file.
// When a line is read from the input file, it gets written to the first line of the buffer.
// After each occurrence of a line being read from the input file and/or written to the output
// file, a routine is given control. The routine has the option of operating on the buffer.
// It can insert a line before or after a line that is there, based on the lines surrounding.
//
// The immediate case is that if I have a set of lines like this:
//
// getNum: function (a, c) {
// console.log(`getNum: function (a, c) {`);
// console.log(`arguments.callee = ${arguments.callee.toString().substr(0,100)}`);
// console.log(`arguments.length = ${arguments.length}`);
// for (var i = 0; i < arguments.length; i++) { console.log(`arguments[${i}] = ${arguments[i] ? arguments[i].toString().substr(0,100) : 'falsey'}`); }
// var d = b.isStrNum(a) ? (c && b.isString(c) ? RegExp(c) : b.getNumRegx).exec(a) : null;
// return d ? d[0] : null
// },
// compareNums: function (a, c, d) {
// console.log(`arguments.callee = ${arguments.callee.toString().substr(0,100)}`);
//
// I want to change that to a set of lines like this:
//
// getNum: function (a, c) {
// console.log(`getNum: function (a, c) {`);
// console.log(`arguments.callee = ${arguments.callee.toString().substr(0,100)}`);
// console.log(`arguments.length = ${arguments.length}`);
// for (var i = 0; i < arguments.length; i++) { console.log(`arguments[${i}] = ${arguments[i] ? arguments[i].toString().substr(0,100) : 'falsey'}`); }
// var d = b.isStrNum(a) ? (c && b.isString(c) ? RegExp(c) : b.getNumRegx).exec(a) : null;
// return d ? d[0] : null
// },
// compareNums: function (a, c, d) {
// console.log(`compareNums: function (a, c, d) {`);
// console.log(`arguments.callee = ${arguments.callee.toString().substr(0,100)}`);
//
// We are trying to figure out how a set of functions work, and I want each function to report
// its name when we enter it.
//
// To save time, options and the function that is called on each cycle appear at the beginning
// of this file. Ideally, they would be --something options on the command line.
const readline = require('readline');
//------------------------------------------------------------------------------------------------
// Here are the things that would properly be options on the command line. Put here for
// speed of building the tool.
const frameSize = 10;
const shouldReportFrame = false;
function reportFrame() {
for (i = frame.length - 1; i >= 0; i--) {
console.error(`${i}. ${frame[i]}`); // Using the error stream because the stdout stream may have been coopted.
}
}
function processFrame() {
// console.log(`******** ${frame[0]}`);
// if (frame[0].search('console.log(\`arguments.callee = \$\{arguments.callee.toString().substr(0,100)\}\`);') !== -1) {
// if (frame[0].search('arguments.callee') !== -1) {
// if (frame[0].search(/console.log\(`arguments.callee = \$\{arguments.callee.toString\(\).substr\(0,100\)\}`\);/) !== -1) {
var matchArray = frame[0].match(/([ \t]*)console.log\(`arguments.callee = \$\{arguments.callee.toString\(\).substr\(0,100\)\}`\);/);
if (matchArray) {
// console.log('******** Matched');
frame.splice(1, 0, `${matchArray[1]}console.log('${frame[1]}');`);
}
}
//------------------------------------------------------------------------------------------------
var i;
var frame = [];
const rl = readline.createInterface({
input: process.stdin
});
rl.on('line', line => {
if (frame.length > frameSize - 1) {
for (i = frame.length - 1; i > frameSize - 2; i--) {
process.stdout.write(`${frame[i]}\n`);
}
}
frame.splice(frameSize - 1, frame.length - frameSize + 1);
frame.splice(0, 0, line);
if (shouldReportFrame) reportFrame();
processFrame();
// process.stdout.write(`${line}\n`); // readline gives us the line with the newline stripped off
});
rl.on('close', () => {
for (i = frame.length - 1; i > -1; i--) {
process.stdout.write(`${frame[i]}\n`);
}
});
// Notes
//
// We are not going to control the writing to the output stream. In particular, we are not
// going to listen for drain events. Nodejs' buffering may get overwhelmed.
//

Can I override the Javascript Function object to log all function calls?

Can I override the behavior of the Function object so that I can inject behavior prior t every function call, and then carry on as normal? Specifically, (though the general idea is intriguing in itself) can I log to the console every function call without having to insert console.log statements everywhere? And then the normal behavior goes on?
I do recognize that this will likely have significant performance problems; I have no intention of having this run typically, even in my development environment. But if it works it seems an elegant solution to get a 1000 meter view on the running code. And I suspect that the answer will show me something deeper about javascript.

The obvious answer is something like the following:
var origCall = Function.prototype.call;
Function.prototype.call = function (thisArg) {
console.log("calling a function");
var args = Array.prototype.slice.call(arguments, 1);
origCall.apply(thisArg, args);
};
But this actually immediately enters an infinite loop, because the very act of calling console.log executes a function call, which calls console.log, which executes a function call, which calls console.log, which...
Point being, I'm not sure this is possible.

Intercepting function calls
Many here have tried to override .call. Some have failed, some have succeeded.
I'm responding to this old question, as it has been brought up at my workplace, with this post being used as reference.
There are only two function-call related functions available for us to modify: .call and .apply. I will demonstrate a successful override of both.
TL;DR: What OP is asking is not possible. Some of the success-reports in the answers are due to the console calling .call internally right before evaluation, not because of the call we want to intercept.
Overriding Function.prototype.call
This appears to be the first idea people come up with. Some have been more successful than others, but here is an implementation that works:
// Store the original
var origCall = Function.prototype.call;
Function.prototype.call = function () {
// If console.log is allowed to stringify by itself, it will
// call .call 9 gajillion times. Therefore, lets do it by ourselves.
console.log("Calling",
Function.prototype.toString.apply(this, []),
"with:",
Array.prototype.slice.apply(arguments, [1]).toString()
);
// A trace, for fun
console.trace.apply(console, []);
// The call. Apply is the only way we can pass all arguments, so don't touch that!
origCall.apply(this, arguments);
};
This successfully intercepts Function.prototype.call
Lets take it for a spin, shall we?
// Some tests
console.log("1"); // Does not show up
console.log.apply(console,["2"]); // Does not show up
console.log.call(console, "3"); // BINGO!
It is important that this is not run from a console. The various browsers have all sorts of console tools that call .call themselves a lot, including once for every input, which might confuse a user in the moment. Another mistake is to just console.log arguments, which goes through the console api for stringification, which in turn cause an infinite loop.
Overriding Function.prototype.apply as well
Well, what about apply then? They're the only magic calling functions we have, so lets try that as well. Here goes a version that catches both:
// Store apply and call
var origApply = Function.prototype.apply;
var origCall = Function.prototype.call;
// We need to be able to apply the original functions, so we need
// to restore the apply locally on both, including the apply itself.
origApply.apply = origApply;
origCall.apply = origApply;
// Some utility functions we want to work
Function.prototype.toString.apply = origApply;
Array.prototype.slice.apply = origApply;
console.trace.apply = origApply;
function logCall(t, a) {
// If console.log is allowed to stringify by itself, it will
// call .call 9 gajillion times. Therefore, do it ourselves.
console.log("Calling",
Function.prototype.toString.apply(t, []),
"with:",
Array.prototype.slice.apply(a, [1]).toString()
);
console.trace.apply(console, []);
}
Function.prototype.call = function () {
logCall(this, arguments);
origCall.apply(this, arguments);
};
Function.prototype.apply = function () {
logCall(this, arguments);
origApply.apply(this, arguments);
}
... And lets try it out!
// Some tests
console.log("1"); // Passes by unseen
console.log.apply(console,["2"]); // Caught
console.log.call(console, "3"); // Caught
As you can see, the calling parenthesis go unnoticed.
Conclusion
Fortunately, calling parenthesis cannot be intercepted from JavaScript. But even if .call would intercept the parenthesis operator on function objects, how would we call the original without causing an infinite loop?
The only thing overriding .call/.apply does is to intercept explicit calls to those prototype functions. If the console is used with that hack in place, there will be lots and lots of spam. One must furthermore be very careful if it is used, as using the console API can quickly cause an infinite loop (console.log will use .call internally if one gives it an non-string).

I am getting SOME results and no page crashes with the following :
(function () {
var
origCall = Function.prototype.call,
log = document.getElementById ('call_log');
// Override call only if call_log element is present
log && (Function.prototype.call = function (self) {
var r = (typeof self === 'string' ? '"' + self + '"' : self) + '.' + this + ' (';
for (var i = 1; i < arguments.length; i++) r += (i > 1 ? ', ' : '') + arguments[i];
log.innerHTML += r + ')<br/>';
this.apply (self, Array.prototype.slice.apply (arguments, [1]));
});
}) ();
Only tested in Chrome version 9.xxx.
It is certainly not logging all function calls, but it is logging some!
I suspect only actual calls to 'call' intself are being processed

Only a quick test, but it seems to work for me.
It may not be useful this way, but I'm basically restoring the prototype whilst in my replacement's body and then "unrestoring" it before exiting.
This example simply logs all function calls - though there may be some fatal flaw I've yet to detect; doing this over a coffee break
implementation
callLog = [];
/* set up an override for the Function call prototype
* #param func the new function wrapper
*/
function registerOverride(func) {
oldCall = Function.prototype.call;
Function.prototype.call = func;
}
/* restore you to your regular programming
*/
function removeOverride() {
Function.prototype.call = oldCall;
}
/* a simple example override
* nb: if you use this from the node.js REPL you'll get a lot of buffer spam
* as every keypress is processed through a function
* Any useful logging would ideally compact these calls
*/
function myCall() {
// first restore the normal call functionality
Function.prototype.call = oldCall;
// gather the data we wish to log
var entry = {this:this, name:this.name, args:{}};
for (var key in arguments) {
if (arguments.hasOwnProperty(key)) {
entry.args[key] = arguments[key];
}
}
callLog.push(entry);
// call the original (I may be doing this part naughtily, not a js guru)
this(arguments);
// put our override back in power
Function.prototype.call = myCall;
}
usage
I've had some issues including calls to this in one big paste, so here's what I was typing into the REPL in order to test the above functions:
/* example usage
* (only tested through the node.js REPL)
*/
registerOverride(myCall);
console.log("hello, world!");
removeOverride(myCall);
console.log(callLog);

You can override Function.prototype.call, just make sure to only apply functions within your override.
window.callLog = [];
Function.prototype.call = function() {
Array.prototype.push.apply(window.callLog, [[this, arguments]]);
return this.apply(arguments[0], Array.prototype.slice.apply(arguments,[1]));
};

I found it easiest to instrument the file, using an automatic process. I built this little tool to make it easier for myself. Perhaps somebody else will find it useful. It's basically awk, but easier for a Javascript programmer to use.
// This tool reads a file and builds a buffer of say ten lines.
// When a line falls off the end of the buffer, it gets written to the output file.
// When a line is read from the input file, it gets written to the first line of the buffer.
// After each occurrence of a line being read from the input file and/or written to the output
// file, a routine is given control. The routine has the option of operating on the buffer.
// It can insert a line before or after a line that is there, based on the lines surrounding.
//
// The immediate case is that if I have a set of lines like this:
//
// getNum: function (a, c) {
// console.log(`getNum: function (a, c) {`);
// console.log(`arguments.callee = ${arguments.callee.toString().substr(0,100)}`);
// console.log(`arguments.length = ${arguments.length}`);
// for (var i = 0; i < arguments.length; i++) { console.log(`arguments[${i}] = ${arguments[i] ? arguments[i].toString().substr(0,100) : 'falsey'}`); }
// var d = b.isStrNum(a) ? (c && b.isString(c) ? RegExp(c) : b.getNumRegx).exec(a) : null;
// return d ? d[0] : null
// },
// compareNums: function (a, c, d) {
// console.log(`arguments.callee = ${arguments.callee.toString().substr(0,100)}`);
//
// I want to change that to a set of lines like this:
//
// getNum: function (a, c) {
// console.log(`getNum: function (a, c) {`);
// console.log(`arguments.callee = ${arguments.callee.toString().substr(0,100)}`);
// console.log(`arguments.length = ${arguments.length}`);
// for (var i = 0; i < arguments.length; i++) { console.log(`arguments[${i}] = ${arguments[i] ? arguments[i].toString().substr(0,100) : 'falsey'}`); }
// var d = b.isStrNum(a) ? (c && b.isString(c) ? RegExp(c) : b.getNumRegx).exec(a) : null;
// return d ? d[0] : null
// },
// compareNums: function (a, c, d) {
// console.log(`compareNums: function (a, c, d) {`);
// console.log(`arguments.callee = ${arguments.callee.toString().substr(0,100)}`);
//
// We are trying to figure out how a set of functions work, and I want each function to report
// its name when we enter it.
//
// To save time, options and the function that is called on each cycle appear at the beginning
// of this file. Ideally, they would be --something options on the command line.
const readline = require('readline');
//------------------------------------------------------------------------------------------------
// Here are the things that would properly be options on the command line. Put here for
// speed of building the tool.
const frameSize = 10;
const shouldReportFrame = false;
function reportFrame() {
for (i = frame.length - 1; i >= 0; i--) {
console.error(`${i}. ${frame[i]}`); // Using the error stream because the stdout stream may have been coopted.
}
}
function processFrame() {
// console.log(`******** ${frame[0]}`);
// if (frame[0].search('console.log(\`arguments.callee = \$\{arguments.callee.toString().substr(0,100)\}\`);') !== -1) {
// if (frame[0].search('arguments.callee') !== -1) {
// if (frame[0].search(/console.log\(`arguments.callee = \$\{arguments.callee.toString\(\).substr\(0,100\)\}`\);/) !== -1) {
var matchArray = frame[0].match(/([ \t]*)console.log\(`arguments.callee = \$\{arguments.callee.toString\(\).substr\(0,100\)\}`\);/);
if (matchArray) {
// console.log('******** Matched');
frame.splice(1, 0, `${matchArray[1]}console.log('${frame[1]}');`);
}
}
//------------------------------------------------------------------------------------------------
var i;
var frame = [];
const rl = readline.createInterface({
input: process.stdin
});
rl.on('line', line => {
if (frame.length > frameSize - 1) {
for (i = frame.length - 1; i > frameSize - 2; i--) {
process.stdout.write(`${frame[i]}\n`);
}
}
frame.splice(frameSize - 1, frame.length - frameSize + 1);
frame.splice(0, 0, line);
if (shouldReportFrame) reportFrame();
processFrame();
// process.stdout.write(`${line}\n`); // readline gives us the line with the newline stripped off
});
rl.on('close', () => {
for (i = frame.length - 1; i > -1; i--) {
process.stdout.write(`${frame[i]}\n`);
}
});
// Notes
//
// We are not going to control the writing to the output stream. In particular, we are not
// going to listen for drain events. Nodejs' buffering may get overwhelmed.
//

How to simulate JavaScript yield?

One of the new mechanisms available in JavaScript 1.7 is yield, useful for generators and iterators.
This is currently supported in Mozilla browsers only (that I'm aware of). What are some of the ways to simulate this behavior in browsers where it is not available?

Well you could always write an outer function that initializes variables in a closure and then returns an object that does whatever work you want.
function fakeGenerator(x) {
var i = 0;
return {
next: function() {
return i < x ? (i += 1) : x;
}
};
}
Now you can write:
var gen = fakeGenerator(10);
and then call gen.next() over and over again. It'd be tricky to simulate the "finally" behavior of the "close()" method on real generators, but you might be able to get somewhere close.

Similar to Pointy's answer, but with a hasNext method:
MyList.prototype.iterator = function() { //MyList is the class you want to add an iterator to
var index=0;
var thisRef = this;
return {
hasNext: function() {
return index < thisRef._internalList.length;
},
next: function() {
return thisRef._internalList[index++];
}
};
};
The hasNext method let's you loop like:
var iter = myList.iterator() //myList is a populated instance of MyList
while (iter.hasNext())
{
var current = iter.next();
//do something with current
}

For a non-trivial generator function, you will want to use some sort of tool to translate your code to an ES3 equivalent so that it can run in any modern browser. I recommend trying out Traceur, which can roughly be described as an ES6-to-ES3 source translator. Because generators are a language feature destined for ES6, Traceur will be able to translate them for you.
Traceur provides a demo page where you can type ES6 code and see the ES3 generated on the fly. If you enter something as simple as:
// Note that this declaration includes an asterisk, as specified by current ES6
// proposals. As of version 16, Firefox's built-in support for generator
// functions does not allow the asterisk.
function* foo() {
var n = 0;
if (n < 10) {
n++;
yield n;
}
}
for (var n of foo()) {
console.log(n);
}
you will see that the equivalent ES3 code is non-trivial, and it requires traceur.runtime to be included so that the code runs correctly in a browser. The runtime is defined in http://traceur-compiler.googlecode.com/git/src/runtime/runtime.js, which is currently 14K (unminified). This is a non-trivial amount of code, though likely much of it can be optimized away using the Closure Compiler.
Note there is also a bug filed to provide an option to inline the required functions from the traceur.runtime namespace, which would eliminate the need to include runtime.js altogether: https://code.google.com/p/traceur-compiler/issues/detail?id=119.

Without some sort of compiler or preprocessor… No.
The closest you can come is something like this:
function doStuff() {
var result = { };
function firstStuf() { ...; result.next = secondStuff; return 42; };
function secondStuf() { ...; result.next = thirdStuff; return 16; };
function thirdStuf() { ...; result.next = null; return 7; };
result.next = firstStuff;
return result;
}
But, well… That's pretty crappy, and really isn't much of a substitute.

I have started a little project that tries to do this with some callback trickery. Since it's impossible to create real coroutines in "standard" JavaScript, this doesn't come without a few caveats, e.g:
it's impossible to make this follow the iterator protocol (stuff like .next() etc.),
it's impossible to iterate through several generators at once,
you have to watch out not to let the wrong objects leave the generator's scope (e.g. by calling yield in a timeout – since this is "plain" JavaScript, there's no syntax restriction that prevents you from doing this),
exceptions in the generator are a little tricky,
and last not least, it's very experimental (just started this a few days ago).
On the bright side, you have yield! :)
The Fibonacci example from the MDC page would look like this:
var fibonacci = Generator(function () {
var fn1 = 1;
var fn2 = 1;
while (1){
var current = fn2;
fn2 = fn1;
fn1 = fn1 + current;
this.yield(current);
}
});
console.log(fibonacci.take(10).toArray());
Output:
[1, 1, 2, 3, 5, 8, 13, 21, 34, 55]
The project is on BitBucket at https://bitbucket.org/balpha/lyfe.

Develop Reference

JavaScript is the programming language of the Web.

javascript V8 optimisation and "leaking arguments" - javascript

Related

transpiler battle: breaking out of nested function, with vs without throw

Performance implication in declaring variables JavaScript

Javascript log attribute access and function calls [duplicate]

Can I override the Javascript Function object to log all function calls?

How to simulate JavaScript yield?

Categories

Resources