I have a minified javascript file. I can send it through a variety of tools to insert newlines and indentation. What I then want is to fix the variable names. I know that no tool can do this automatically. What I need is a tool that will augment my attempt to do so manually. It needs to be aware of scope rules so that when I rename c to length and d to height and j() to move(), those changes will be made everywhere that the same c and d and j are used, but not in other scopes where different variables and functions with those same names exist.
Is there a tool like this, specifically designed for reversing minification? If there isn't a tool for this specific job, is there a smart IDE that can at least handle renaming variables or methods following scope rules?
I found an in-browser/downloadable node.js library tool that does rename refactoring very well. Esprima handles the following ES6 code (slightly modified from the example) such that when I change the name of any of the global scope hi, only the hi names surrounded by a comment block are changed (I couldn't think of a better way to call out code since markdown doesn't show in code blocks, sorry).
// Array shuffling code from Underscore.js, code modified from base example on http://esprima.org/demo/rename.html
var shuffled;
var /*hi*/ = 'hi'; // initial declaration
function /*hi*/() { // modifies var above
this.shuffle = 'x';
}
_.shuffle = function(obj) {
function hi() {
return 'hello';
}
var shuffled = [], rand;
each(obj, function(value, index, list) {
rand = Math.floor(Math.random() * (index + 1));
shuffled[index] = shuffled[rand];
shuffled[rand] = value;
});
hi(); // calls function defined above, name not changed by change to global scope 'hi'
console.log(hello); // considered to be the same as the let statement below even though this is a syntax error since let doesn't get hoisted like var
return shuffled;
};
let hello = 'hello';
function hiNotInScope() {
var hi = 'something else'; // not in global scope, so change to global hi doesn't change this
console.log(hi); // changed if the hi in this scope is changed
}
// hi (not changed since it's in a comment)
/*hi*/(); // calls global hi function.
It seems to respect scoping rules as long as there are no code errors (for example, a let declaration above a var declaration of the same name that gets hoisted above the let will be considered in-scope, but this is a non-issue since it's a syntax error).
Related
Learning about IIFEs i understand that using them does:
not pollute the global scope
shields your code from others.
Could someone please help me understand it better and give a real example of when i could get in trouble when using regular function statement (instead od IIFE?)
Let's say you have a decent sized codebase (a few thousand lines or more). Putting every function on the top level won't be a good idea, because then it'll be pretty easy to accidentally write a function name twice, which will cause bugs due to the name collision. For example:
// code around validating user registration
function validate(username) {
return /someValidationRegex/.test(username);
}
// do stuff with validate
// And then, far elsewhere in the code,
// you need to validate an input for sending to the server:
function validate(input) {
return /someOtherValidationRegex/.test(input);
}
// do stuff with validate
This will not work, because the last validate function will overwrite the first validate function, and the code won't work as expected.
Put each segment of code into an IIFE instead, to avoid the possibility of name collisions:
(() => {
// code around validating user registration
function validate(username) {
return /someValidationRegex/.test(username);
}
// do stuff with validate
})();
(() => {
// code around validating an input for sending to the server:
function validate(input) {
return /someOtherValidationRegex/.test(input);
}
// do stuff with validate
})();
This is the technique that Stack Overflow's Javascript uses (at least in some parts).
Another reason to use IIFEs even if you're careful not to duplicate function names is that you may accidentally duplicate a window property. For example, the following is a somewhat common bug people run into:
// name is defined to be a number...
var name = 5;
// but it's actually a string???
console.log(typeof name);
The problem here is that you're accidentally referring to / overwriting the window.name property on the top level, and window.name may only be a string. There are a whole bunch of functions and variables defined on the top level. To avoid name collisions, put everything into an IIFE instead:
(() => {
// name is defined to be a number...
var name = 5;
// And it's still a number, good
console.log(typeof name);
})();
Still, if your codebase is large enough that name collisions are a decent possibility, rather than manually writing IIFEs, I'd highly recommend using a module system instead, like with Webpack. This will allow you to write modules inside their own encapsulated scope, without leakage or name collisions, and will be more maintainable, when each module only contains exactly the code it needs to run, and nothing more. This makes identifying bugs and extending features much easier than one huge long <script> that you have to navigate through manually.
The common advantage of IIFE is that any "Function or Variable" defined inside IIFE, cannot be accessed outside the IIFE block, thus preventing global scope from getting polluted.
IIFE's are used to define and execute functions on fly and use them on the spot without extra line of Code.
(function(){
var firstName = 'Jhon';
console.log(firstName);
})(); // will be executed on the fly
function test(){
var firstName = 'Jhon';
console.log(firstName);
}
test(); // has to call manually
The IIFE will actually run (immediately-invoked function expression), they don't need a trigger or function call to initiate them, so the variable will be set to its response.
Here's a Fiddle to demonstrate this:
var iffe = (function() {
return 'foo';
})();
var func = function() {
return 'bar';
};
console.log('iife = ' + iffe);
console.log('non-iife = ' + func);
In your JS console you'll see something similar to:
iife = foo and
non-iife = function () {
return 'bar'; }
Also in case we need to pass something in the IFFE scope from the outer scope we need to pass them as a parameter and receive this inside the function defined in the IFFE wrap, something like this -:
var iffe = (function(doc) {
return 'foo';
})(document);
This is a link for some more reference - http://benalman.com/news/2010/11/immediately-invoked-function-expression/
Learning JS here, I run JSLint on this code:
/*jslint devel: true*/
function Animal(n) {
"use strict";
this.name = n;
this.prop = "a";
}
var a = new Animal("duppa");
for (v in a) {
if (a.hasOwnProperty(v)) {
console.log("property:" + v);
}
}
I get:
jslint:test2.js:11:6:'v' was used before it was defined.
jslint:test2.js:11:8:Cannot read property "kind" from undefined
jslint: ignored 0 errors.
It obviously complains that I did not declare v up front:
/*jslint devel: true*/
function Animal(n) {
"use strict";
this.name = n;
this.prop = "a";
}
var a = new Animal("duppa");
var v;
for (v in a) {
if (a.hasOwnProperty(v)) {
console.log("property:" + v);
}
}
This shuts JSLint up, but is it really necessary? In general I try to follow good conventions but is this one really necessary in JS? E.g. Python lives happily without such stuff (for x in range(10)... etc).
Yes, you absolutely should declare the variable. Otherwise you're declaring v at global scope which is never good, but it's particularly bad for counting variables which are a single letter long like v.
Consider the case where two people get lazy about declaring a variable of the same name:
// Your code, which iterates over a nested array
var arr = [ [1, 2], [3, 4] ];
for (i = 0; i < arr.length; ++i) {
AwesomeLibrary.doSomething(arr[i]);
}
// way down elsewhere, in awesome_library.js...
function doSomething(arr) {
for (i = 0; i < arr.length; ++i) {
// Now your `i` is clobbered, and we have a subtle but devastating bug
}
}
This doesn't even require two lazy people: If you work with JavaScript long enough and refuse to declare your variables, you will eventually do this to yourself.
There are 10 types of people in the world. Those who understand why you declare variables in javascript and those who have regular sex. (Just smile)
You must understand that every function have their own scope and you must use this scope. If you don't use declaration inside your function you change the global state, and it affects of course on many things.
So use var and don't create global variables !!!
It is always a better practice to define your variables before using them. Here javascript's for loop requires you to define i, because it is in the for loop, use
var v;
for( v in a)
As far as my understanding goes, if you use a variable without declaring it within a function scope, it gets declared in the global scope, which isnt a best practice. You will soon cloud your global scope with variables.
//This creates a global variable
function not_a_best_practice(){
a=10;
}
//This creates a local variable
function not_bad(){
var a=20;
}
This answer may throw more light on the discussion at hand:
What is the scope of variables in JavaScript?
Remember that JSLint doesn't complain because of some obscure academic reason but because it protects you from major problems your application may potentially suffer from.
You declare a variable to contain the variable in the current scope. It's a safeguard for you and your program as well making it more readable code (global vars appearing out of nowhere are always confusing).
Imagine if you had a global variable v in your application and then used the same nomenclature (v) for iteration in a function. Javascript will automatically assume that the global variable is being used and you'll find yourself with unwanted values in your global. Having said that, the less stuff you put in the global namespace the better.
This question already has answers here:
Why use the javascript function wrapper (added in coffeescript) ".call(this)"
(2 answers)
Closed 8 years ago.
While reviewing the source code for CoffeeScript on Github, I noticed that most, if not all, of the modules are defined as follows:
(function() {
...
}).call(this);
This pattern looks like it wraps the entire module in an anonymous function and calls itself.
What are the pros (and cons) of this approach? Are there other ways to accomplish the same goals?
Harmen's answer is quite good, but let me elaborate a bit on where this is done by the CoffeeScript compiler and why.
When you compile something with coffee -c foo.coffee, you will always get a foo.js that looks like this:
(function() {
...
}).call(this);
Why is that? Well, suppose you put an assignment like
x = 'stringy string'
in foo.coffee. When it sees that, the compiler asks: Does x already exist in this scope, or an outer scope? If not, it puts a var x declaration at the top of that scope in the JavaScript output.
Now suppose you write
x = 42
in bar.coffee, compile both, and concatenate foo.js with bar.js for deployment. You'll get
(function() {
var x;
x = 'stringy string';
...
}).call(this);
(function() {
var x;
x = 42;
...
}).call(this);
So the x in foo.coffee and the x in bar.coffee are totally isolated from one another. This is an important part of CoffeeScript: Variables never leak from one .coffee file to another unless explicitly exported (by being attached to a shared global, or to exports in Node.js).
You can override this by using the -b ("bare") flag to coffee, but this should only be used in very special cases. If you used it with the above example, the output you'd get would be
var x;
x = 'stringy string';
...
var x;
x = 42;
...
This could have dire consequences. To test this yourself, try adding setTimeout (-> alert x), 1 in foo.coffee. And note that you don't have to concatenate the two JS files yourself—if you use two separate <script> tags to include them on a page, they still effectively run as one file.
By isolating the scopes of different modules, the CoffeeScript compiler saves you from the headache of worrying whether different files in your project might use the same local variable names. This is common practice in the JavaScript world (see, for instance, the jQuery source, or just about any jQuery plugin)—CoffeeScript just takes care of it for you.
The good thing about this approach is that it creates private variables, so there won't be any conflict with variable names:
(function() {
var privateVar = 'test';
alert(privateVar); // test
})();
alert(typeof privateVar); // undefined
The addition of .call(this) makes the this keyword refer to the same value as it referred to outside the function. If it is not added, the this keyword will automatically refer to the global object.
A small example to show the difference follows:
function coffee(){
this.val = 'test';
this.module = (function(){
return this.val;
}).call(this);
}
var instance = new coffee();
alert(instance.module); // test
function coffee(){
this.val = 'test';
this.module = (function(){
return this.val;
})();
}
var instance = new coffee();
alert(typeof instance.module); // undefined
This is similar syntax to this:
(function() {
}());
which is called an immediate function. the function is defined and executed immediately. the pros to this is that you can place all of your code inside this block, and assign the function to a single global variable, thus reducing global namespace pollution. it provides a nice contained scope within the function.
This is the typical pattern i use when writing a module:
var MY_MODULE = (function() {
//local variables
var variable1,
variable2,
_self = {},
etc
// public API
_self = {
someMethod: function () {
}
}
return _self;
}());
not sure what the cons might be exactly, if someone else knows of any i would be happy to learn about them.
What is the 'best practise' with regard to coding style.
Should I use _ for private members?
Should I use this._privateMember?
Please re-write my code in proper style if its wrong:
(function()){
var _blah = 1;
someFunction = function() {
alert(_blah);
};
someOtherFunction = function {
someFunction();
}
}();
I would read this and incorporate any part you agree with:
http://javascript.crockford.com/code.html
You do not have to agree with all of it
I don't think there is one. Use a prefix if you think it helps.
I use _ for private members because it distinguishes them which can be quite helpful in Javascript when you have variables coming from all over the place. But they do clutter the code a bit.
I don't use _ for variables that are declared using var. I do however, use _ to denote object members that shouldn't be access directly.
Some people (who are strange in my opinion :P), also prefix variables with a $ if it contains a jQuery object.
As long as you are consistent with what you do, there are no problems.
Aside from just the code you're showing now, you should use Capital Letters to distinguish constructor functions, and camelCase to name instances of objects.
function Foo (val) {
this.set(val);
};
Foo.prototype.get = function () {
return this._dontTouchMePlease;
};
Foo.prototype.set = function(val) {
this._dontTouchMePlease = parseInt(val, 10);
};
var aFoo = new Foo(6);
I think that its generally accepted that if a variable name starts with a _, you probably shouldn't touch it (accept in dire cirumcstances and even then, two keys and special codes should be provided).
If I'm remembering my Crockford correctly, you'll want to put var in front of the two inner functions, otherwise they will be implicit globals. If you want them to be globals, then that's moot. Either way, your second inner function declaration should probably end in a semicolon. This might be a misformating thing, but I think its generally accepted that the bodies of functions are indented one more level in. Also, I've never seen the (function()){/* stuff */}(); construction before, but that says pretty much nothing.
I'd write it these ways - one for if your just declaring a function and another for if your using an anonymous function and immediately applying it to get a result, because I don't which one you're trying to do (again, if you want the inner functions to be global, then this won't be what you intended):
//function declaration
var myFunction = function () {
var _blah = 1;
var someFunction () {
alert(_blah); //or console.log(_blah); for debugging purposes
};
var someOtherFunction () {
someFunction();
};
};
//using a one-of to assign a result
/* NOTE: if you are using this version, myResult will be undefined
(at least given the functions you provided), but like I said,
I don't recognize the construction you provided, and am therefore
assuming that you meant one of these two, which could be a perfectly
falacious assumption, and in that case, my apologies
*/
var myResult = function () {
var _blah = 1;
var someFunction () {
alert(_blah);
};
var someOtherFunction () {
someFunction();
};
}();
BTW, (and I don't want to overstep) Crockford's "JavaScript: The Good Parts" is a stellar reference for JavaScript syntax. He also has, on his website a JavaScript style guide of sorts (though I don't know how widely followed it is). Link is: http://javascript.crockford.com/code.html
I also use the "_" in c# for private/protected members. It is a fast way to see if the variable is a member-variable or not. And you can access it faster with code-completion because you don't get in mess with the public members (private: _blah , public property: Blah).
But are there any private members in javascript anyway? I think every variable defined as member is accessible from the outside.
So you don't have a public wrapper for the private member. And that means the "_" is a bit a overhead and the same can be achieved with "this.".
I prefer you to use the following stuffs which is preferably used around the world programmers.. see below
i = Int
f = Float
o = Object
r = Return
a = Array
e = Element
g = Global declaration
hook = a function which can be used for hooking with other functions
call = a function which can be used for making call from client to server system
sync = a function which can be used for SYNC
and so on.. you can prefix on your coding...
I am now in the process of removing most globals from my code by enclosing everything in a function, turning the globals into "pseudo globals," that are all accessible from anywhere inside that function block.
(function(){
var g = 1;
var func f1 = function () { alert (g); }
var func f2= function () { f1(); }
})();
(technically this is only for my "release version", where I append all my files together into a single file and surround them with the above....my dev version still has typically one global per js file)
This all works great except for one thing...there is one important place where I need to access some of these "globals" by string name. Previously, I could have done this:
var name = "g";
alert (window[name]);
and it did the same as
alert(g);
Now -- from inside the block -- I would like to do the same, on my pseudo-globals. But I can't, since they are no longer members of any parent object ("window"), even though are in scope.
Any way to access them by string?
Thanks...
Basically no, as answered indirectly by this question: Javascript equivalent of Python's locals()?
Your only real option would be to use eval, which is usually not a good or even safe idea, as described in this question: Why is using the JavaScript eval function a bad idea?
If the string name of those variables really and truly is defined in a safe way (e.g. not through user-input or anything), then I would recommend just using eval. Just be sure to think really long and hard about this and whether there is not perhaps a better way to do this.
You can name the function you are using to wrap the entire code.
Then set the "global" variable as a member of that function (remember functions are objects in JavaScript).
Then, you can access the variable exactly as you did before....just use the name of the function instead of "window".
It would look something like this:
var myApp = new (function myApp(){
this.g = "world";
//in the same scope
alert ( "Hello " + this["g"]);
})();
//outside
alert ( "Hello " + myApp["g"]);
if you want to access something in a global scope, you have to put something out there. in your case it's probably an object which references your closed off function.
var obj1 = new (function(){
var g = 1;
var func f1 = function () { alert (g); }
var func f2= function () { f1(); }
})();
you can add a method or property as a getter for g. if the value of g isn't constant you might do like
this.getG = function() { return g; };
you can work from there to access items by name, like
alert( obj1["getG"]() );
alert( window["obj1"]["getG"]() );