Good resources for extreme minified JavaScript (js1k-style)

Good resources for extreme minified JavaScript (js1k-style) - javascript

As I'm sure most of the JavaScripters out there are aware, there's a new, Christmas-themed js1k. I'm planning on entering this time, but I have no experience producing such minified code. Does anyone know any good resources for this kind of thing?

Google Closure Compiler is a good javascript minifier.
There is a good online tool for quick use, or you can download the tool and run it as part of a web site build process.
Edit: Added a non-exhaustive list of tricks that you can use to minify JavaScript extremely, before using a minifier:
Shorten long variable names
Use shortened references to built in variables like d=document;w=window.
Set Interval
The setInterval function can take either a function or a string. Pass in a string to reduce the number of characters used: setInterval('a--;b++',10). Note that passing in a string forces an eval invokation so it will be slower than passing in a function.
Reduce Mathematical Calculations
Example a=b+b+b can be reduced to a=3*b.
Use Scientific Notation
10000 can be expressed in scientific notation as 1E4 saving 2 bytes.
Drop leading Zeroes
0.2 = .2 saves a byte
Ternery Operator
if (a > b) {
result = x;
}
else {
result = y;
}
can be expressed as result=a>b?x:y
Drop Braces
Braces are only required for blocks of more than one statement.
Operator Precedence
Rely on operator precedence rather than adding unneeded brackets which aid code readability.
Shorten Variable Assignment
Rather than function x(){a=1,b=2;...}() pass values into the function, function x(a,b){...}(1,2)
Think outside the box
Don't automatically reach for standard ways of doing things. Rather than using d.getElementById('p') to get a reference to a DOM element, could you use b.children[4] where d=document;b=body.
Original source for above list of tricks:
http://thingsinjars.com/post/293/the-quest-for-extreme-javascript-minification/

Spolto is right.
Any code minifier won't do the trick alone. You need to first optimize your code and then make some dirty manual tweaks.
In addition to Spolto's list of tricks I want to encourage the use of logical operators instead of the classical if else syntax. ex:
The following code
if(condition){
exp1;
}else{
exp2;
}
is somewhat equivalent to
condition&&exp1||exp2;
Another thing to consider might be multiple variable declaration :
var a = 1;var b = 2;var c = 1;
can be rewritten as :
var a=c=1,b=2;
Spolto is also right about the braces. You should drop them. But in addition, you should know that they can be dropped even for blocks of more expressions by writing the expressions delimited by a comma(with a leading ; of course) :
if(condition){
exp1;
exp2;
exp3;
}else{
exp4;
exp5;
}
Can be rewritten as :
if(condition)exp1,exp2,exp3;
else exp4,exp5;
Although it's not much (it saves you only 1 character/block for those who are counting), it might come in handy. (By the way, the latest Google Closure Compiler does this trick too).
Another trick worth mentioning is the controversial with functionality.
If you care more about the size, then you should use this because it might reduce code size.
For example, let's consider this object method:
object.method=function(){
this.a=this.b;
this.c++;
this.d(this.e);
}
This can be rewritten as :
object.method=function(){
with(this){
a=b;
c++;
d(e);
}
}
which is in most cases signifficantly smaller.
Something that most code packers & minifiers do not do is replacing large repeating tokens in the code with smaller ones. This is a nasty hack that also requires the use of eval, but since we're in it for the space, I don't think that should be a problem. Let's say you have this code :
a=function(){/*code here*/};
b=function(){/*code here*/};
c=function(){/*code here*/};
/*...*/
z=function(){/*code here*/};
This code has many "function" keywords repeating. What if you could replace them with a single(unused) character and then evaluate the code?
Here's how I would do it :
eval('a=F(){/*codehere*/};b=F(){/*codehere*/};c=F(){/*codehere*/};/*...*/z=F(){/*codehere*/};'.replace(/function/g,'F'));
Of course the replaced token(s) can be anything since our code is reduced to an evaluated string (ex: we could've replaced =function(){ with F, thus saving even more characters).
Note that this technique must be used with caution, because you can easily screw up your code with multiple text replacements; moreover, you should use it only in cases where it helps (ex: if you only have 4 function tokens, replacing them with a smaller token and then evaluating the code might actually increase the code length :
var a = "eval(''.replace(/function/g,'F'))".length,
b = ('function'.length-'F'.length)*4;
alert("you should" + (a<b?"":" NOT") + " use this technique!");

In the following link you'll find surprisingly good tricks to minify js code for this competition:
http://www.claudiocc.com/javascript-golfing/
One example: (extracted from section Short-circuit operators):
if (p) p=q; // before
p=p&&q; // after
if (!p) p=q; // before
p=p||q; // after
Or the more essoteric Canvas context hash trick:
// before
a.beginPath
a.fillRect
a.lineTo
a.stroke
a.transform
a.arc
// after
for(Z in a)a[Z[0]+(Z[6]||Z[2])]=a[Z];
a.ba
a.fc
a.ln
a.sr
a.to
a.ac
And here is another resource link with amazingly good tricks: https://github.com/jed/140bytes/wiki/Byte-saving-techniques

First off all, just throwing your code into a minifier won't help you that much. You need to have the extreme small file size in mind when you write the code. So in part, you need to learn all the tricks yourself.
Also, when it comes to minifiers, UglifyJS is the new shooting star here, its output is smaller than GCC's and it's way faster too. And since it's written in pure JavaScript it should be trivial for you to find out what all the tricks are that it applies.
But in the end it all comes down to whether you can find an intelligent, small solution for something that's awsome.

Also:
Dean Edwards Packer
http://dean.edwards.name/packer/
Uglify JS
http://marijnhaverbeke.nl/uglifyjs

A friend wrote jscrush packer for js1k.
Keep in mind to keep as much code self-similar as possible.
My workflow for extreme packing is: closure (pretty print) -> hand optimizations, function similarity, other code similarity -> closure (whitespace only) -> jscrush.
This packs away about 25% of the data.
There's also packify, but I haven't tested that myself.

This is the only online version of #cowboy's packer script:
http://iwantaneff.in/packer/
Very handy for packing / minifying JS

Related

Javascript shorthand loading time

I was playing around so I ran into a JS shorthands. I know they, of course, do not change code however do they lower loading time since there is less data?
Testing codes such as one below in Chrome DOM inspector did not give me an answer (probably because they are one-line codes so it does not make any difference).
if (x == 0) {x=1} else {x=2}
x == 0 ? x = 1 : x = 2;

If your goal is to optimize the speed with which your page loads by minimizing the size of your JS payload, there are lots of tools that will automatically rebuild your files into a single bundle that is compressed (i.e., all unnecessary whitespace removed, variables/functions renamed to shorter lengths, etc.). When it comes to writing code, you should always value readability first.
Write code that other people can easily understand. Then, when you're ready to deploy, look into a tool like UglifyJS2, which will enable you to take code like this:
function square(numToSquare) {
var squareProduct = numToSquare * numToSquare;
return squareProduct;
}
square(15);
..and turn it into this:
function square(r){return r*r}square(15);

The less characters and whitespace in a file, the lower the download size of said scripts is.
Readability is also a matter of utmost importance though, and ternary operators can be confusing in certain scenarios.
I would recommend that for those cases where you expect your codebase to increase to a certain extend over time, you stick to more readable constructs and use a minification/uglification process to lower file size.

Translating conditional statements in string format to actual logic

I have a good knowledge of real time graphics programming and web development, and I've started a project that requires me to take a user-created conditional string and actually use those conditions in code. This is an entirely new kind of programming problem for me.
I've tried a few experiments using loops and slicing up the conditional string...but I feel like I am missing some kind of technique that would make this more efficient and straightforward. I have a feeling regular expressions may be useful here, but perhaps not.
Here is an example string:
"IF#VAR#>=2AND$VAR2$==1OR#VAR3#<=3"
The values for those actual variables will come from an array of objects. Also, the different marker symbols around the variables denote different object arrays where the actual value can be found (variable name is an index).
I have complete control over how the conditional string is formatted (adding symbols around IF/ELSE/ELSEIF AND/OR
well as special symbols around the different operands) so my options are fairly open. How would you approach such a programming problem?

The problem you're facing is called parsing and there are numerous solutions to it. First, you can write your own "interpreter" for your mini-language, including lexer (which splits the string into tokens), parser (which builds a tree structure from a stream of tokens) and executor, which walks the tree and computes the final value. Or you can use a parser generator like PEG and have the whole thing built for you automatically - you just provide the rules of your language. Finally, you can utilize javascript built-in parser/evaluator eval. This is by far the simplest option, but eval only understands javascript syntax - so you'll have to translate your language to javascript before eval'ing it. And since eval can run arbitrary code, it's not for use in untrusted environments.
Here's an example on how to use eval with your sample input:
expr = "#VAR#>=2AND$VAR2$==1OR#VAR3#<=3"
vars = {
"#": {"VAR":5},
"$": {"VAR2":1},
"#": {"VAR3":7}
}
expr = expr.replace(/([##$])(\w+)(\1)/g, function($0, $1, $2) {
return "vars['" + $1 + "']." + $2;
}).replace(/OR/g, "||").replace(/AND/g, "&&")
result = eval(expr) // returns true

Why is String.replace() with lambda slower than a while-loop repeatedly calling RegExp.exec()?

One problem:
I want to process a string (str) so that any parenthesised digits (matched by rgx) are replaced by values taken from the appropriate place in an array (sub):
var rgx = /\((\d+)\)/,
str = "this (0) a (1) sentence",
sub = [
"is",
"test"
],
result;
The result, given the variables declared above, should be 'this is a test sentence'.
Two solutions:
This works:
var mch,
parsed = '',
remainder = str;
while (mch = rgx.exec(remainder)) { // Not JSLint approved.
parsed += remainder.substring(0, mch.index) + sub[mch[1]];
remainder = remainder.substring(mch.index + mch[0].length);
}
result = (parsed) ? parsed + remainder : str;
But I thought the following code would be faster. It has fewer variables, is much more concise, and uses an anonymous function expression (or lambda):
result = str.replace(rgx, function() {
return sub[arguments[1]];
});
This works too, but I was wrong about the speed; in Chrome it's surprisingly (~50%, last time I checked) slower!
...
Three questions:
Why does this process appear to be slower in Chrome and (for example) faster in Firefox?
Is there a chance that the replace() method will be faster compared to the while() loop given a bigger string or array? If not, what are its benefits outside Code Golf?
Is there a way optimise this process, making it both more efficient and as fuss-free as the functional second approach?
I'd welcome any insights into what's going on behind these processes.
...
[Fo(u)r the record: I'm happy to be called out on my uses of the words 'lambda' and/or 'functional'. I'm still learning about the concepts, so don't assume I know exactly what I'm talking about and feel free to correct me if I'm misapplying the terms here.]

Why does this process appear to be slower in Chrome and (for example) faster in Firefox?
Because it has to call a (non-native) function, which is costly. Firefox's engine might be able to optimize that away by recognizing and inlining the lookup.
Is there a chance that the replace() method will be faster compared to the while() loop given a bigger string or array?
Yes, it has to do less string concatenation and assignments, and - as you said - less variables to initialize. Yet you can only test it to prove my assumptions (and also have a look at http://jsperf.com/match-and-substitute/4 for other snippets - you for example can see Opera optimizing the lambda-replace2 which does not use arguments).
If not, what are its benefits outside Code Golf?
I don't think code golf is the right term. Software quality is about readabilty and comprehensibility, in whose terms the conciseness and elegance (which is subjective though) of the functional code are the reasons to use this approach (actually I've never seen a replace with exec, substring and re-concatenation).
Is there a way optimise this process, making it both more efficient and as fuss-free as the functional second approach?
You don't need that remainder variable. The rgx has a lastIndex property which will automatically advance the match through str.

Your while loop with exec() is slightly slower than it should be, since you are doing extra work (substring) as you use exec() on a non-global regex. If you need to loop through all matches, you should use a while loop on a global regex (g flag enabled); this way, you avoid doing extra work trimming the processed part of the string.
var rgR = /\((\d+)\)/g;
var mch,
result = '',
lastAppend = 0;
while ((mch = rgR.exec(str)) !== null) {
result += str.substring(lastAppend, mch.index) + sub[mch[1]];
lastAppend = rgR.lastIndex;
}
result += str.substring(lastAppend);
This factor doesn't disturb the performance disparity between different browser, though.
It seems the performance difference comes from the implementation of the browser. Due to the unfamiliarity with the implementation, I cannot answer where the difference comes from.
In terms of power, exec() and replace() have the same power. This includes the cases where you don't use the returned value from replace(). Example 1. Example 2.
replace() method is more readable (the intention is clearer) than a while loop with exec() if you are using the value returned by the function (i.e. you are doing real replacement in the anonymous function). You also don't have to reconstruct the replaced string yourself. This is where replace is preferred over exec(). (I hope this answers the second part of question 2).
I would imagine exec() to be used for the purposes other than replacement (except for very special cases such as this). Replacement, if possible, should be done with replace().
Optimization is only necessary, if performance degrades badly on actual input. I don't have any optimization to show, since the 2 only possible options are already analyzed, with contradicting performance between 2 different browser. This may change in the future, but for now, you can choose the one that has better worst-performance-across-browser to work with.

Sorting Special Characters in Javascript (Æ)

I'm trying to sort an array of objects based on the objects' name property. Some of the names start with 'Æ', and I'd like for them to be sorted as though they were 'Ae'. My current solution is the following:
myArray.sort(function(a, b) {
var aName = a.name.replace(/Æ/gi, 'Ae'),
bName = b.name.replace(/Æ/gi, 'Ae');
return aName.localeCompare(bName);
});
I feel like there should be a better way of handling this without having to manually replace each and every special character. Is this possible?
I'm doing this in Node.js if it makes any difference.

There is no simpler way. Unfortunately, even the way described in the question is too simple, at least if portability is of any concern.
The localeCompare method is by definition implementation-dependent, and it usually depends on the UI language of the underlying operating system, though it may also differ between browsers (or other JavaScript implementations) in the same computer. It can be hard to find any documentation on it, so even if you aim at writing non-portable code, you might need to do a lot of testing to see which collation order is applied. Cf. to Sorting strings is much harder than you thought!
So to have a controlled and portable comparison, you need to code it yourself, unless you are lucky enough to find someone else’s code that happens to suit your needs. On the positive side, the case conversion methods are one of the few parts of JavaScript that are localization-ready: they apply Unicode case mapping rules, so e.g. 'æ'.toUpperCase() yields Æ in any implementation.
In general, sorting strings requires a complicated function that applies specific sorting rules as defined for a language or by some other rules, such as the Pan-European sorting rules (intended for multilingual content). But if we can limit ourselves to sorting rules that deal with just a handful of letters in addition to Ascii, we can use code like the following simplified sorting for German (extract from by book Going Global with JavaScript and Globalize.js):
String.prototype.removeUmlauts = function () {
return this.replace(/Ä/g,'A').replace(/Ö/g,'O').replace(/Ü/g,'U');
};
function alphabetic(str1, str2) {
var a = str1.toUpperCase().removeUmlauts();
var b = str2.toUpperCase().removeUmlauts();
return a < b ? -1 : a > b ? 1 : 0;
}
You could adds other mappings, like replace(/Æ/gi, 'Ae'), to this, after analyzing the characters that may appear and deciding how to deal with them. Removing diacritic marks (e.g. mapping É to E) is simplistic but often good enough, and surely better than leaving it to implementations to decide whether É is somewhere after Z. And at least you would get consistent results across implementations, and you would see what things go wrong and need fixing, instead of waiting for other users complain that your code sorts all wrong (in their environment).

I need a Javascript literal syntax converter/deobfuscation tools

I have searched Google for a converter but I did not find anything. Is there any tools available or I must make one to decode my obfuscated JavaScript code ?
I presume there is such a tool but I'm not searching Google with the right keywords.
The code is 3 pages long, this is why I need a tools.
Here is an exemple of the code :
<script>([][(![]+[])[!+[]+!+[]+!+[]]+(!![]+[][(![]+[])[+[]]+([![]]+[][[]])[+!+[]+[+[]]]+(![]+[])[!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+!+[]]])[+!+[]+[+[]]]+(!![]+[])[+!+[]]+(!![]+[])[+[]]][([][(![]+[])[+[]]+([![]]+[][[]])[+!+[]+[+[]]]+(![]+[])[!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+!+[]]]+[])[!+[]+!+[]+!+[]]+(![]+[])[+!+[]]+(![]+[])[!+[]+!+[]]+(![]+[])[!+[]+!+[]]]()[(!![]+[])[!+[]+!+[]+!+[]]+(+(+[])+[][(![]+[])[+[]]+([![]]+[][[]])[+!+[]+[+[]]]+(![]+[])[!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+!+[]]])[!+[]+!+[]+!+[]+[+[]]]+(![]+[])[+!+[]]+(![]+[])[!+[]+!+[]]])(([]+[])[([][(![]+[])[+[]]+([![]]+[][[]])[+!+[]+[+[]]]+(![]+[])[!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+!+[]]]+[])[!+[]+!+[]+!+[]]+(!![]+
Thank you

This code is fascinating because it seems to use only nine characters ("[]()!+,;" and empty space U+0020) yet has some sophisticated functionality. It appears to use JavaScript's implicit type conversion to coerce arrays into various primitive types and their string representations and then use the characters from those strings to compose other strings which type out the names of functions which are then called.
Consider the following snippet which evaluates to the array filter function:
([][
(![]+[])[+[]] // => "f"
+ ([![]]+[][[]])[+!+[]+[+[]]] // => "i"
+ (![]+[])[!+[]+!+[]] // => "l"
+ (!![]+[])[+[]] // => "t"
+ (!![]+[])[!+[]+!+[]+!+[]] // => "e"
+ (!![]+[])[+!+[]] // => "r"
]) // => function filter() { /* native code */ }
Reconstructing the code as such is time consuming and error prone, so an automated solution is obviously desirable. However, the behavior of this code is so tightly bound to the JavaScript runtime that de-obsfucating it seems to require a JS interpreter to evaluate the code.
I haven't been able to find any tools that will work generally with this sort of encoding. It seems as though you'll have to study the code further and determine any patterns of usage (e.g. reliance on array methods) and figure out how to capture their usage (e.g. by wrapping high-level functions [such as Function.prototype.call]) to trace the code execution for you.

This question has already an accepted answer, but I will still post to clear some things up.
When this idea come up, some guy made a generator to encode JavaScript in this way. It is based on doing []["sort"]["call"]()["eval"](/* big blob of code here */). Therefore, you can decode the results of this encoder easily by removing the sort-call-eval part (i.e. the first 1628 bytes). In this case it produces:
if (document.cookie=="6ffe613e2919f074e477a0a80f95d6a1"){ alert("bravo"); }
else{ document.location="http://www.youtube.com/watch?v=oHg5SJYRHA0"; }
(Funny enough the creator of this code was not even able to compress it properly and save a kilobyte)
There is also an explanation of why this code doesn't work in newer browser anymore: They changed Array.prototype.sort so it does not return a reference to window. As far as I remember, this was the only way to get a reference to window, so this code is kind of broken now.

Develop Reference

JavaScript is the programming language of the Web.

Good resources for extreme minified JavaScript (js1k-style) - javascript

As I'm sure most of the JavaScripters out there are aware, there's a new, Christmas-themed js1k. I'm planning on entering this time, but I have no experience producing such minified code. Does anyone know any good resources for this kind of thing?

Also: Dean Edwards Packer http://dean.edwards.name/packer/ Uglify JS http://marijnhaverbeke.nl/uglifyjs

This is the only online version of #cowboy's packer script: http://iwantaneff.in/packer/ Very handy for packing / minifying JS

Related

Javascript shorthand loading time

Translating conditional statements in string format to actual logic

Why is String.replace() with lambda slower than a while-loop repeatedly calling RegExp.exec()?

Sorting Special Characters in Javascript (Æ)

I need a Javascript literal syntax converter/deobfuscation tools

Categories

Resources