I need a Javascript literal syntax converter/deobfuscation tools - javascript

I have searched Google for a converter but I did not find anything. Is there any tools available or I must make one to decode my obfuscated JavaScript code ?
I presume there is such a tool but I'm not searching Google with the right keywords.
The code is 3 pages long, this is why I need a tools.
Here is an exemple of the code :
<script>([][(![]+[])[!+[]+!+[]+!+[]]+(!![]+[][(![]+[])[+[]]+([![]]+[][[]])[+!+[]+[+[]]]+(![]+[])[!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+!+[]]])[+!+[]+[+[]]]+(!![]+[])[+!+[]]+(!![]+[])[+[]]][([][(![]+[])[+[]]+([![]]+[][[]])[+!+[]+[+[]]]+(![]+[])[!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+!+[]]]+[])[!+[]+!+[]+!+[]]+(![]+[])[+!+[]]+(![]+[])[!+[]+!+[]]+(![]+[])[!+[]+!+[]]]()[(!![]+[])[!+[]+!+[]+!+[]]+(+(+[])+[][(![]+[])[+[]]+([![]]+[][[]])[+!+[]+[+[]]]+(![]+[])[!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+!+[]]])[!+[]+!+[]+!+[]+[+[]]]+(![]+[])[+!+[]]+(![]+[])[!+[]+!+[]]])(([]+[])[([][(![]+[])[+[]]+([![]]+[][[]])[+!+[]+[+[]]]+(![]+[])[!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+!+[]]]+[])[!+[]+!+[]+!+[]]+(!![]+
Thank you

This code is fascinating because it seems to use only nine characters ("[]()!+,;" and empty space U+0020) yet has some sophisticated functionality. It appears to use JavaScript's implicit type conversion to coerce arrays into various primitive types and their string representations and then use the characters from those strings to compose other strings which type out the names of functions which are then called.
Consider the following snippet which evaluates to the array filter function:
([][
(![]+[])[+[]] // => "f"
+ ([![]]+[][[]])[+!+[]+[+[]]] // => "i"
+ (![]+[])[!+[]+!+[]] // => "l"
+ (!![]+[])[+[]] // => "t"
+ (!![]+[])[!+[]+!+[]+!+[]] // => "e"
+ (!![]+[])[+!+[]] // => "r"
]) // => function filter() { /* native code */ }
Reconstructing the code as such is time consuming and error prone, so an automated solution is obviously desirable. However, the behavior of this code is so tightly bound to the JavaScript runtime that de-obsfucating it seems to require a JS interpreter to evaluate the code.
I haven't been able to find any tools that will work generally with this sort of encoding. It seems as though you'll have to study the code further and determine any patterns of usage (e.g. reliance on array methods) and figure out how to capture their usage (e.g. by wrapping high-level functions [such as Function.prototype.call]) to trace the code execution for you.

This question has already an accepted answer, but I will still post to clear some things up.
When this idea come up, some guy made a generator to encode JavaScript in this way. It is based on doing []["sort"]["call"]()["eval"](/* big blob of code here */). Therefore, you can decode the results of this encoder easily by removing the sort-call-eval part (i.e. the first 1628 bytes). In this case it produces:
if (document.cookie=="6ffe613e2919f074e477a0a80f95d6a1"){ alert("bravo"); }
else{ document.location="http://www.youtube.com/watch?v=oHg5SJYRHA0"; }
(Funny enough the creator of this code was not even able to compress it properly and save a kilobyte)
There is also an explanation of why this code doesn't work in newer browser anymore: They changed Array.prototype.sort so it does not return a reference to window. As far as I remember, this was the only way to get a reference to window, so this code is kind of broken now.

Related

What is the default “tag” function for template literals?

What is the name of the native function that handles template literals?
That is, I know that when you write tag`Foo ${'bar'}.`;, that’s just syntactic sugar for tag(['Foo ', '.'], 'bar');.¹
But what about just ​`Foo ${'bar'}.`;? I can’t just “call” (['Foo ', '.'], 'bar');. If I already have arguments in that form, what function should I pass them to?
I am only interested in the native function that implements the template literal functionality. I am quite capable of rolling my own, but the purpose of this question is to avoid that and do it “properly”—even if my implementation is a perfect match of current native functionality, the native functionality can change and I want my usage to still match. So answers to this question should take on one of the following forms:
The name of the native function to use, ideally with links to and/or quotes from documentation of it.
Links to and/or quotes from the spec that defines precisely what the implementation of this function is, so that if I roll my own at least I can be sure it’s up to the (current) specifications.
A backed-up statement that the native implementation is unavailable and unspecified. Ideally this is backed up by, again, links to and/or quotes from documentation, but if that’s unavailable, I’ll accept other sources or argumentation that backs this claim up.
Actually, the first argument needs a raw property, since it’s a TemplateStringsArray rather than a regular array, but I’m skipping that here for the sake of making the example more readable.
Motivation
I am trying to create a tag function (tag, say) that, internally, performs the default template literal concatenation on the input. That is, I am taking the TemplateStringsArray and the remaining arguments, and turning them into a single string that has already had its templating sorted out. (This is for passing the result into another tag function, otherTag perhaps, where I want the second function to treat everything as a single string literal rather than a broken up template.)
For example, tag`Something ${'cooked'}.`; would be equivalent to otherTag`Something cooked.`;.
My current approach
The definition of tag would look something like this:
function tag(textParts, ...expressions) {
const cooked = // an array with a single string value
const raw = // an array with a single string value
return otherTag({ ...cooked, raw });
}
Defining the value of raw is fairly straightforward: I know that String.raw is the tag function I need to call here, so const raw = [String.raw(textParts.raw, ...expressions)];.
But I cannot find anywhere on the internet what function I would call for the cooked part of it. What I want is, if I have tag`Something ${'cooked'}.`;, I want const cooked = `Something ${cooked}.`; in my function. But I can’t find the name of whatever function accomplishes that.
The closest I’ve found was a claim that it could be implemented as
const cooked = [expressions.map((exp, i) => textParts[i] + exp).join('')];
This is wrong—textParts may be longer than expressions, since tag`Something ${'cooked'}.`; gets ['Something ', '.'] and ['cooked'] as its arguments.
Improving this expression to handle that isn’t a problem:
const cooked = [
textParts
.map((text, i) => (i > 0 ? expressions[i-1] : '') + text)
.join(''),
];
But that’s not the point—I don’t want to roll my own here and risk it being inconsistent with the native implementation, particularly if that changes.
The name of the native function to use, ideally with links to and/or quotes from documentation of it.
There isn't one. It is syntax, not a function.
Links to and/or quotes from the spec that defines precisely what the implementation of this function is, so that if I roll my own at least I can be sure it’s up to the (current) specifications.
Section 13.2.8 Template Literals of the specification explains how to process the syntax.

Translating conditional statements in string format to actual logic

I have a good knowledge of real time graphics programming and web development, and I've started a project that requires me to take a user-created conditional string and actually use those conditions in code. This is an entirely new kind of programming problem for me.
I've tried a few experiments using loops and slicing up the conditional string...but I feel like I am missing some kind of technique that would make this more efficient and straightforward. I have a feeling regular expressions may be useful here, but perhaps not.
Here is an example string:
"IF#VAR#>=2AND$VAR2$==1OR#VAR3#<=3"
The values for those actual variables will come from an array of objects. Also, the different marker symbols around the variables denote different object arrays where the actual value can be found (variable name is an index).
I have complete control over how the conditional string is formatted (adding symbols around IF/ELSE/ELSEIF AND/OR
well as special symbols around the different operands) so my options are fairly open. How would you approach such a programming problem?
The problem you're facing is called parsing and there are numerous solutions to it. First, you can write your own "interpreter" for your mini-language, including lexer (which splits the string into tokens), parser (which builds a tree structure from a stream of tokens) and executor, which walks the tree and computes the final value. Or you can use a parser generator like PEG and have the whole thing built for you automatically - you just provide the rules of your language. Finally, you can utilize javascript built-in parser/evaluator eval. This is by far the simplest option, but eval only understands javascript syntax - so you'll have to translate your language to javascript before eval'ing it. And since eval can run arbitrary code, it's not for use in untrusted environments.
Here's an example on how to use eval with your sample input:
expr = "#VAR#>=2AND$VAR2$==1OR#VAR3#<=3"
vars = {
"#": {"VAR":5},
"$": {"VAR2":1},
"#": {"VAR3":7}
}
expr = expr.replace(/([##$])(\w+)(\1)/g, function($0, $1, $2) {
return "vars['" + $1 + "']." + $2;
}).replace(/OR/g, "||").replace(/AND/g, "&&")
result = eval(expr) // returns true

Simple way to check/validate javascript syntax

I have some big set of different javascript-snippets (several thousands), and some of them have some stupid errors in syntax (like unmatching braces/quotes, HTML inside javascript, typos in variable names).
I need a simple way to check JS syntax. I've tried JSLint but it send too many warnings about style, way of variable definitions, etc. (even if i turn off all flags). I don't need to find out style problems, or improve javascript quality, i just need to find obvious syntax errors. Of course i can simply check it in browser/browser console, but i need to do it automatically as the number of that snippets is big.
Add:
JSLint/JSHint reports a lot of problems in the lines that are not 'beauty' but working (i.e. have some potential problems), and can't see the real problems, where the normal compiler will simply report syntax error and stop execution. For example, try to JSLint that code, which has syntax errors on line 4 (unmatched quotes), line 6 (comma required), and line 9 (unexpected <script>).
document.write('something');
a = 0;
if (window.location == 'http://google.com') a = 1;
document.write("aaa='andh"+a+"eded"');
a = {
something: ['a']
something2: ['a']
};
<script>
a = 1;
You could try JSHint, which is less verbose.
Just in case anyone is still looking you could try Esprima,
It only checks syntax, nothing else.
I've found that SpiderMonkey has ability to compile script without executing it, and if compilation failed - it prints error.
So i just created small wrapper for SpiderMonkey
sub checkjs {
my $js = shift;
my ( $js_fh, $js_tmpfile ) = File::Temp::tempfile( 'XXXXXXXXXXXX', EXLOCK => 0, UNLINK => 1, TMPDIR => 1 );
$| = 1;
print $js_fh $js;
close $js_fh;
return qx(js -C -f $js_tmpfile 2>&1);
}
And javascriptlint.com also deals very good in my case. (Thanks to #rajeshkakawat).
Lots of options if you have an exhaustive list of the JSLint errors you do want to capture.
JSLint's code is actually quite good and fairly easy to understand (I'm assuming you already know JavaScript fairly well from your question). You could hack it to only check what you want and to continue no matter how many errors it finds.
You could also write something quickly in Node.js to use JSLint as-is to check every file/snippet quickly and output only those errors you care about.
Just use node --check filename
Semantic Designs' (my company) JavaScript formatter read JS files and formats them. You don't want the formatting part.
To read the files it will format, it uses a full JavaScript parser, which does a complete syntax check (even inside regular expressions). If you run it and simply ignore the formatted result, you get a syntax checker.
You can give it big list of files and it will format all of them. You could use this to batch-check your large set. (If there are any syntax errors, it returns a nonzero error status to a shell).

Calling toString on a javascript function returns source code

I just found out that when you call toString() on a javascript function, as in myFunction.toString(), the source code of that function is returned.
If you try it in the Firebug or Chrome console it will even go as far as formatting it nicely for you, even for minimized javascript files.
I don't know what is does for obfuscated files.
What's the use of such a toString implementation?
It has some use for debugging, since it lets you see the code of the function. You can check if a function has been overwritten, and if a variable points to the right function.
It has some uses for obfuscated javascript code. If you want to do hardcore obfuscation in javascript, you can transform your whole code into a bunch of special characters, and leave no numbers or letters. This technique relies heavily on being able to access most letters of the alphabet by forcing the toString call on everything with +""
example: (![]+"")[+[]] is f since (![]+"") evaluates to the string "false" and [+[]] evaluates to [0], thus you get "false"[0] which extracts the first letter f.
Some letters like v can only be accessed by calling toString on a native function like [].sort. The letter v is important for obfuscated code, since it lets you call eval, which lets you execute anything, even loops, without using any letters. Here is an example of this.
function.ToString - Returns a string representing the source code of the function. For Function objects, the built-in toString method decompiles the function back into the JavaScript source that defines the function.
Read this on mozilla.
You can use it as an implementation for multi-line strings in Javascript source.
As described in this blog post by #tjanczuk, one of the massive inconveniences in Javascript is multi-line strings. But you can leverage .toString() and the syntax for multi-line comments (/* ... */) to produce the same results.
By using the following function:
function uncomment(fn){
return fn.toString().split(/\/\*\n|\n\*\//g).slice(1,-1).join();
};
…you can then pass in multi-line comments in the following format:
var superString = uncomment(function(){/*
String line 1
String line 2
String line 3
*/});
In the original article, it was noted that Function.toString()'s behaviour is not standardised and therefore implementation-discrete — and the recommended usage was for Node.js (where the V8 interpreter can be relied on); however, a Fiddle I wrote seems to work on every browser I have available to me (Chrome 27, Firefox 21, Opera 12, Internet Explorer 8).
A nice use case is remoting. Just toString the function in the client, send it over the wire and execute it on the server.
My use case - I have a node program that processes data and produces interactive reports as html/js/css files. To generate a js function, my node code calls myfunc.toString() and writes it to a file.
You can use it to create a Web Worker from function defined in the main script:
onmessage = function(e) {
console.log('[Worker] Message received from main script:',e.data);
postMessage('Worker speaking.');
}
b = new Blob(["onmessage = " + onmessage.toString()], {type: 'text/javascript'})
w = new Worker(window.URL.createObjectURL(b));
w.onmessage = function(e) {
console.log('[Main] Message received from worker script:' + e.data);
};
w.postMessage('Main speaking.');

Good resources for extreme minified JavaScript (js1k-style)

As I'm sure most of the JavaScripters out there are aware, there's a new, Christmas-themed js1k. I'm planning on entering this time, but I have no experience producing such minified code. Does anyone know any good resources for this kind of thing?
Google Closure Compiler is a good javascript minifier.
There is a good online tool for quick use, or you can download the tool and run it as part of a web site build process.
Edit: Added a non-exhaustive list of tricks that you can use to minify JavaScript extremely, before using a minifier:
Shorten long variable names
Use shortened references to built in variables like d=document;w=window.
Set Interval
The setInterval function can take either a function or a string. Pass in a string to reduce the number of characters used: setInterval('a--;b++',10). Note that passing in a string forces an eval invokation so it will be slower than passing in a function.
Reduce Mathematical Calculations
Example a=b+b+b can be reduced to a=3*b.
Use Scientific Notation
10000 can be expressed in scientific notation as 1E4 saving 2 bytes.
Drop leading Zeroes
0.2 = .2 saves a byte
Ternery Operator
if (a > b) {
result = x;
}
else {
result = y;
}
can be expressed as result=a>b?x:y
Drop Braces
Braces are only required for blocks of more than one statement.
Operator Precedence
Rely on operator precedence rather than adding unneeded brackets which aid code readability.
Shorten Variable Assignment
Rather than function x(){a=1,b=2;...}() pass values into the function, function x(a,b){...}(1,2)
Think outside the box
Don't automatically reach for standard ways of doing things. Rather than using d.getElementById('p') to get a reference to a DOM element, could you use b.children[4] where d=document;b=body.
Original source for above list of tricks:
http://thingsinjars.com/post/293/the-quest-for-extreme-javascript-minification/
Spolto is right.
Any code minifier won't do the trick alone. You need to first optimize your code and then make some dirty manual tweaks.
In addition to Spolto's list of tricks I want to encourage the use of logical operators instead of the classical if else syntax. ex:
The following code
if(condition){
exp1;
}else{
exp2;
}
is somewhat equivalent to
condition&&exp1||exp2;
Another thing to consider might be multiple variable declaration :
var a = 1;var b = 2;var c = 1;
can be rewritten as :
var a=c=1,b=2;
Spolto is also right about the braces. You should drop them. But in addition, you should know that they can be dropped even for blocks of more expressions by writing the expressions delimited by a comma(with a leading ; of course) :
if(condition){
exp1;
exp2;
exp3;
}else{
exp4;
exp5;
}
Can be rewritten as :
if(condition)exp1,exp2,exp3;
else exp4,exp5;
Although it's not much (it saves you only 1 character/block for those who are counting), it might come in handy. (By the way, the latest Google Closure Compiler does this trick too).
Another trick worth mentioning is the controversial with functionality.
If you care more about the size, then you should use this because it might reduce code size.
For example, let's consider this object method:
object.method=function(){
this.a=this.b;
this.c++;
this.d(this.e);
}
This can be rewritten as :
object.method=function(){
with(this){
a=b;
c++;
d(e);
}
}
which is in most cases signifficantly smaller.
Something that most code packers & minifiers do not do is replacing large repeating tokens in the code with smaller ones. This is a nasty hack that also requires the use of eval, but since we're in it for the space, I don't think that should be a problem. Let's say you have this code :
a=function(){/*code here*/};
b=function(){/*code here*/};
c=function(){/*code here*/};
/*...*/
z=function(){/*code here*/};
This code has many "function" keywords repeating. What if you could replace them with a single(unused) character and then evaluate the code?
Here's how I would do it :
eval('a=F(){/*codehere*/};b=F(){/*codehere*/};c=F(){/*codehere*/};/*...*/z=F(){/*codehere*/};'.replace(/function/g,'F'));
Of course the replaced token(s) can be anything since our code is reduced to an evaluated string (ex: we could've replaced =function(){ with F, thus saving even more characters).
Note that this technique must be used with caution, because you can easily screw up your code with multiple text replacements; moreover, you should use it only in cases where it helps (ex: if you only have 4 function tokens, replacing them with a smaller token and then evaluating the code might actually increase the code length :
var a = "eval(''.replace(/function/g,'F'))".length,
b = ('function'.length-'F'.length)*4;
alert("you should" + (a<b?"":" NOT") + " use this technique!");
In the following link you'll find surprisingly good tricks to minify js code for this competition:
http://www.claudiocc.com/javascript-golfing/
One example: (extracted from section Short-circuit operators):
if (p) p=q; // before
p=p&&q; // after
if (!p) p=q; // before
p=p||q; // after
Or the more essoteric Canvas context hash trick:
// before
a.beginPath
a.fillRect
a.lineTo
a.stroke
a.transform
a.arc
// after
for(Z in a)a[Z[0]+(Z[6]||Z[2])]=a[Z];
a.ba
a.fc
a.ln
a.sr
a.to
a.ac
And here is another resource link with amazingly good tricks: https://github.com/jed/140bytes/wiki/Byte-saving-techniques
First off all, just throwing your code into a minifier won't help you that much. You need to have the extreme small file size in mind when you write the code. So in part, you need to learn all the tricks yourself.
Also, when it comes to minifiers, UglifyJS is the new shooting star here, its output is smaller than GCC's and it's way faster too. And since it's written in pure JavaScript it should be trivial for you to find out what all the tricks are that it applies.
But in the end it all comes down to whether you can find an intelligent, small solution for something that's awsome.
Also:
Dean Edwards Packer
http://dean.edwards.name/packer/
Uglify JS
http://marijnhaverbeke.nl/uglifyjs
A friend wrote jscrush packer for js1k.
Keep in mind to keep as much code self-similar as possible.
My workflow for extreme packing is: closure (pretty print) -> hand optimizations, function similarity, other code similarity -> closure (whitespace only) -> jscrush.
This packs away about 25% of the data.
There's also packify, but I haven't tested that myself.
This is the only online version of #cowboy's packer script:
http://iwantaneff.in/packer/
Very handy for packing / minifying JS

Categories

Resources