I am experiencing a classic JS case (in my opinion), but after a lot of googling I am still not able to find a solution. Backslash is an escape character in JS, so what do you do when you need to pass a Windows path from JS and print it?
I am using eval because my Java applet is executing the code and placing bits when it has a string to evaluate. That's why eval is necessary; I have made an example, which is below:
<div id="mainTabs"></div>
<script>
var s = "document.getElementById('mainTabs').innerHTML='\\C\ganye\file.doc'";
eval(s);
</script>
I tried double backslashes, but that did not work. If anyone could help me get around this with as little hassle as possible, I would be grateful.
Because you're using eval, the Javascript interpreter is getting invoked twice - so you need quadruple backslashes, not double:
var s = "document.getElementById('mainTabs').innerHTML='\\\\\\\\C\\\\ganye\\\\file.doc'";
This results in s getting set to:
document.getElementById('mainTabs').innerHTML='\\\\C\\ganye\\file.doc'
so the innerHTML gets set to:
\\C\ganye\file.doc
which is what you wanted. (I'm not sure I understand your reasons for needing eval(), but this is how to work around the problem if you do :-)
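To make the two parsing passes visible, here is a minimal sketch that runs outside the browser. It drops the DOM assignment and evaluates only the quoted literal, so the variable `shown` is an illustrative stand-in for what would end up in innerHTML.

```javascript
// Pass 1: the JS parser reads this literal; every \\ pair becomes one backslash.
var s = "'\\\\\\\\C\\\\ganye\\\\file.doc'";
console.log(s);      // '\\\\C\\ganye\\file.doc'

// Pass 2: eval parses that text as source code, halving the backslashes again.
var shown = eval(s);
console.log(shown);  // \\C\ganye\file.doc
```

Each backslash you want to see at the end costs two per parsing pass, which is where the factor of four comes from.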
You need to quadruple the backslashes, because the string literal is first interpreted by the JS parser, and then the result is parsed again due to the eval call.
Or, preferably, try to avoid using eval. It is almost never necessary and it adds complication and slows down execution.
This example would work as just:
document.getElementById('mainTabs').innerHTML='\\\\C\\ganye\\file.doc';
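If eval really cannot be avoided, one way to sidestep counting backslashes by hand is to let JSON.stringify produce the quoted literal. This is only a sketch: `buildAssignment` is a hypothetical helper, and a plain object `target` stands in for the DOM element so the snippet runs anywhere.

```javascript
// Hypothetical helper: builds the assignment statement that will be handed to eval.
function buildAssignment(targetExpression, text) {
  // JSON.stringify wraps the text in quotes and escapes backslashes and quotes.
  return targetExpression + " = " + JSON.stringify(text) + ";";
}

var target = {};                       // stands in for the DOM element
var path = "\\\\C\\ganye\\file.doc";   // one level of escaping: \\C\ganye\file.doc
var code = buildAssignment("target.innerHTML", path);

eval(code);
console.log(target.innerHTML === path); // true
```

The path is then escaped only once in your own source, and the second level of escaping is generated rather than typed.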
I'm currently working on a small DSL, not unlike RABL. I'm struggling with the implementation of one of my rules. Before we get to the problem, I'll explain a bit about my syntax/grammar.
In my little language you can define properties, object/array blocks, or custom blocks (these are all used to build a json object/array). A "custom block" can either be one that contains my standard expressions (property, object/array block, etc) or some JavaScript. These expressions are written as such -
-- An object block
object #model
-- A property node
property some, key(name="value")
-- A custom node
object custom_obj as
property some(name="key")
end
-- A custom script node
property full_name as (u)
// This is JavaScript
return u.first_name + ' ' + u.last_name;
end
end
The problem I'm running into is with my custom script node. I'm having a really hard time defining the script token so that JISON can properly capture the stuff inside the block.
In my lexer, I currently have...
# script_param is basically a regex to match "(some_ident)"
{script_param} { this.begin('js'); return 'SCRIPT_PARAM'; }
<js>(.|\n|\r)*?"end" %{
this.popState();
yytext = yytext.substr(0, yyleng - 3).trim();
return 'SCRIPT';
%}
That SCRIPT token will basically match anything after (u) up to (and including) the end token (which usually ends a block). I really dislike this because my usual block terminator (end) is actually part of the script token, which feels totally hacky to me. Unfortunately, I'm not able to find a better way to capture ANYTHING between (..) and end.
I've tried writing a regex that captures anything that ends with a ";", but that poses problems when I have multiple script nodes in my dsl code. I've only been able to make this work by including the "end" keyword as part of my capture.
Here are the links to my grammar and lexer files.
I'd greatly appreciate any insight into solving my problem! If I didn't explain my problem clearly, let me know and I'll try my best to clarify!
Many thanks in advance!!
I will also happily accept any advice as to how to clean up my grammar. I'm still fairly new at this stuff and feel like my stuff is a mess right now :)
It's easy enough to match a string up to but not including the first instance of end:
([^e]|e[^n]|en[^d])*
(And it doesn't even need non-greedy repetition.)
However, that's not what you want. The included JavaScript might include:
variables whose names happen to include the characters end (tendency)
comments (/* Take the values up to the end of the line */)
character strings (if (word == "end"))
and, indeed, the word end itself, which is not a reserved word in js.
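A quick sketch of why stopping at the first occurrence of the characters end is not enough. The pattern below is a lookahead-based stand-in written for plain JavaScript (not the pattern above, and not Jison syntax), and the sample script line is made up.

```javascript
// Stand-in for "match up to the first end": keep consuming characters
// as long as the next three are not "end".
var upToFirstEnd = /^(?:(?!end)[\s\S])*/;

var script = "var tendency = u.trend; // up to the end of the line\nend";
var captured = upToFirstEnd.exec(script)[0];

console.log(JSON.stringify(captured)); // "var t"
```

The capture stops in the middle of the identifier tendency, long before the real block terminator.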
Really, the only clean solution is to be able to lex javascript. Fortunately, you don't have to do it precisely, because you're not interpreting it, but even so it is a bit of work. The most annoying part of javascript lexing, like other similar languages, is identifying when / is the beginning of a regular expression, and when it is just division; getting that right requires most of a javascript parser, particularly since it also interacts with the semicolon rule.
To deal with the fact that the included javascript might actually use a variable named end, you have a couple of choices, as far as I can see:
Document the fact that end is a reserved word.
Only recognize end when it appears outside of parentheses and in a place where a statement might start (not too difficult if you end up building enough of a JS parser to correctly identify regular expressions)
Only recognize end when it appears by itself on a line.
This last choice would really simplify your problem a lot, so you might want to think about it, although it's not really very elegant.
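As a rough sketch of that last option in plain JavaScript (outside of Jison; `splitScriptBlock` is a hypothetical helper name), the terminator can be anchored to a whole line so that end inside identifiers, comments or strings is ignored:

```javascript
// Splits the input at the first line that contains nothing but "end".
function splitScriptBlock(source) {
  // ^ and $ match at line boundaries because of the m flag.
  var terminator = /^[ \t]*end[ \t]*$/m;
  var match = terminator.exec(source);
  if (match === null) {
    return null; // no terminator line found
  }
  return {
    script: source.slice(0, match.index).trim(),
    rest: source.slice(match.index + match[0].length)
  };
}

var body = [
  "// Take the values up to the end of the line",
  "var tendency = (word == \"end\");",
  "return u.first_name + ' ' + u.last_name;",
  "end",
  "end"
].join("\n");

var parts = splitScriptBlock(body);
console.log(parts.script);
```

Everything before the first bare end line lands in script, and the outer block's own end is left in rest for the normal lexer rules.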
I have heard so many bad things about eval that I've never even tried to use it. However today I have a situation where it seems to be the right answer.
I need a script that can do simple calculations by combining variables. For example, if value=5 and max=8, I want to evaluate value*100/max. Both the values and the formulas will be retrieved from external sources, which is why I am concerned with eval.
I have set up a jsfiddle demo with some sample code:
http://jsfiddle.net/6yzgA/
The values are converted to numbers using parseFloat, so I believe I'm pretty safe here. The characters in the formula are matched against this regular expression:
regex=/[^0-9\.+-\/*<>!=&()]/, // allows numbers (including decimal), operations, comparison
My questions:
Does my regex filter protect me from any attack?
Is there any reason to use eval vs. new Function in this case?
Is there another, safer way to evaluate formulas?
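For reference, here is one way the setup described above could look with new Function. This is a sketch under assumptions, not the code from the fiddle: `evaluateFormula` is a made-up name, the variable names are substituted with their parseFloat values before the character check, and the hyphen in the pattern is escaped (written as +-\/ inside a character class it forms a range from + to /, which also lets a comma through).

```javascript
// Hypothetical helper: substitutes numeric values, checks characters, then evaluates.
function evaluateFormula(formula, values) {
  // Replace each identifier with its numeric value; unknown names are rejected.
  var expression = formula.replace(/[A-Za-z_]\w*/g, function (name) {
    if (!Object.prototype.hasOwnProperty.call(values, name)) {
      throw new Error("Unknown variable: " + name);
    }
    var number = parseFloat(values[name]);
    if (!isFinite(number)) {
      throw new Error("Not a number: " + name);
    }
    return "(" + number + ")";
  });

  // Assumed stricter variant of the filter: the hyphen is escaped.
  if (/[^0-9.+\-\/*<>!=&() ]/.test(expression)) {
    throw new Error("Disallowed character in formula");
  }

  // Unlike eval, a function built this way cannot see local variables.
  return new Function("return (" + expression + ");")();
}

console.log(evaluateFormula("value*100/max", { value: "5", max: "8" })); // 62.5
```

The practical difference from eval is scope: code created with new Function only sees globals, so the formula cannot read or overwrite local variables of the calling function.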
Since you aren't sending anything to your server, or using anything on anyone else's system, the worst that can happen is that the user crashes his own browser, nothing more. There is nothing unsafe about using eval here, since everything happens user-side.
Escaping and preventing anything on the client side doesn't make sense at all. The user can alter any piece of JS code and run it just as easily as I can change the jsfiddle you posted. Trust me, it's just that simple, and you cannot rely on client-side security.
If you remember to escape input fields on the server-side it's nothing to be worried about. There are plenty of functions for that by default, depending on which language you're using.
If a user wants to type in <script>haxx(l33t);</script>, let him do it. Just remember to escape special characters so you'll have &lt;script&gt;haxx(l33t);&lt;/script&gt;.
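Most server-side languages ship a function for this kind of escaping. Purely as an illustration of what such a function does, here is a minimal hand-written sketch in JavaScript; `escapeHtml` is a hypothetical name, and in real code you would normally use whatever your platform or template engine provides.

```javascript
// Replaces the characters that have special meaning in HTML with entities.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")   // must run first so later entities are not re-escaped
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

console.log(escapeHtml("<script>haxx(l33t);</script>"));
// &lt;script&gt;haxx(l33t);&lt;/script&gt;
```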
I have an input onchange that converts numbers like 05008 to 5,008.00.
I am considering expanding on this, to allow simple calculations. For example, 45*5 would be converted automatically to 225.00.
I could use a character white-list ()+/*-0123456789., and then pass the result to eval, I think that these characters are safe to prevent any dangerous injections. That is assuming I use an appropriate try/catch, because a syntax error could be created.
Is this an OK white-list, and then pass it to eval?
Do you recommend a revised white-list?
Do you recommend a different approach (maybe there is already a function that does this)
I would prefer to keep it lightweight. That is why I like the eval/white-list approach. Very little code.
What do you recommend?
That whitelist looks safe to me, but it's not such a simple question. In some browsers, for example, an eval-string like this:
/.(.)/(34)
is equivalent to this:
new RegExp('.(.)').exec('34')
and therefore returns the array ['34','4']. Is that "safe"?
So while the approach can probably be made to work safely, it might be a very tricky proposition. If you do go forward with this idea, I think you should use a much more aggressive approach to validate your inputs. Your principle should be "this is a member of a well-defined set of strings that is known to be 'safe'" rather than "this is a member of an ill-defined set of strings that excludes all strings known to be 'unsafe'". Furthermore, to avoid any risk of operators peeking through that you hadn't considered (such as ++ or += or whatnot), I think you should insert a space in front of every non-digit-non-dot character; and to avoid any risk of parentheses triggering a function call, I think you should handle them yourself by repeatedly replacing (...) with a space plus the result of evaluating ... (after confirming that that result is a number) plus a space.
(By the way, how come = is in your whitelist? I just can't figure out what that's useful for!)
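If you would rather not hand anything to eval at all, the "well-defined set of strings" idea can be taken literally by parsing the input yourself. Below is a minimal recursive-descent sketch; `evaluateArithmetic` and its inner helpers are hypothetical names, and it only understands decimal numbers, + - * /, unary minus and parentheses.

```javascript
// Evaluates simple arithmetic without eval; anything outside the grammar throws.
function evaluateArithmetic(input) {
  var pos = 0;

  function skipSpaces() {
    while (pos < input.length && input[pos] === " ") pos++;
  }

  function parseNumber() {
    skipSpaces();
    var match = /^\d+(\.\d+)?|^\.\d+/.exec(input.slice(pos));
    if (match === null) throw new SyntaxError("Expected a number at position " + pos);
    pos += match[0].length;
    return parseFloat(match[0]);
  }

  // factor := "-" factor | "(" expression ")" | number
  function parseFactor() {
    skipSpaces();
    if (input[pos] === "-") { pos++; return -parseFactor(); }
    if (input[pos] === "(") {
      pos++;
      var value = parseExpression();
      skipSpaces();
      if (input[pos] !== ")") throw new SyntaxError("Expected ) at position " + pos);
      pos++;
      return value;
    }
    return parseNumber();
  }

  // term := factor (("*" | "/") factor)*
  function parseTerm() {
    var value = parseFactor();
    for (;;) {
      skipSpaces();
      var op = input[pos];
      if (op !== "*" && op !== "/") return value;
      pos++;
      var right = parseFactor();
      value = op === "*" ? value * right : value / right;
    }
  }

  // expression := term (("+" | "-") term)*
  function parseExpression() {
    var value = parseTerm();
    for (;;) {
      skipSpaces();
      var op = input[pos];
      if (op !== "+" && op !== "-") return value;
      pos++;
      var right = parseTerm();
      value = op === "+" ? value + right : value - right;
    }
  }

  var result = parseExpression();
  skipSpaces();
  if (pos !== input.length) throw new SyntaxError("Unexpected character at position " + pos);
  return result;
}

console.log(evaluateArithmetic("45*5").toFixed(2)); // 225.00
```

It is more code than eval plus a whitelist, but inputs such as /.(.)/(34) are rejected with a SyntaxError instead of being interpreted.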
Given that extremely restrictive whitelist, I can't see any way of performing a malicious action beyond throwing an exception. The bracket trick won't work since it requires square brackets [].
Perhaps the safest option is to modify your page's default values parser to only accept numbers and throw out anything else. That way, potentially malicious code in a link will never make it to eval.
This only leaves the possibility of the user typing something malicious into a field, but why even bother worrying about that? The user already has access to a console (Dev Tools) they could use to execute arbitrary code.
An often overlooked issue with eval is that it causes problems for javascript minifiers.
Some minifiers like YUI take the safe route and stop renaming variables as soon as they see an eval statement. This means your javascript will work but your compressed file will be larger than it needs to be.
Others, like Google Closure Compiler, will continue to rename variables, but if you are not careful they can break your code. You should avoid passing strings that contain variable names to eval. For example:
var input = "1+2*3";
var result = eval("input"); // unsafe
var result = eval(input); // safe
I want to extract JavaScript code and find out whether there are any dynamic tag creations like document.createElement('script');. I have tried to do this with regular expressions, but they restrict me to matching only some formats, so I thought of writing a JavaScript parser which extracts all the keywords, strings and functions from the JavaScript code.
In general there is no way to know whether a given line of code will ever run; you would need to solve the halting problem.
If you restrict your analysis to just finding occurrences of a function call, you don't make much progress. Naive methods will still be easy to trick: if you just regex match for document.createElement, you would not be able to match something as simple as document["create" + "Element"]. In general you would need to not only parse the code but evaluate it as well to get around this. And to be sure that you can evaluate the code, you would again need to solve the halting problem.
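A small sketch of how easily the naive regex check is defeated; the two code samples are made-up strings that are only inspected here, never executed.

```javascript
var pattern = /document\.createElement/;

var direct = "document.createElement('script');";
var obfuscated = "document['create' + 'Element']('script');";

console.log(pattern.test(direct));      // true
console.log(pattern.test(obfuscated));  // false, although it does the same thing at run time
```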
Maybe you should try using Burrito
Well, the first rule is to never use regex for big things like this, or DOM, or ... . You have to parse it by tokens. The good news is that you don't have to write your own. There are a few JS to JS parsers.
UglifyJS
narcissus
Esprima
ZeParser
They may be a bit hard to work with, but it is still better to use them than to write your own. There are other projects that use these, such as burrito or code surgeon, so you can have a look at their source code and see how they use them.
But there is bad news too: people can still outsmart other people, let alone the parsers and the code they write. At the very least you need to evaluate the code with some execution-time variables and see whether it tries to access the DOM.
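To give a rough idea of what "evaluate it and watch the DOM access" could mean, here is a sketch that runs the code against a fake document object and records createElement calls. `detectScriptCreation` and `fakeDocument` are hypothetical names. Note that this really executes the code and is not a sandbox, so it is an illustration of the idea only and not something to point at untrusted input.

```javascript
// Runs the code with a stand-in "document" and reports whether a script tag was created.
function detectScriptCreation(code) {
  var createdTags = [];
  var fakeDocument = {
    createElement: function (tagName) {
      createdTags.push(String(tagName).toLowerCase());
      return {}; // dummy element
    }
  };
  try {
    // The parameter named "document" shadows any real global of that name.
    new Function("document", code)(fakeDocument);
  } catch (error) {
    // The code may fail for unrelated reasons; calls made before that still count.
  }
  return createdTags.indexOf("script") !== -1;
}

console.log(detectScriptCreation("document['create' + 'Element']('script');")); // true
console.log(detectScriptCreation("var x = 1 + 2;"));                            // false
```

This catches the string-concatenation trick, but code that takes a different branch at run time, or reaches the DOM some other way, will still slip past.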
I'm using a third-party JavaScript library that uses eval(), so when I call one of its functions with the value "1e-1" as a parameter, I get 0.1 returned. How can I escape this or keep it from being parsed as a number?
A basic example would be:
console.log(eval("1e-1"));
I want the result to be 1e-1, but eval still needs to be there.
EDIT:
Okay, ignore the console example above.
THIS is the example it should work on:
There is no way around using this library. Sorry.
Don't use eval(). Of course, Number("1e-1") has the same "problem". However, if you want a string back from eval, you have to feed it one: eval("'1e-1'").
One quick way to do this is to simply replace the hyphen with its character entity code instead:
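A short sketch of the difference. The `tag` variable is only an illustrative name, and whether you can quote the value before the library passes it to eval depends on the library.

```javascript
console.log(eval("1e-1"));    // 0.1   (parsed as a number literal)
console.log(eval("'1e-1'"));  // 1e-1  (parsed as a string literal)

// JSON.stringify adds the quotes for you if the value is held in a variable.
var tag = "1e-1";
var roundTripped = eval(JSON.stringify(tag));
console.log(roundTripped === tag); // true
```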
console.log(eval("1e-1"));
Update
After experimenting for quite a while, the only thing that was close is placing spaces before and after the hyphen:
features[1].attributes.tag= "1e - 1";
I thought it worth mentioning in case this will suffice for what you need.