Expand an expression in javascript

Expand an expression in javascript - javascript

I have the following expression (string) in javascript:
var my_expr = "(2*(A||B&&C))&&(3*D)";
How do I make to get something like this:
(2*A||2*B&&2*C)&&(3*D)
Thanks for your help.

You have to parse the expression into an abstract expression tree. This is not the easiest task but neither is it impossible. Next thing, you'll have to implement some algebraic properties, especially distributive property, which you are showing here. Lastly, you have to express which rules to apply on which sub-tree of the expression.
Once you've done all this for the general case, it will be TRIVIAL to apply it to this and all similar cases you may encounter.
Primer on expression trees in Javascript
EDIT: You have to go through expression trees, there is no shortcut, because it's the only way algebraic properties can be applied correctly for all cases

It sounds like you want to eval() the string:
> var str = "4*5";
> eval(str);
20
There's countless resources all over the internet about why eval is a dangerous thing to use. This is particularly true if you're obtaining the string from another source. If there are any alternatives to this, I would recommend using them.

Related

Comma Operator to Semicolons

I have a chunk of javascript that has many comma operators, for example
"i".toString(), "e".toString(), "a".toString();
Is there a way with JavaScript to convert these to semicolons?
"i".toString(); "e".toString(); "a".toString();

This might seem like a cop-out answer... but I'd suggest against trying it. Doing any kind of string manipulation to change it would be virtually impossible. In addition to function definition argument lists, you'd also need to skip text in string literals or regex literals or function calls or array literals or object literals or variable declarations.... maybe even more. Regex can't handle it, turning on and off as you see keywords can't handle it.
If you want to actually convert these, you really have to actually parse the code and figure out which ones are the comma operator. Moreover, there might be some cases where the comma's presence is relevant:
var a = 10, 20;
is not the same as
var a = 10; 20;
for example.
So I really don't think you should try it. But if you do want to, I'd start by searching for a javascript parser (or writing one, it isn't super hard, but it'd probably take the better part of a day and might still be buggy). I'm pretty sure the more advanced minifiers like Google's include a parser, maybe their source will help.
Then, you parse it to find the actual comma expressions. If the return value is used, leave it alone. If not, go ahead and replace them with expression statements, then regenerate the source code string. You could go ahead and format it based on scope indentation at this time too. It might end up looking pretty good. It'll just be a fair chunk of work.
Here's a parser library written in JS: http://esprima.org/ (thanks to #torazaburo for this comment)

RegExp for parsing a Math Expression?

Hey I've written a fractal-generating program in JavaScript and HTML5 (here's the link), which was about a 2 year process including all the research I did on Complex math and fractal equations, and I was looking to update the interface, since it is quite intimidating for people to look at. While looking through the code I noticed that some of my old techniques for going about doing things were very inefficient, such as my Complex.parseFunction.
I'm looking for a way to use RegExp to parse components of the expression such as functions, operators, and variables, as well as implementing the proper order of operations for the expression. An example below might demonstrate what I mean:
//the first example parses an expression with two variables and outputs to string
console.log(Complex.parseFunction("i*-sinh(C-Z^2)", ["Z","C"], false))
"Complex.I.mult(Complex.neg(Complex.sinh(C.sub(Z.cPow(new Complex(2,0,2,0))))))"
//the second example parses the same expression but outputs to function
console.log(Complex.parseFunction("i*-sinh(C-Z^2)", ["Z","C"], true))
function(Z,C){
return Complex.I.mult(Complex.neg(Complex.sinh(C.sub(Z.cPow(new Complex(2,0,2,0))))));
}
I know how to handle RegExp using String.prototype.replace and all that, all I need is the RegExp itself. Please note that it should be able to tell the difference between the subtraction operator (e.g. "C-Z^2") and the negative function (e.g. "i*-(Z^2+C)") by noting whether it is directly after a variable or an operator respectively.

While you can use regular expressions as part of an expression parser, for example to break out tokens, regular expressions do not have the computational power to parse properly nested mathematical expressions. That is essentially one of the core results of computing theory (finite state automata vs. push down automata). You probably want to look at something like recursive-descent or LR parsing.
I also wouldn't worry too much about the efficiency of parsing an expression provided you only do it once. Given all of the other math you are doing, I doubt it is material.

JavaScript to evaluate simple math string like 5*1.2 (eval/white-list?)

I have an input onchange that converts numbers like 05008 to 5,008.00.
I am considering expanding on this, to allow simple calculations. For example, 45*5 would be converted automatically to 225.00.
I could use a character white-list ()+/*-0123456789., and then pass the result to eval, I think that these characters are safe to prevent any dangerous injections. That is assuming I use an appropriate try/catch, because a syntax error could be created.
Is this an OK white-list, and then pass it to eval?
Do recommend a revised white-list
Do you recommend a different approach (maybe there is already a function that does this)
I would prefer to keep it lightweight. That is why I like the eval/white-list approach. Very little code.
What do you recommend?

That whitelist looks safe to me, but it's not such a simple question. In some browsers, for example, an eval-string like this:
/.(.)/(34)
is equivalent to this:
new RegExp('.(.)').exec('34')
and therefore returns the array ['34','4']. Is that "safe"?
So while the approach can probably be made to work safely, it might be a very tricky proposition. If you do go forward with this idea, I think you should use a much more aggressive approach to validate your inputs. Your principle should be "this is a member of a well-defined set of strings that is known to be 'safe'" rather than "this is a member of an ill-defined set of strings that excludes all strings known to be 'unsafe'". Furthermore, to avoid any risk of operators peeking through that you hadn't considered (such as ++ or += or whatnot), I think you should insert a space in front of every non-digit-non-dot character; and to avoid any risk of parentheses triggering a function call, I think you should handle them yourself by repeatedly replacing (...) with a space plus the result of evaluating ... (after confirming that that result is a number) plus a space.
(By the way, how come = is in your whitelist? I just can't figure out what that's useful for!)

Given that extremely restrictive whitelist, I can't see any way of performing a malicious action beyond throwing an exception. The bracket trick won't work since it requires square brackets [].
Perhaps the safest option is to modify your page's default values parser to only accept numbers and throw out anything else. That way, potentially malicious code in a link will never make it to eval.
This only leaves the possibility of the user typing something malicious into a field, but why even bother worrying about that? The user already has access to a console (Dev Tools) they could use to execute arbitrary code.

An often overlooked issue with eval is that it causes problems for javascript minifiers.
Some minifiers like YUI take the safe route and stop renaming variables as soon as they see an eval statement. This means your javascript will work but your compressed file will be larger than it needs to be.
Other's like Google Closure Compiler will continue to rename variables but if you are not careful they can break your code. You should avoid passing strings with variable names in it to eval. so for example.
var input = "1+2*3";
var result = eval("input"); // unsafe
var result = eval(input); // safe

Writing a Parser for javascript code

I want to extract javasscript code and find out if there are any dynamic tag creations like document.createElement('script'); I have tried to do this with Regular expressions but using regular expressions restricts me to get only some formats so i thought of writing a javascript parser which extracts all the keywords, strings and functions from the javascript code.

In general there is no way to know if a given line of code will ever run, you would need to solve the halting problem.
If you restrict your analysis to just finding occurances of a function call you don't make much progress. Naive methods will still be easy to trick, if you just regex match for document.createElement, you would not be able to match something as simple as document["create" + "Element"]. In general you would need to not only parse the code but evaluate it as well to get around this. And to be sure that you can evaluate the code you would again need to solve the halting problem.

Maybe you should try using Burrito

Well the first rule is never use regex for big things like this, or DOM, or ... . You have to parse it by tokens. The good news is that you don't have to write your own. There are a few JS to JS parsers.
UglifyJS
narcissus
Esprima
ZeParser
They may be a bit hard to work with it. But well better to work with them. There are other projects that are uses these such as burrito or code surgeon. So you can have a look at the source code and see how they uses them.
But there is bad news too, which people can still outsmart other people, let alone the parsers and the code they write. At least you need to evaluate the code with some execution time variables and see if it tries to access the DOM or not.

Creating a Basic Formula Editor in JavaScript

I'm working on creating a basic RPG game engine prototype using JavaScript and canvas. I'm still working out some design specs on paper, and I've hit a bit of a problem I'm not quite sure how to tackle.
I will have a Character object that will have an array of Attribute objects. Attributes will look something like this:
function(name, value){
this.name = name;
this.value = value;
...
}
A Character will also have "skills" that are calculated off attributes. A skills value can also be determined by a formula entered by the user. A legit formula would look something like this:
((#attribute1Name + (#attribute2Name / 2) * 5)
where any text following the # sign represents the name of an attribute belonging to that character. The formula will be entered into a text field as a string.
What I'm having a problem with is understanding the proper way to parse and evaluate this formula. Initially, my plan was to do a simple replace on the attribute names and eval the expression (if invalid, the eval would fail). However, this presents a problem as it would allow for JavaScript injection into the field. I'm assuming I'll need some kind of FSM similar to an infix calculator to solve this, but I'm a little rusty on my computation theory (thanks corporate world!). I'm really not asking for someone to just hand me the code so much as I'd like to get your input on what is the best solution to this problem?
EDIT: Thanks for the responses. Unfortunately life has kept me busy and I haven't tried a solution yet. Will update when I get a result (good or bad).

Different idea, hence a separate suggestion:
eval() works fine, and there's no need to re-invent the wheel.
Assuming that there's only a small and fixed number of variables in your formula language, it would be sufficient to scan your way through the expression and verify that everything you encounter is either a parenthesis, an operator or one of your variable names. I don't think there would be any way to assemble those pieces into a piece of code that could have malicious side effects on eval.
So:
Scan the expression to verify that it draws from just a very limited vocabulary.
Let eval() work it out.
Probably the compromise with the least amount of work and code while bringing risk down to (near?) 0. At worst, a misuser could tack parentheses on a variable name in an attempt to execute the variable.

I think instead of letting them put the whole formula in, you could have select tags that have operations and values, and let them choose.
ie. a set of tags with attribute-operation-number:
<select> <select> <input type="text">
#attribute1Name1 + (check if input is number)
#attribute1Name2 -
#attribute1Name3 *
#attribute1Name4 /
etc.

There is a really simple solution: Just enter a normal JavaScript formula (i.e. as if you were writing a method for your object) and use this to reference the object you're working on.
To change this when evaluating the method use apply() or call() (see this answer).

I recently wrote a similar application. I probably invested far too much work, but I went the whole 9 yards and wrote both a scanner and a parser.
The scanner converted the text into a series of tokens; tokens are simple objects consisting of token type and value. For the punctuation marks, value = character, for numbers the values would be integers corresponding to the numeric value of the number, and for variables it would be (a reference to) a variable object, where that variable would be sitting in a list of objects having a name. Same variable object = same variable, natch.
The parser was a simple brute force recursive descent parser. Here's the code.
My parser does logic expressions, with AND/OR taking the place of +/-, but I think you can see the idea. There are several levels of expressions, and each tries to assemble as much of itself as it can, and calls to lower levels for parsing nested constructs. When done, my parser has generated a single Node containing a tree structure that represents the expression.
In your program, I guess you could just store that Node, as its structure will essentially represent the formula for its evaluation.
Given all that work, though, I'd understand just how tempting it would be to just cave in and use eval!

I'm fascinated by the task of getting this done by the simplest means possible.
Here's another approach:
Convert infix to postfix;
use a very simple stack-based calculator to evaluate the resulting expression.
The rationale here being, once you get rid of the complication of "* before +" and parentheses, the remaining calculation is very straightforward.

You could look at running the user-defined code in a sandbox to prevent attacks:
Is It Possible to Sandbox JavaScript Running In the Browser?

Develop Reference

JavaScript is the programming language of the Web.