Comma Operator to Semicolons - javascript

I have a chunk of javascript that has many comma operators, for example
"i".toString(), "e".toString(), "a".toString();
Is there a way with JavaScript to convert these to semicolons?
"i".toString(); "e".toString(); "a".toString();

This might seem like a cop-out answer... but I'd suggest against trying it. Doing any kind of string manipulation to change it would be virtually impossible. In addition to function definition argument lists, you'd also need to skip text in string literals or regex literals or function calls or array literals or object literals or variable declarations.... maybe even more. Regex can't handle it, turning on and off as you see keywords can't handle it.
If you want to actually convert these, you really have to actually parse the code and figure out which ones are the comma operator. Moreover, there might be some cases where the comma's presence is relevant:
var a = 10, 20;
is not the same as
var a = 10; 20;
for example.
So I really don't think you should try it. But if you do want to, I'd start by searching for a javascript parser (or writing one, it isn't super hard, but it'd probably take the better part of a day and might still be buggy). I'm pretty sure the more advanced minifiers like Google's include a parser, maybe their source will help.
Then, you parse it to find the actual comma expressions. If the return value is used, leave it alone. If not, go ahead and replace them with expression statements, then regenerate the source code string. You could go ahead and format it based on scope indentation at this time too. It might end up looking pretty good. It'll just be a fair chunk of work.
Here's a parser library written in JS: http://esprima.org/ (thanks to #torazaburo for this comment)

Related

In a stringified array is it possible to differentiate between quotes that were in a string and those that surrounded the string itself?

Some Context:
• I'm still learning to code atm (started less than a year ago)
• I'm mostly self taught at that since I think my computer science class feels
too slow.
• The website I'm learning on is code.org, specifically in the "game lab"
• The site's coding environments only use ES5 because they don't want to
update them to ES6 or something like that
• In class we're making function libraries and while not required, I want
mine to be "highly usable," for lack of a better term, while also being
reasonably short (prefer not to automate things if I can get them done
quicker somehow, but that's just personal preference).
So now for where the actual question comes in: in a stringified array, is it possible to differentiate between a quotation mark that was inside a string and a quotation mark that actually denotes a string? Because I noticed something confusing with the output of JSON.parse(JSON.stringify()) on code.org, specifically, if you write something like,
JSON.parse(JSON.stringify(['hi","hi']))
the output will be ["hi","hi"] which looks just like an array containing two strings (on code.org it doesn't show the \'s), but still contains just one, which is fine unless you're using a regular expression to detect whether or not a match is within a string (if every quotation mark after the match has a "partner"), which is what I'm doing in 4 different functions. One flattens a list (since ES5 doesn't have Array.prototype.flat()), one removes all instances of the arguments from a list, one removes all instances of specified operand types, and one replaces all instances of an argument with the one that follows it.
Now I know the odds of a string containing an odd number of quotation marks (whether single or double) is likely extremely low, but it still bothers me that not having a way to differentiate between quotes formerly within a string and quotes which formerly denoted a string (in an array after it's been stringified) as these functions otherwise function exactly as intended. The regular expression I'm using to determine if there's an even number of quotes left in the stringified array is /(?=[^"]*(?:(?:"[^"]*){2})*$)/ where you put the match before the lookahead assertion and anything you absolutely want to follow before the first [^"]*.
To highlight the actual issue I'm trying to solve, this is my flatten function (since it's the shortest of the 4), and yeah, yeah, I know "eval bad" but it's extremely convenient to use here since it shortens the actual modification into a single line, and I highly doubt anyone's actually going to find a way to abuse it given its implementation ("this" needs to be an array for splice to work, so if I'm not mistaken, there isn't really a way to abuse it, but tell me if I'm wrong, since I probably am).
Array.prototype.flatten = function() {
eval(('this.splice(0,this.length,' + JSON.stringify(this).replace(/[\[\]](?=[^"]*(?:(?:"[^"]*){2})*$)/g, '') + ')').replace(/,(?=((,[^"]*(?:(?:"[^"]*){2})*)*.$))/g, ''));
return this;
};
This works really well outside of the previously specified conditions, but if I were to call it with something like [1,'"'] it'd find 3 quotation marks after the \[ and wouldn't be able to remove it but would be able to remove the \], thus when eval actually gets to .splice(), it would look like eval('this.splice(0,this.length,[1,"\"")') causing the error Unexpected token ')' to be thrown
Any help on this is appreciated, even if it's just telling me it isn't possible, thanks for reading my ramblings.
TL;DR: in a stringified array is it possible to differentiate between " and \" (string wrapping quotes of strings within a stringified array and quotes within a string within a stringified array) in a regular expression or any other method using only the tools available in ES5 (site I'm learning on doesn't want to update their project environments for whatever reason)
You are having a problem because your input is not a context free grammar and can not be correctly parsed with regular expressions.
Can you explain why JSON.parse is unacceptable? It is even in ancient browsers and versions of node.js.
Someone writing a json parser might use bison or yacc, so if this is a learning experience consider playing with jison.
I ended up finding a way to do this, for whatever reason (either I didn't notice last night because I was tired or it legitimately changed overnight, though likely the former) I can now see the " when viewing the value of the the stringified array, and lo and behold modifying the regular expression so that it ignored instances of " resolved the issue.
New regular expression for quotation mark pair matching now reads:
// old even number of quotation marks after match check
/(?=[^"]*(?:(?:"[^"]*){2})*$)/
// new even number of quotation marks after match check
/(?=(\\"|[^"])*(?:(?:(?<!\\)"(\\"|[^"])*){2})*$)/
// (only real difference is that it accounts for the \)
Sorry for anyone who may have misunderstood the question due to how all over the place it was, I'm aware that I tend to end up writing a lot more than is necessary and it often leads to tangents that muddle my view of what I was initially asking, which in turn makes the point I'm actually trying to get across even harder to grasp at. Thanks to those who still tried to help me regardless of how much of a mess of a first question this was.

Java Regex replace function not working as intended

I need some help with a JS Regex.
Here's the string I'm passing, I want to delete everything before 'Hanyuu-sama' with JS Replace.
Hanyuu","dj":{"id":18,"djname":"Hanyuu-sama
The first and second "Hanyuu" can change, the id number can change. This has already been cropped quite a bit with regular expressions.
Now I've tried a few and surprisingly it's failing when I do simple and complex regexes:
I've tried:
.*\"
And it does nothing, I've tried disgusting stuff in my desperation:
.*\","dj\":{\"id":.*,\"djname\":\"
And nada.
Here's a JS Fiddle and here's a http://regex101.com/r/tE2uY0/1 Regex JS matching platform.
Does anyone know why this isn't working?
I know this is likely bad practice, I'm just trying to learn Regexes.
Bonus points if anyone can refer me to a good source to learn Regular expressions. I'd love a solution but I'd like to learn how to do this myself in the future and why this one failed even more.
Your method call should look like this:
source = source.replace(/.*"/, "");
Regular expression in javascript are written between /.../ and not "/.../" like they are in many other languages.
If your string is always structured like that and it does not contain any more characters, your regex should do the trick. That's because the * quantifier acts greedy by default, thus always matching the last " in the string.

Jison / Flex: Trying to capture anything (.*) between two tokens but having problems

I'm currently working on a small little dsl, not unlike rabl. I'm struggling with the implementation of one of my rules. Before we get to the problem, I'll explain a bit about my syntax/grammar.
In my little language you can define properties, object/array blocks, or custom blocks (these are all used to build a json object/array). A "custom block" can either be one that contains my standard expressions (property, object/array block, etc) or some JavaScript. These expressions are written as such -
-- An object block
object #model
-- A property node
property some, key(name="value")
-- A custom node
object custom_obj as
property some(name="key")
end
-- A custom script node
property full_name as (u)
// This is JavaScript
return u.first_name + ' ' + u.last_name;
end
end
The problem I'm running into is with my custom script node. I'm having a real hard defining the script token so that JISON can properly capture the stuff inside the block.
In my lexer, I currently have...
# script_param is basically a regex to match "(some_ident)"
{script_param} { this.begin('js'); return 'SCRIPT_PARAM'; }
<js>(.|\n|\r)*?"end" %{
this.popState();
yytext = yytext.substr(0, yyleng - 3).trim();
return 'SCRIPT';
%}
That SCRIPT token will basically match anything after (u) up to (and including) the end token (which usually ends a block). I really dislike this because my usual block terminator (end) is actually part of the script token, which feels totally hacky to me. Unfortunately, I'm not able to find a better way to capture ANYTHING between (..) and end.
I've tried writing a regex that captures anything that ends with a ";", but that poses problems when I have multiple script nodes in my dsl code. I've only been able to make this work by including the "end" keyword as part of my capture.
Here are the links to my grammar and lexer files.
I'd greatly appreciate any insight into solving my problem! If I didn't explain my problem clearly, let me know and I'll try my best to clarify!
Many thanks in advance!!
I will also happily accept any advice as to how to clean up my grammar. I'm still fairly new at this stuff and feel like my stuff is a mess right now :)
It's easy enough to match a string up to but not including the first instance of end:
([^e]|e[^n]|en[^d])*
(And it doesn't even need non-greedy repetition.)
However, that's not what you want. The included JavaScript might include:
variables whose names happen to include the characters end (tendency)
comments (/* Take the values up to the end of the line */)
character strings (if (word == "end"))
and, indeed, the word end itself, which is not a reserved word in js.
Really, the only clean solution is to be able to lex javascript. Fortunately, you don't have to do it precisely, because you're not interpreting it, but even so it is a bit of work. The most annoying part of javascript lexing, like other similar languages, is identifying when / is the beginning of a regular expression, and when it is just division; getting that right requires most of a javascript parser, particularly since it also interacts with the semicolon rule.
To deal with the fact that the included javascript might actually use a variable named end, you have a couple of choices, as far as I can see:
Document the fact that end is a reserved word.
Only recognize end when it appears outside of parentheses and in a place where a statement might start (not too difficult if you end up building enough of a JS parser to correctly identify regular expressions)
Only recognize end when it appears by itself on a line.
This last choice would really simplify your problem a lot, so you might want to think about it, although it's not really very elegant.

RegExp for parsing a Math Expression?

Hey I've written a fractal-generating program in JavaScript and HTML5 (here's the link), which was about a 2 year process including all the research I did on Complex math and fractal equations, and I was looking to update the interface, since it is quite intimidating for people to look at. While looking through the code I noticed that some of my old techniques for going about doing things were very inefficient, such as my Complex.parseFunction.
I'm looking for a way to use RegExp to parse components of the expression such as functions, operators, and variables, as well as implementing the proper order of operations for the expression. An example below might demonstrate what I mean:
//the first example parses an expression with two variables and outputs to string
console.log(Complex.parseFunction("i*-sinh(C-Z^2)", ["Z","C"], false))
"Complex.I.mult(Complex.neg(Complex.sinh(C.sub(Z.cPow(new Complex(2,0,2,0))))))"
//the second example parses the same expression but outputs to function
console.log(Complex.parseFunction("i*-sinh(C-Z^2)", ["Z","C"], true))
function(Z,C){
return Complex.I.mult(Complex.neg(Complex.sinh(C.sub(Z.cPow(new Complex(2,0,2,0))))));
}
I know how to handle RegExp using String.prototype.replace and all that, all I need is the RegExp itself. Please note that it should be able to tell the difference between the subtraction operator (e.g. "C-Z^2") and the negative function (e.g. "i*-(Z^2+C)") by noting whether it is directly after a variable or an operator respectively.
While you can use regular expressions as part of an expression parser, for example to break out tokens, regular expressions do not have the computational power to parse properly nested mathematical expressions. That is essentially one of the core results of computing theory (finite state automata vs. push down automata). You probably want to look at something like recursive-descent or LR parsing.
I also wouldn't worry too much about the efficiency of parsing an expression provided you only do it once. Given all of the other math you are doing, I doubt it is material.

How do i avoid eval() from converting 1e-1 to 0.1?

I'm using a thrid party javascript library that uses eval() so when i call one of it's functions with the "1e-1" value as a parameter i get 0.1 returned. How can i escape this or avoid it from parsing the number?
A basic example would be:
console.log(eval("1e-1"));
I want the result to be 1e-1, but eval still needs to be there.
EDIT:
Okay Ignore the console example above
THIS is the example it should work on:
There is no way around using this library. Sorry.
Dont use eval(). Of course, Number("1e-1") has the same "problem". However, if you want a string back from eval you have to feed it with one: eval("'1e-1'").
One quick way to do this is to simply replace the hyphen with it's Character Entity code instead:
console.log(eval("1e-1"));
Update
After experimenting for quite a while, the only thing that was close is placing spaces before and after the hyphen:
features[1].attributes.tag= "1e - 1";
I thought it worth mentioning incase this will suffice for what you need.

Categories

Resources