Javascript string literal syntax rules - javascript

Could somebody please explain why
const btn1 = document.querySelector('input[id="btn"]')
requires me to use ('input[id="btn"]') and not ("input[id="btn"]") or ('input[id='btn']').

Because the JavaScript engine needs to be able to unambiguously determine where a string literal begins and ends.
If an unescaped character identical to the string delimiter were permitted inside the string, how would the interpreter determine whether that character terminates the string or is a literal ' (or ") that is part of it? In "input[id="btn"]", for instance, the literal ends at the second ", so the parser sees the string "input[id=" followed by a stray btn, which is a syntax error.
Rigorous syntactic rules are required for JavaScript source text to be parsed unambiguously. String escaping is fairly simple and common to almost all programming languages: learn it once, and you will be well placed to understand how it works in many other languages. In JS, it's really not that hard compared to more complicated constructs (like async/await).
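As a minimal sketch of the delimiter rule described above, the following shows several ways to write the same selector string. The variable names are illustrative only, and the selector is treated as a plain string so the snippet runs without a DOM.

```javascript
// Single-quoted literal: double quotes inside need no escaping.
const singleQuoted = 'input[id="btn"]';

// Double-quoted literal: the inner double quotes must be escaped with a backslash.
const doubleQuoted = "input[id=\"btn\"]";

// Template literal (backticks): both quote characters can appear unescaped.
const backticked = `input[id="btn"]`;

// Swapping the roles also works: single quotes inside a double-quoted literal.
const swapped = "input[id='btn']";

// The first three literals produce exactly the same string.
console.log(singleQuoted === doubleQuoted && doubleQuoted === backticked); // true
console.log(swapped); // input[id='btn']
```

Any of these strings could be passed to document.querySelector; only the way the literal is written in the source differs.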

You can also drop the quotes inside the attribute selector and avoid the nested-quote problem altogether (this works because btn is a valid CSS identifier):
const btn1 = document.querySelector('input[id=btn]')

Related

Confusion regarding RegExp matches, HTML tags, and newlines

I am attempting to create a Markdown-to-HTML parser. I am trying to use regular expressions to match an input string that may or may not contain HTML tags and whitespace/newlines. I have encountered an interesting case that I do not understand at all.
My regular expression is regex = /\*([\w\s]+|<.+>)\*/g.
The following works:
'*words\nmorewords*'.match(regex)
'*<b>words</b>*'.match(regex)
However, this does not work:
'*<b>words\nmore words</b>*'.match(regex)
If anyone can help me understand why this is so, I would appreciate it.
Edit: I see my faulty logic, thanks to Ry. The expression regex = /\*(<[a-z]+>)?[\w\s]+(<\/[a-z]+>)?\*/g solves this case.
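This should work for your purpose:
\*(<.+>)?([\w\s]+)(<.+>)?\*
The HTML tags are optional, hence (<.+>)?. The \n is matched by \s (whitespace). The original pattern fails on the last input because . does not match a newline (unless the s flag is set), so <.+> cannot span the \n, and [\w\s]+ cannot match the < and > characters.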
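The behaviour of the two patterns can be checked with a short sketch. The variable names below are illustrative; the patterns and inputs are the ones quoted in the question and in this answer.

```javascript
// Pattern from the question: either word/space characters or a single <...> run.
const original = /\*([\w\s]+|<.+>)\*/g;
// Pattern from this answer: optional tag, text, optional tag.
const revised = /\*(<.+>)?([\w\s]+)(<.+>)?\*/g;

const plain = '*words\nmorewords*';
const tagged = '*<b>words</b>*';
const mixed = '*<b>words\nmore words</b>*';

// The original pattern handles the first two inputs.
console.log(plain.match(original) !== null);  // true
console.log(tagged.match(original) !== null); // true

// It fails on the third: "." cannot cross the newline, and [\w\s] rejects "<".
console.log(mixed.match(original)); // null

// The revised pattern matches all three inputs in full.
console.log([plain, tagged, mixed].every((s) => s.match(revised)[0] === s)); // true
```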
I'm also going to link the canonical don't parse HTML with regex answer, because regex is not suitable for (or even capable of) parsing HTML beyond fairly restricted subsets. Have a read, it's informative (and funny)!
Recall the Chomsky hierarchy. Regular expressions can recognize regular languages. HTML is not a regular language: its arbitrarily nested tags need at least a context-free grammar, the next level up.
Some regular expression engines have extensions that give them recursive capability. You can probably parse HTML with these, but there are better ways, such as using a proper HTML parser, for example DOMParser.

Are there drawbacks to using backticks for all string operations in Typescript? [duplicate]

Is there a reason (performance or other) not to use backtick template literal syntax for all strings in a JavaScript source file? If so, what?
Should I prefer this:
var str1 = 'this is a string';
over this?
var str2 = `this is another string`;
Code-wise, there is no specific disadvantage. JS engines are smart enough to not have performance differences between a string literal and a template literal without variables.
In fact, I might even argue that it is good to always use template literals:
You can already use single quotes or double quotes to make strings. Choosing which one is largely arbitrary, and you just stick with one. However, it is encouraged to use the other quote if your string contains your chosen string marker, i.e. if you chose ', you would still do "don't argue" instead of 'don\'t argue'. However, backticks are very rare in normal language and strings, so you would actually more rarely have to either use another string literal syntax or use escape codes, which is good.
For example, you'd be forced to use escape sequences to have the string she said: "Don't do this!" with either double or single quotes, but you wouldn't have to when using backticks.
You don't have to convert if you want to use a variable in the string in the future.
Those are admittedly weak advantages, but they are still more than none, so I would mainly use template literals.
A real but, in my opinion, ignorable objection is having to support environments where template literals are not supported. If you have those, you would know and wouldn't be asking this question.
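The two advantages listed above can be seen in a small sketch. The variable names and the interpolated value are made up for illustration.

```javascript
// The same sentence written with each delimiter.
const single = 'she said: "Don\'t do this!"';     // apostrophe must be escaped
const double = "she said: \"Don't do this!\"";    // double quotes must be escaped
const template = `she said: "Don't do this!"`;    // nothing to escape

console.log(single === double && double === template); // true

// Adding a variable later needs no change of delimiter.
const speaker = "she"; // hypothetical value
const withVariable = `${speaker} said: "Don't do this!"`;

console.log(withVariable === template); // true
```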
The most significant reason not to use them is that ES6 is not supported in all environments.
Of course that might not affect you at all, but still: YAGNI. Don't use template literals unless you need interpolation, multiline literals, or unescaped quotes and apostrophes. Many of the arguments from When to use double or single quotes in JavaScript? carry over as well. As always, keep your code base consistent and use only one string literal style where you don't need a special one.
Always use template literals. In this case YAGNI is not correct. You absolutely will need it. At some point, you will have to add a variable or a new line to your string, at which point you will either need to change single quotes to backticks, or use the dreaded '+'.
Be careful when the values are for external use. We work with Tealium for marketing analysis, and it currently does not support ES6 template literals. Event data containing template literals aka string templates will cause the Tealium script to error.
I'm fairly convinced by other answers that there's no serious downside to using them exclusively, but one additional counterpoint is that template strings are also used in advanced "tagged template" syntax, and as illustrated in this Reddit comment, if you try to rely exclusively on JavaScript's automatic semicolon insertion or just forget to include a semicolon, you can run into parsing issues with statements that begin with a template string.
// OK (single (or double) quotes)
logger = console.log
'123'.split('').forEach(logger)
// OK (semicolon)
logger = console.log;
`123`.split('').forEach(logger)
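// Not OK (no semicolon): parsed as console.log`123`.split(''), a tagged template call
logger = console.log
`123`.split('').forEach(logger) // TypeError: console.log`123` returns undefined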
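To make the parse visible without relying on console.log, here is a minimal sketch using a hypothetical tag function named shout. It is not part of any library; it only shows that a line starting with a template literal continues the previous expression when the semicolon is missing.

```javascript
// Hypothetical tag function, used only to make the parse visible.
function shout(strings) {
  return strings[0].toUpperCase();
}

// With a semicolon, the template literal starts a new statement.
let withSemicolon = shout;
`abc`.length; // evaluated and discarded

// Without one, the next line continues the expression:
// this is parsed as shout`abc`.length, i.e. "ABC".length.
let withoutSemicolon = shout
`abc`.length;

console.log(typeof withSemicolon); // "function"
console.log(withoutSemicolon);     // 3
```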

Comma Operator to Semicolons

I have a chunk of JavaScript that has many comma operators, for example
"i".toString(), "e".toString(), "a".toString();
Is there a way with JavaScript to convert these to semicolons?
"i".toString(); "e".toString(); "a".toString();
This might seem like a cop-out answer... but I'd advise against trying it. Doing this reliably with string manipulation is virtually impossible. In addition to function definition argument lists, you'd also need to skip commas in string literals, regex literals, function calls, array literals, object literals, variable declarations, and maybe more. A regex can't handle it, and neither can toggling on and off as you see keywords.
If you want to actually convert these, you really have to parse the code and figure out which commas are the comma operator. Moreover, there are cases where the comma's presence is relevant:
var a = (10, 20);
is not the same as
var a = 10; 20;
for example: the first assigns 20 to a, the second assigns 10.
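A short sketch of why a plain text replacement is unsafe: the comma is an operator in some places and a mere separator (or ordinary text) in others. All names below are illustrative.

```javascript
// Comma operator: evaluates 10, then 20, and yields 20.
var a = (10, 20);

// Two statements: b is 10, and the 20 is evaluated and discarded.
var b = 10; 20;

// Separator between declarators, not an operator.
var c = 1, d = 2;

// Separators in an array literal and in an argument list.
const list = [1, 2];
const largest = Math.max(1, 2);

// Ordinary characters inside a string literal.
const text = "i, e, a";

console.log(a, b);           // 20 10
console.log(c + d, largest); // 3 2
```

Replacing every comma above with a semicolon would change the value of a and turn the last four declarations into syntax errors or different strings.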
So I really don't think you should try it. But if you do want to, I'd start by searching for a JavaScript parser (or writing one; it isn't super hard, but it'd probably take the better part of a day and might still be buggy). I'm pretty sure the more advanced minifiers, like Google's, include a parser; maybe their source will help.
Then, you parse it to find the actual comma expressions. If the return value is used, leave it alone. If not, go ahead and replace them with expression statements, then regenerate the source code string. You could go ahead and format it based on scope indentation at this time too. It might end up looking pretty good. It'll just be a fair chunk of work.
Here's a parser library written in JS: http://esprima.org/ (thanks to @torazaburo for this comment)

JavaScript function to escape Java regular expression string

Earlier questions on StackOverflow discuss escaping for JavaScript regular expressions, e.g.:
How to escape regular expression in javascript?
Escape string for use in Javascript regex
An implementation suggested is the following:
RegExp.quote = function(str) {
  return str.replace(/([.?*+^$[\]\\(){}|-])/g, "\\$1");
};
Given that regular expressions in the two languages are not identical, is anyone aware of a JavaScript method that properly escapes strings to be used for Java regular expressions?
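For reference, this is roughly how the quoted implementation behaves on the JavaScript side. The sketch copies the same replacement into a standalone function (named quote here only to avoid modifying the global RegExp object) and says nothing about Java's regex dialect.

```javascript
// Same replacement as the quoted RegExp.quote, as a standalone function.
function quote(str) {
  return str.replace(/([.?*+^$[\]\\(){}|-])/g, "\\$1");
}

const raw = "a.b*c";
const escaped = quote(raw);
console.log(escaped); // a\.b\*c

// Unescaped, "." and "*" are metacharacters, so other strings match too.
const loose = new RegExp("^" + raw + "$");
console.log(loose.test("aXbbc")); // true

// Escaped, only the literal text matches.
const strict = new RegExp("^" + escaped + "$");
console.log(strict.test("aXbbc")); // false
console.log(strict.test("a.b*c")); // true
```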
There's no need for any escaping at all. Those questions are about what needs to be done when the regular expression is being constructed as a string in the source language. Since you're reading the string from an input field, there's no layer of interpretation to worry about.
Just send the string to the server, where it will be discovered to be a valid regex or not.
Edit: though I can't think of any, the real thing to worry about might be some sort of "injection" attack conducted through this avenue. It seems to me that if you're just passing the regex to Pattern.compile(), there aren't any side-effect channels that could be exploited.

RegExp for parsing a Math Expression?

Hey, I've written a fractal-generating program in JavaScript and HTML5 (here's the link). It was about a two-year process, including all the research I did on complex math and fractal equations, and I'm now looking to update the interface, since it is quite intimidating for people to look at. While looking through the code, I noticed that some of my old techniques were very inefficient, such as my Complex.parseFunction.
I'm looking for a way to use RegExp to parse components of the expression such as functions, operators, and variables, as well as implementing the proper order of operations for the expression. An example below might demonstrate what I mean:
//the first example parses an expression with two variables and outputs to string
console.log(Complex.parseFunction("i*-sinh(C-Z^2)", ["Z","C"], false))
"Complex.I.mult(Complex.neg(Complex.sinh(C.sub(Z.cPow(new Complex(2,0,2,0))))))"
//the second example parses the same expression but outputs to function
console.log(Complex.parseFunction("i*-sinh(C-Z^2)", ["Z","C"], true))
function(Z,C){
return Complex.I.mult(Complex.neg(Complex.sinh(C.sub(Z.cPow(new Complex(2,0,2,0))))));
}
I know how to handle RegExp using String.prototype.replace and all that, all I need is the RegExp itself. Please note that it should be able to tell the difference between the subtraction operator (e.g. "C-Z^2") and the negative function (e.g. "i*-(Z^2+C)") by noting whether it is directly after a variable or an operator respectively.
While you can use regular expressions as part of an expression parser, for example to break out tokens, regular expressions do not have the computational power to parse properly nested mathematical expressions. That is essentially one of the core results of computing theory (finite-state automata vs. pushdown automata). You probably want to look at something like recursive-descent or LR parsing.
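As a rough sketch of that division of labour, the code below uses a regular expression only to split the input into tokens and a small recursive-descent parser to handle nesting and precedence. The function names tokenize and parse, the grammar, and the output format (strings such as mult(a, b)) are assumptions made for illustration; this is not the asker's Complex.parseFunction, and it supports only single-argument function calls.

```javascript
// Split the source into numbers, identifiers, operators and parentheses.
function tokenize(src) {
  const pattern = /\s*(\d+(?:\.\d+)?|[A-Za-z_]\w*|[-+*\/^()])/y;
  const text = src.trim();
  const tokens = [];
  let pos = 0;
  while (pos < text.length) {
    pattern.lastIndex = pos;
    const m = pattern.exec(text);
    if (!m) throw new SyntaxError(`Unexpected character at index ${pos}`);
    tokens.push(m[1]);
    pos = pattern.lastIndex;
  }
  return tokens;
}

// Grammar (lowest to highest precedence):
//   expr  := term (('+' | '-') term)*
//   term  := unary (('*' | '/') unary)*
//   unary := '-' unary | power
//   power := atom ('^' unary)?
//   atom  := number | name | name '(' expr ')' | '(' expr ')'
function parse(src) {
  const tokens = tokenize(src);
  let i = 0;
  const peek = () => tokens[i];
  const next = () => tokens[i++];
  const expect = (t) => {
    if (next() !== t) throw new SyntaxError(`Expected "${t}"`);
  };

  function expr() {
    let left = term();
    while (peek() === "+" || peek() === "-") {
      const op = next() === "+" ? "add" : "sub"; // binary minus: follows an operand
      left = `${op}(${left}, ${term()})`;
    }
    return left;
  }
  function term() {
    let left = unary();
    while (peek() === "*" || peek() === "/") {
      const op = next() === "*" ? "mult" : "div";
      left = `${op}(${left}, ${unary()})`;
    }
    return left;
  }
  function unary() {
    // A minus where an operand is expected is negation, not subtraction.
    if (peek() === "-") { next(); return `neg(${unary()})`; }
    return power();
  }
  function power() {
    const base = atom();
    if (peek() === "^") { next(); return `pow(${base}, ${unary()})`; }
    return base;
  }
  function atom() {
    const t = next();
    if (t === undefined) throw new SyntaxError("Unexpected end of input");
    if (t === "(") { const inner = expr(); expect(")"); return inner; }
    if (/^\d/.test(t)) return t;
    if (/^[A-Za-z_]/.test(t)) {
      if (peek() !== "(") return t;           // plain variable
      next();                                  // function call
      const arg = expr();
      expect(")");
      return `${t}(${arg})`;
    }
    throw new SyntaxError(`Unexpected token "${t}"`);
  }

  const result = expr();
  if (i < tokens.length) throw new SyntaxError(`Unexpected token "${peek()}"`);
  return result;
}

console.log(parse("i*-sinh(C-Z^2)")); // mult(i, neg(sinh(sub(C, pow(Z, 2)))))
```

The distinction between subtraction and negation falls out of the grammar: a minus seen while looking for an operand is handled in unary, and a minus seen after a complete operand is handled in expr, so no lookbehind in the regular expression is needed.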
I also wouldn't worry too much about the efficiency of parsing an expression provided you only do it once. Given all of the other math you are doing, I doubt it is material.
