This question already has answers here:
What are the rules for JavaScript's automatic semicolon insertion (ASI)?
(7 answers)
Closed 6 years ago.
What's so terribly wrong about the following code
var obj = {a:1}
[1,2,3].forEach(console.log)
that Babel produces this malfunctional concoction?
var obj = { a: 1 }[(1, 2, 3)].forEach(console.log);
for any of ES201[567] preset. Verified with https://babeljs.io/repl.
Yes, terminating the first line with semicolon helps. Is it still a dangerous practice to not use semicolons?
UPDATE: before downvoting on the grounds of "gosh of course you should use semicolons", please acknowledge the evolution of not using them, along with arrow functions and ever growing toybox of array/object prototype functions. All modern JS examples do that, and ES6 newbie gets confused easily.
The problem is eluded to in the code compiled by Babel: the two lines you have given it are interpreted as one.
Is it still a dangerous practice to not use semicolons?
In my opinion, no. White-space is collapsed in JavaScript meaning multiple lines can be interpreted as one, so there are a few cases in which they are necessary, but whether this means that using semi-colons is an absolute must is really down to personal preference. Some people feel very strongly that you should, and they may well be right...
Anyhow, as you discovered, one of these cases is ending a line with an object or array and starting the next with an array literal ([]). You can read about the remaining cases here.
If the code you are working on does not use semi-colons, for example if it uses the standard code style, you can..
..add a semi-colon to the end of the first line:
var obj = {a:1};
[1,2,3].forEach(console.log)
..add a semi-colon to the beginning of the second line:
var obj = {a:1}
;[1,2,3].forEach(console.log) // beginning of last line
..avoid having to use a semi-colon at all by assigning the array literal to a variable:
var obj = {a:1}
var arr = [1,2,3]
arr.forEach(console.log)
____
Side note...
The reason the contents of your array literal are wrapped in brackets is because Babel has interpreted your array literal as an attempt to access a property on the object literal defined in the previous line. Of course, you can only access one property at a time using the array[index] syntax, so Babel uses the comma operator to take a single value from (1, 2, 3), which will be 3.
Read about the comma operator on MDN.
Related
In python _ is used as a throwaway variable. Is there an idiomatic accepted name in javascript?
Rigth now, you can use array destructuring, no need for a var.
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Operators/Destructuring_assignment
For example:
[,b] = [1,2];
console.log(b);
will output :
2
And the value 1 won't be assigned to any unused var.
There isn't an accepted throwaway variable naming in JavaScript like in Python. Although, generally in the software engineering world, engineers tend to call those variables:
dummy, temp, trash, black_hole, garbage.
Use these variable namings or whatever name that will imply to the engineer reading your code that these variables have no specific use.
With that in mind, I would strongly not recommend using signs such as below as a throwaway variable:
_, $
Unless there is an explicit convention like in Python, don't use these signs as a throwaway variable, otherwise, it may interfere with other naming conventions and libraries in other languages, like the underscore and the jQuery libraries in JS.
And of course, if you find a specific use for that variable later on, make sure to rename it.
Generally, you will declare the signature of a function as an array. Simply omit one of them
const foo = (bar,baz,) => bar+baz;
foo(3,4,5) // → 7
This obviously works as long as the declaration of function argument is the last one; unlike this answer, which is valid for already-declared variables embedded in arrays. This:
const foo = (, bar,baz) => bar+baz;
will fail with:
Uncaught SyntaxError: expected expression, got ','
You can hack your way into this using any expression. However, you'll need to use the self same expression in every invocation
const quux = ([], bar,baz) => bar+baz;
quux(42,3,4); // → Uncaught TypeError: (destructured parameter) is not iterable
quux([],3,4); // → 7
That constant can be anything, so you can as well call it "dummy", but you will still have to repeat in every invocation.
var a = [1, 2, 3, 4];
var b = [10, 20, 30, 40];
console.log([a, b].length)
[a, b].some(function(x) {
x.push(x.shift())
});
I was extremely surprised today when this code caused
[a,b].some(function(x){ x.push(x.shift()) });
^
TypeError: Cannot call method 'some' of undefined
Obviously the JavaScript 'auto semicolon insertion' is not working as expected here. But why?
I know you might recommend to use ; everywhere to avoid something like that, but the question is not about whether it is better to use ; or not. I would love to know what exactly happens here?
When I'm worried about semicolon insertion, I think about what the lines in question would look like without any whitespace between them. In your case, that would be:
console.log([a,b].length)[a,b].some(function(x){ etc });
Here you're telling the Javascript engine to call console.log with the length of [a,b], then to look at index [a,b] of the result of that call.
console.log returns a string, so your code will attempt to find property b of that string, which is undefined, and the call to undefined.some() fails.
It's interesting to note that str[a,b] will resolve to str[b] assuming str is a string. As Kamil points out, a,b is a valid Javascript expression, and the result of that expression is simply b.
In general, one could say that implicit semi-colon's can easily fail when defining an array on a new line, because an array defined on a new line is interpreted as a property access of the value of the expression on the previous line.
Javascript does only consider new lines to mark the end of a statement if not ending the statement after this new line would cause a parse error. See What are the rules for JavaScript's automatic semicolon insertion (ASI)? and EcmaScript 5 spec for the exact rules. (Thanks to Rob W and limelights)
What happens is the following:
The code get interpreted as
console.log([a,b].length)[a,b].some(function(x){ x.push(x.shift()) });
i.e. all as one statement.
Now parse the statement:
some is called on the value of console.log([a,b].length)[a,b]
the value of console.log([a,b].length)[a,b] is computed by taking the returned value of console.log([a,b].length) (undefined) and then trying to access the property with the name of the value of a,b.
a,b evaluates to the value of b (try it in your console). There's no property with the value of b of undefined, so the resulting value will be undefined as well.
There's no method some on undefined, hence the error.
JavaScript doesn't treat every line break as a semicolon. It usually treats line
breaks as semicolons only if it can’t parse the code without the semicolons. Basically, JavaScript treats a line break as a semicolon if the next non-space character cannot be interpreted as a continuation of the current statement. JavaScript - The Definitive Guide: 6th Ed. section 2.4
So, in your case, it is interpreting the line as something like
console.log([a,b].length)[a,b].some(function(x){ x.push(x.shift()) });
And that is the reason for error. JavaScript is trying to perform array-access on the results of console.log([a,b].length). Depending on the JavaScript engine and the return value of console.log, you might get different errors.
If it is the last statement of the function or flow, you can avoid ';' but it is recommended to put ';' at the end of the each statement to avoid such error.
This question already has answers here:
Why is arr = [] faster than arr = new Array?
(5 answers)
Closed 10 years ago.
Today i was going through some posts in stackoverflow and this reply just popped up.
https://stackoverflow.com/a/2280350/548591
https://stackoverflow.com/a/11513602/548591
var name = [];
var name = new Array();
Is the literal one better in terms of performance than initializing an new Array Object.
Was reading this article now, just wanted to update.
http://yuiblog.com/blog/2006/11/13/javascript-we-hardly-new-ya/
From working on my own direct implementation of ECMA Grammar to make a parser, I can tell you this:
The Array literal [] gets parsed directly and then converted into an array object, whereas new Array() gets parsed first as a "expression", then checked for the new keyword, then what you want to create (in this case Array) and is then evaluated.
I can't tell you exactly how much performance is lost by using new Array(), it varies by the Javascript engine. V8 (Chrome) for example, pre-compiles code to optimize it, so it might get converted into a literal [] anyway, depending on how it works.
The easiest way would be to make a test function that creates a few hundred thousand arrays and measures the time for the loop with the literal declaration or the constructor initialization respectively.
Yeah, according to the tests, initialising [] is much faster than using the new Array. Besides that, the literal version si much more readable.
And this has already been asked!
Depending on the browsers, Literals appear to be up to twice as fast. That is, in browsers where there's a significant difference.
I stumbled upon that performance test, saying that RegExps in JavaScript are not necessarily slow: http://jsperf.com/regexp-indexof-perf
There's one thing i didn't get though: two cases involve something that i believed to be exactly the same:
RegExp('(?:^| )foo(?: |$)').test(node.className);
And
/(?:^| )foo(?: |$)/.test(node.className);
In my mind, those two lines were exactly the same, the second one being some kind of shorthand to create a RegExp object. Still, it's twice faster than the first.
Those cases are called "dynamic regexp" and "inline regexp".
Could someone help me understand the difference (and the performance gap) between these two?
Nowadays, answers given here are not entirely complete/correct.
Starting from ES5, the literal syntax behavior is the same as RegExp() syntax regarding object creation: both of them creates a new RegExp object every time code path hits an expression in which they are taking part.
Therefore, the only difference between them now is how often that regexp is compiled:
With literal syntax - one time during initial code parsing and
compiling
With RegExp() syntax - every time new object gets created
See, for instance, Stoyan Stefanov's JavaScript Patterns book:
Another distinction between the regular expression literal and the
constructor is that the literal creates an object only once during
parse time. If you create the same regular expression in a loop, the
previously created object will be returned with all its properties
(such as lastIndex) already set from the first time. Consider the
following example as an illustration of how the same object is
returned twice.
function getRE() {
var re = /[a-z]/;
re.foo = "bar";
return re;
}
var reg = getRE(),
re2 = getRE();
console.log(reg === re2); // true
reg.foo = "baz";
console.log(re2.foo); // "baz"
This behavior has changed in ES5 and the literal also creates new objects. The behavior has also been corrected in many browser
environments, so it’s not to be relied on.
If you run this sample in all modern browsers or NodeJS, you get the following instead:
false
bar
Meaning that every time you're calling the getRE() function, a new RegExp object is created even with literal syntax approach.
The above not only explains why you shouldn't use the RegExp() for immutable regexps (it's very well known performance issue today), but also explains:
(I am more surprised that inlineRegExp and storedRegExp have different
results.)
The storedRegExp is about 5 - 20% percent faster across browsers than inlineRegExp because there is no overhead of creating (and garbage collecting) a new RegExp object every time.
Conclusion:
Always create your immutable regexps with literal syntax and cache it if it's to be re-used. In other words, don't rely on that difference in behavior in envs below ES5, and continue caching appropriately in envs above.
Why literal syntax? It has some advantages comparing to constructor syntax:
It is shorter and doesn’t force you to think in terms of class-like
constructors.
When using the RegExp() constructor, you also need to escape quotes and double-escape backslashes. It makes regular expressions
that are hard to read and understand by their nature even more harder.
(Free citation from the same Stoyan Stefanov's JavaScript Patterns book).
Hence, it's always a good idea to stick with the literal syntax, unless your regexp isn't known at the compile time.
The difference in performance is not related to the syntax that is used is partly related to the syntax that is used: in /pattern/ and RegExp(/pattern/) (where you did not test the latter) the regular expression is only compiled once, but for RegExp('pattern') the expression is compiled on each usage. See Alexander's answer, which should be the accepted answer today.
Apart from the above, in your tests for inlineRegExp and storedRegExp you're looking at code that is initialized once when the source code text is parsed, while for dynamicRegExp the regular expression is created for each invocation of the method. Note that the actual tests run things like r = dynamicRegExp(element) many times, while the preparation code is only run once.
The following gives you about the same results, according to another jsPerf:
var reContains = /(?:^| )foo(?: |$)/;
...and
var reContains = RegExp('(?:^| )foo(?: |$)');
...when both are used with
function storedRegExp(node) {
return reContains.test(node.className);
}
Sure, the source code of RegExp('(?:^| )foo(?: |$)') might first be parsed into a String, and then into a RegExp, but I doubt that by itself will be twice as slow. However, the following will create a new RegExp(..) again and again for each method call:
function dynamicRegExp(node) {
return RegExp('(?:^| )foo(?: |$)').test(node.className);
}
If in the original test you'd only call each method once, then the inline version would not be a whopping 2 times faster.
(I am more surprised that inlineRegExp and storedRegExp have different results. This is explained in Alexander's answer too.)
in the second case, the regular expression object is created during the parsing of the language, and in the first case, the RegExp class constructor has to parse an arbitrary string.
Given this function:
function doThing(values,things){
var thatRegex = /^http:\/\//i; // is this created once or on every execution?
if (values.match(thatRegex)) return values;
return things;
}
How often does the JavaScript engine have to create the regex? Once per execution or once per page load/script parse?
To prevent needless answers or comments, I personally favor putting the regex outside the function, not inside. The question is about the behavior of the language, because I'm not sure where to look this up, or if this is an engine issue.
EDIT:
I was reminded I didn't mention that this was going to be used in a loop. My apologies:
var newList = [];
foreach(item1 in ListOfItems1){
foreach(item2 in ListOfItems2){
newList.push(doThing(item1, item2));
}
}
So given that it's going to be used many times in a loop, it makes sense to define the regex outside the function, but so that's the idea.
also note the script is rather genericized for the purpose of examining only the behavior and cost of the regex creation
From Mozilla's JavaScript Guide on regular expressions:
Regular expression literals provide compilation of the regular expression when the script is evaluated. When the regular expression will remain constant, use this for better performance.
And from the ECMA-262 spec, §7.8.5 Regular Expression Literals:
A regular expression literal is an input element that is converted to a RegExp object (see 15.10) each time the literal is evaluated.
In other words, it's compiled once when it's evaluated as a script is first parsed.
It's worth noting also, from the ES5 spec, that two literals will compile to two distinct instances of RegExp, even if the literals themselves are the same. Thus if a given literal appears twice within your script, it will be compiled twice, to two distinct instances:
Two regular expression literals in a program evaluate to regular expression objects that never compare as === to each other even if the two literals' contents are identical.
...
... each time the literal is evaluated, a new object is created as if by the expression new RegExp(Pattern, Flags) where RegExp is the standard built-in constructor with that name.
The provided answers don't clearly distinguish between two different processes behind the scene: regexp compilation and regexp object creation when hitting regexp object creation expression.
Yes, using regexp literal syntax, you're gaining the performance benefit of one time regexp compilation.
But if your code executes in ES5+ environment, every time the code path enters the doThing() function in your example, it actually creates a new RegExp object, though, without need to compile the regexp again and again.
In ES5, literal syntax produces a new RegExp object every time code path hits expression that creates a regexp via literal:
function getRE() {
var re = /[a-z]/;
re.foo = "bar";
return re;
}
var reg = getRE(),
re2 = getRE();
console.log(reg === re2); // false
reg.foo = "baz";
console.log(re2.foo); // "bar"
To illustrate the above statements from the point of actual numbers, take a look at the performance difference between storedRegExp and inlineRegExp tests in this jsperf.
storedRegExp would be about 5 - 20% percent faster across browsers than inlineRegExp - the overhead of creating (and garbage collecting) a new RegExp object every time.
Conslusion:
If you're heavily using your literal regexps, consider caching them outside the scope where they are needed, so that they are not only be compiled once, but actual regexp objects for them would be created once as well.
There are two "regular expression" type objects in javascript.
Regular expression instances and the RegExp object.
Also, there are two ways to create regular expression instances:
using the /regex/ syntax and
using new RegExp('regex');
Each of these will create new regular expression instance each time.
However there is only ONE global RegExp object.
var input = 'abcdef';
var r1 = /(abc)/;
var r2 = /(def)/;
r1.exec(input);
alert(RegExp.$1); //outputs 'abc'
r2.exec(input);
alert(RegExp.$1); //outputs 'def'
The actual pattern is compiled as the script is loaded when you use Syntax 1
The pattern argument is compiled into an internal format before use. For Syntax 1, pattern is compiled as the script is loaded. For Syntax 2, pattern is compiled just before use, or when the compile method is called.
But you still could get different regular expression instances each method call. Test in chrome vs firefox
function testregex() {
var localreg = /abc/;
if (testregex.reg != null){
alert(localreg === testregex.reg);
};
testregex.reg = localreg;
}
testregex();
testregex();
It's VERY little overhead, but if you wanted exactly one regex, its safest to only create one instance outside of your function
The regex will be compiled every time you call the function if it's not in literal form.
Since you are including it in a literal form, you've got nothing to worry about.
Here's a quote from websina.com:
Regular expression literals provide compilation of the regular expression when the script is evaluated. When the regular expression will remain constant, use this for better performance.
Calling the constructor function of the RegExp object, as follows:
re = new RegExp("ab+c")
Using the constructor function provides runtime compilation of the regular expression. Use the constructor function when you know the regular expression pattern will be changing, or you don't know the pattern and are getting it from another source, such as user input.