Why such result [duplicate] - javascript

Well, first I should probably ask if this is browser dependent.
I've read that if an invalid token is found, but the section of code is valid until that invalid token, a semicolon is inserted before the token if it is preceded by a line break.
However, the common example cited for bugs caused by semicolon insertion is:
return
_a+b;
..which doesn't seem to follow this rule, since _a would be a valid token.
On the other hand, breaking up call chains works as expected:
$('#myButton')
.click(function(){alert("Hello!")});
Does anyone have a more in-depth description of the rules?

First of all you should know which statements are affected by the automatic semicolon insertion (also known as ASI for brevity):
empty statement
var statement
expression statement
do-while statement
continue statement
break statement
return statement
throw statement
The concrete rules of ASI are described in the specification, §11.9.1 Rules of Automatic Semicolon Insertion.
Three cases are described:
When an offending token is encountered that is not allowed by the grammar, a semicolon is inserted before it if:
The token is separated from the previous token by at least one LineTerminator.
The token is }
e.g.:
{ 1
2 } 3
is transformed to
{ 1
;2 ;} 3;
The NumericLiteral 1 meets the first condition: the token that follows it is separated from it by a line terminator.
The 2 meets the second condition: the following token is }.
When the end of the input stream of tokens is encountered and the parser is unable to parse the input token stream as a single complete Program, then a semicolon is automatically inserted at the end of the input stream.
e.g.:
a = b
++c
is transformed to:
a = b;
++c;
The third case occurs when a token is allowed by some production of the grammar, but the production is a restricted production and a line terminator appears where the production forbids one. A semicolon is then automatically inserted before the restricted token.
Restricted productions:
UpdateExpression :
LeftHandSideExpression [no LineTerminator here] ++
LeftHandSideExpression [no LineTerminator here] --
ContinueStatement :
continue ;
continue [no LineTerminator here] LabelIdentifier ;
BreakStatement :
break ;
break [no LineTerminator here] LabelIdentifier ;
ReturnStatement :
return ;
return [no LineTerminator here] Expression ;
ThrowStatement :
throw [no LineTerminator here] Expression ;
ArrowFunction :
ArrowParameters [no LineTerminator here] => ConciseBody
YieldExpression :
yield [no LineTerminator here] * AssignmentExpression
yield [no LineTerminator here] AssignmentExpression
The classic example, with the ReturnStatement:
return
"something";
is transformed to
return;
"something";

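The three cases described above can be checked with a small sketch. It uses the Function constructor to see whether a piece of source text parses and what it returns. The helper name parses is made up for this illustration and is not part of any API.

```javascript
// Hypothetical helper: true if `source` parses as a function body.
function parses(source) {
  try {
    new Function(source); // compiles the body without running it
    return true;
  } catch (err) {
    if (err instanceof SyntaxError) return false;
    throw err;
  }
}

// Case 1: the line break before `2` and the `}` both permit insertion.
const withLineBreak = parses("{ 1\n2 } 3");
// Same tokens on one line: nothing permits a semicolon between `1` and `2`.
const onOneLine = parses("{ 1 2 } 3");

// `++` after a line break cannot be a postfix operator, so it applies to `c`.
const [a, b, c] = new Function(
  "let a, b = 1, c = 1;\na = b\n++c\nreturn [a, b, c];"
)();

// `return` followed by a line break returns undefined.
const returned = new Function('return\n"something";')();

console.log(withLineBreak, onOneLine, [a, b, c], returned);
```

Here b is only read and c is incremented, which matches the transformation a = b; ++c; shown above.
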
I could not understand those 3 rules in the specs too well -- hope to have something that is more plain English -- but here is what I gathered from JavaScript: The Definitive Guide, 6th Edition, David Flanagan, O'Reilly, 2011:
Quote:
JavaScript does not treat every line break as a semicolon: it usually treats line breaks as semicolons only if it can’t parse the code without the semicolons.
Another quote: for the code
var a
a
=
3
console.log(a)
JavaScript does not treat the second line break as a semicolon because it can continue parsing the longer statement a = 3;
and:
two exceptions to the general rule that JavaScript interprets line breaks as semicolons when it cannot parse the second line as a continuation of the statement on the first line. The first exception involves the return, break, and continue statements
... If a line break appears after any of these words ... JavaScript will always interpret that line break as a semicolon.
... The second exception involves the ++ and -- operators ... If you want to use either of these operators as postfix operators, they must appear on the same line as the expression they apply to. Otherwise, the line break will be treated as a semicolon, and the ++ or -- will be parsed as a prefix operator applied to the code that follows. Consider this code, for example:
x
++
y
It is parsed as x; ++y;, not as x++; y
So I think to simplify it, that means:
In general, JavaScript will treat it as continuation of code as long as it makes sense -- except 2 cases: (1) after some keywords like return, break, continue, and (2) if it sees ++ or -- on a new line, then it will add the ; at the end of the previous line.
The part about "treat it as continuation of code as long as it makes sense" makes it feel like regular expression's greedy matching.
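A minimal sketch of the ++ exception, written without semicolons on purpose. The function and variable names are made up for this illustration.

```javascript
function incrementDemo() {
  let x = 1
  let y = 1
  // The three lines below parse as `x; ++y;`, not as `x++; y`.
  x
  ++
  y
  return { x, y }
}

const counts = incrementDemo();
console.log(counts.x, counts.y);
```

If the ++ were treated as postfix, x would be incremented. Instead x is left alone and y is incremented.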
With the above said, that means for return with a line break, the JavaScript interpreter will insert a ;
(quoted again: If a line break appears after any of these words [such as return] ... JavaScript will always interpret that line break as a semicolon)
and due to this reason, the classic example of
return
{
foo: 1
}
will not work as expected, because the JavaScript interpreter will treat it as:
return; // returning nothing
{
foo: 1
}
There has to be no line-break immediately after the return:
return {
foo: 1
}
for it to work properly. And you may insert a ; yourself if you were to follow the rule of using a ; after any statement:
return {
foo: 1
};
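To see the difference at run time, here is a small sketch that wraps both layouts in functions. The names broken and fixed are made up for this illustration.

```javascript
function broken() {
  // Parsed as `return;` followed by a block containing the label `foo` and the expression `1`.
  return
  {
    foo: 1
  }
}

function fixed() {
  // The opening brace is on the same line as `return`, so this is an object literal.
  return {
    foo: 1
  };
}

const brokenResult = broken();
const fixedResult = fixed();
console.log(brokenResult, fixedResult.foo);
```

Note that the broken version raises no error at all, which is what makes this mistake easy to miss.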

Straight from the ECMA-262, Fifth Edition ECMAScript Specification:
7.9.1 Rules of Automatic Semicolon Insertion
There are three basic rules of semicolon insertion:
When, as the program is parsed from left to right, a token (called the offending token) is encountered that is not allowed by any production of the grammar, then a semicolon is automatically inserted before the offending token if one or more of the following conditions is true:
The offending token is separated from the previous token by at least one LineTerminator.
The offending token is }.
When, as the program is parsed from left to right, the end of the input stream of tokens is encountered and the parser is unable to parse the input token stream as a single complete ECMAScript Program, then a semicolon is automatically inserted at the end of the input stream.
When, as the program is parsed from left to right, a token is encountered that is allowed by some production of the grammar, but the production is a restricted production and the token would be the first token for a terminal or nonterminal immediately following the annotation "[no LineTerminator here]" within the restricted production (and therefore such a token is called a restricted token), and the restricted token is separated from the previous token by at least one LineTerminator, then a semicolon is automatically inserted before the restricted token.
However, there is an additional overriding condition on the preceding rules: a semicolon is never inserted automatically if the semicolon would then be parsed as an empty statement or if that semicolon would become one of the two semicolons in the header of a for statement (see 12.6.3).
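The overriding condition can be illustrated with a rough sketch. It checks whether source text parses by handing it to the Function constructor; the helper name parses is made up for this illustration.

```javascript
// Hypothetical helper: true if `source` parses as a function body.
function parses(source) {
  try {
    new Function(source); // compiles the body without running it
    return true;
  } catch (err) {
    if (err instanceof SyntaxError) return false;
    throw err;
  }
}

// Line breaks never stand in for the two semicolons of a for header.
const forWithLineBreaks = parses("for (let i = 0\ni < 3\ni++) {}");
const forWithSemicolons = parses("for (let i = 0;\ni < 3;\ni++) {}");

// A semicolon is not inserted if it would become an empty statement.
const ifWithoutSemicolon = parses("if (true)\nelse {}");
const ifWithSemicolon = parses("if (true);\nelse {}");

console.log(forWithLineBreaks, forWithSemicolons, ifWithoutSemicolon, ifWithSemicolon);
```

In both pairs only the version with explicit semicolons parses.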

Regarding semicolon insertion and the var statement, beware forgetting the comma when using var but spanning multiple lines. Somebody found this in my code yesterday:
var srcRecords = src.records
srcIds = [];
It ran, but the srcIds declaration/assignment was global: the var statement on the previous line was considered finished due to automatic semicolon insertion, so its local declaration no longer applied to srcIds.
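A rough sketch of that leak follows. It builds the functions with the Function constructor so that the first one runs as non-strict code regardless of the surrounding file. The parameter name src and the sample data are assumptions made for this illustration.

```javascript
// Non-strict: the missing comma turns `srcIds = []` into an implicit global.
const leaky = new Function(
  "src",
  "var srcRecords = src.records\nsrcIds = [];\nreturn srcRecords;"
);
leaky({ records: [1, 2] });
const leakedToGlobal = Array.isArray(globalThis.srcIds);
delete globalThis.srcIds; // clean up the accidental global

// Strict mode turns the same mistake into a ReferenceError.
const strictVersion = new Function(
  "src",
  '"use strict";\nvar srcRecords = src.records\nsrcIds = [];'
);
let strictError = null;
try {
  strictVersion({ records: [1, 2] });
} catch (err) {
  strictError = err;
}

console.log(leakedToGlobal, strictError instanceof ReferenceError);
```

Strict mode (or a linter) is one way to catch this kind of slip early.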

The most contextual description of JavaScript's Automatic Semicolon Insertion I have found comes from the book Crafting Interpreters.
JavaScript’s “automatic semicolon insertion” rule is the odd one. Where other languages assume most newlines are meaningful and only a few should be ignored in multi-line statements, JS assumes the opposite. It treats all of your newlines as meaningless whitespace unless it encounters a parse error. If it does, it goes back and tries turning the previous newline into a semicolon to get something grammatically valid.
He goes on to describe it as you would a code smell.
This design note would turn into a design diatribe if I went into complete detail about how that even works, much less all the various ways that that is a bad idea. It’s a mess. JavaScript is the only language I know where many style guides demand explicit semicolons after every statement even though the language theoretically lets you elide them.

Just to add,
const foo = function(){ return "foo" } //this doesn't add a semicolon here.
(function (){
console.log("aa");
})()
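No semicolon is inserted after the first line, so the immediately invoked function expression (IIFE) on the following lines is parsed as a call on the preceding function expression.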
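A small sketch of what happens at run time. The source text is passed to the Function constructor so the error can be caught, and console.log is replaced by pushing into an array named log; both the helper run and the log array are assumptions made for this illustration.

```javascript
// Hypothetical helper: runs `source` with a `log` array and reports any error.
function run(source) {
  const log = [];
  try {
    new Function("log", source)(log);
    return { log, error: null };
  } catch (error) {
    return { log, error };
  }
}

// Without a semicolon the second line becomes arguments to the first function,
// and the string it returns is then called as if it were a function.
const withoutSemicolon = run(
  'const foo = function () { return "foo" }\n(function () { log.push("aa"); })()'
);

// With a semicolon the IIFE runs on its own.
const withSemicolon = run(
  'const foo = function () { return "foo" };\n(function () { log.push("aa"); })()'
);

console.log(withoutSemicolon.error instanceof TypeError, withSemicolon.log);
```

In the first run the inner function is never invoked, so nothing is logged before the TypeError is thrown.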

Most statements and declarations in JavaScript must be terminated with a semicolon. However, for the convenience of the programmer (less typing, stylistic preference, less code noise, lower barrier to entry), semicolons may be omitted in some source text locations, with the runtime automatically inserting semicolons according to a set of rules set out in the spec.
Over-arching rules: a semicolon is never inserted automatically if the semicolon would then be parsed as an empty statement or if that semicolon would become one of the two semicolons in the header of a for statement.
Rule 1
A semicolon will be automatically inserted when the JavaScript parser encounters a token that the grammar does not allow at that point, provided that the token is separated from the previous token by one or more line terminators (e.g. newlines), the token is a closing brace }, or the previous token is the final parenthesis ) of a do-while loop.
In other words: source text locations where statements would always need to be terminated anyway for a runnable program, will have the statement terminator (;) inserted automatically if it is omitted. This rule is the heart of ASI.
Rule 2
A semicolon will be inserted at the end of the program if the source text is not otherwise a valid script or module. In other words: programmers can omit the final semicolon in a program.
Rule 3
A semicolon will be automatically inserted if a token is encountered that would normally be allowed if a semicolon did not exist, but exists within one of several special source text locations (restricted productions) that explicitly disallow line terminators within them for reasons of avoiding ambiguity.
The restricted productions inside of which line terminators are prohibited are:
before postfix ++ and postfix -- (so the unary increment/decrement operators after a newline will bind to the following (not previous) statement, as a prefix operator)
after continue, break, throw, return, yield
after arrow function parameter lists, and
after the async keyword in async function declarations & expressions, generator function declarations & expressions & methods, and async arrow functions
The spec contains the full details, plus the following practical advice:
The resulting practical advice to ECMAScript programmers is:
A postfix ++ or -- operator should be on the same line as its operand.
An Expression in a return or throw statement or an AssignmentExpression in a yield expression should start on the same line as the return, throw, or yield token.
A LabelIdentifier in a break or continue statement should be on the same line as the break or continue token.
The end of an arrow function's parameter(s) and its => should be on the same line.
The async token preceding an asynchronous function or method should be on the same line as the immediately following token.
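Some of the restricted productions listed above can be probed with a rough sketch that checks whether source text parses. The helper name parses and the sample snippets are made up for this illustration.

```javascript
// Hypothetical helper: true if `source` parses as a function body.
function parses(source) {
  try {
    new Function(source); // compiles the body without running it
    return true;
  } catch (err) {
    if (err instanceof SyntaxError) return false;
    throw err;
  }
}

// A line break is fine after `=>`, but not between the parameters and `=>`.
const arrowBreakAfter = parses("const f = (a) =>\na");
const arrowBreakBefore = parses("const f = (a)\n=> a");

// `throw` must be followed by its expression on the same line.
const throwBroken = parses("throw\nnew Error('x')");

// `async` on its own line is just an identifier, so `f` is an ordinary function.
const plainResult = new Function(
  "var async;\nasync\nfunction f() { return 1 }\nreturn f();"
)();

console.log(arrowBreakAfter, arrowBreakBefore, throwBroken, plainResult);
```

The last case parses without complaint, but f returns a plain value rather than a promise.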
And this is the best article on the Web on this subject.
ASI Gotcha Examples
Starting a line with `(`
The opening parenthesis character has multiple meanings. It can delineate an expression, or it can indicate an invocation (when paired with a closing parenthesis).
For example, the following throws "Uncaught TypeError: console.log(...) is not a function" because the runtime attempts to invoke the return value of console.log('bar'):
let a = 'foo'
console.log('bar')
(a = 'bam')
One solution for this, if you are generally omitting semicolons, is to include a semicolon to make your intentions unambiguous:
let a = 'foo'
console.log('bar')
;(a = 'bam') // note semicolon at start of line
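Both variants can be compared with a small sketch. The source text is run through the Function constructor so the error can be caught, and a stand-in named log replaces console.log to keep the output quiet; the helper run and the stand-in are assumptions made for this illustration.

```javascript
// Hypothetical helper: runs `source` with a `log` stand-in and reports the outcome.
function run(source) {
  const calls = [];
  const log = (msg) => { calls.push(msg); }; // returns undefined, like console.log
  try {
    const a = new Function("log", source)(log);
    return { a, calls, error: null };
  } catch (error) {
    return { a: undefined, calls, error };
  }
}

// Parsed as log('bar')(a = 'bam'): the result of log is called as a function.
const broken = run("let a = 'foo'\nlog('bar')\n(a = 'bam')\nreturn a");

// The leading semicolon ends the previous statement.
const fixed = run("let a = 'foo'\nlog('bar')\n;(a = 'bam')\nreturn a");

console.log(broken.error instanceof TypeError, fixed.a);
```

In the broken variant log is still called once before the TypeError is thrown.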
Starting a line with `[`
The opening bracket character ([) has multiple meanings. It can indicate an object property access, or it can indicate the literal declaration of an array (when paired with a closing bracket), or it can indicate an array destructuring.
For example, the following throws "Uncaught TypeError: Cannot set properties of undefined (setting 'foo')" because the runtime attempts to set the value of a property named 'foo' on the response of console.log('bar'):
let a = 'foo'
console.log('bar')
[a] = ['bam']
One solution for this, if you are generally omitting semicolons, is to include a semicolon to make your intentions unambiguous:
let a = 'foo'
console.log('bar')
;[a] = ['bam'] // note semicolon at start of line
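The same comparison for the bracket case, this time written directly as functions. The log function is a made-up stand-in for console.log that returns undefined.

```javascript
const log = () => undefined; // hypothetical stand-in for console.log

function withoutSemicolon() {
  let a = 'foo'
  log('bar')
  [a] = ['bam'] // parsed as log('bar')[a] = ['bam']
  return a
}

function withSemicolon() {
  let a = 'foo'
  log('bar')
  ;[a] = ['bam'] // array destructuring assignment
  return a
}

let thrown = null;
try {
  withoutSemicolon();
} catch (err) {
  thrown = err;
}
const fixedValue = withSemicolon();

console.log(thrown instanceof TypeError, fixedValue);
```

If the previous line happened to evaluate to an object, the broken variant would silently set a property on it instead of throwing.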

Related

Why JavaScript throw error when missing semicolon in this situation? [duplicate]

Well, first I should probably ask if this is browser dependent.
I've read that if an invalid token is found, but the section of code is valid until that invalid token, a semicolon is inserted before the token if it is preceded by a line break.
However, the common example cited for bugs caused by semicolon insertion is:
return
_a+b;
..which doesn't seem to follow this rule, since _a would be a valid token.
On the other hand, breaking up call chains works as expected:
$('#myButton')
.click(function(){alert("Hello!")});
Does anyone have a more in-depth description of the rules?
First of all you should know which statements are affected by the automatic semicolon insertion (also known as ASI for brevity):
empty statement
var statement
expression statement
do-while statement
continue statement
break statement
return statement
throw statement
The concrete rules of ASI, are described in the specification §11.9.1 Rules of Automatic Semicolon Insertion
Three cases are described:
When an offending token is encountered that is not allowed by the grammar, a semicolon is inserted before it if:
The token is separated from the previous token by at least one LineTerminator.
The token is }
e.g.:
{ 1
2 } 3
is transformed to
{ 1
;2 ;} 3;
The NumericLiteral 1 meets the first condition, the following token is a line terminator.
The 2 meets the second condition, the following token is }.
When the end of the input stream of tokens is encountered and the parser is unable to parse the input token stream as a single complete Program, then a semicolon is automatically inserted at the end of the input stream.
e.g.:
a = b
++c
is transformed to:
a = b;
++c;
This case occurs when a token is allowed by some production of the grammar, but the production is a restricted production, a semicolon is automatically inserted before the restricted token.
Restricted productions:
UpdateExpression :
LeftHandSideExpression [no LineTerminator here] ++
LeftHandSideExpression [no LineTerminator here] --
ContinueStatement :
continue ;
continue [no LineTerminator here] LabelIdentifier ;
BreakStatement :
break ;
break [no LineTerminator here] LabelIdentifier ;
ReturnStatement :
return ;
return [no LineTerminator here] Expression ;
ThrowStatement :
throw [no LineTerminator here] Expression ;
ArrowFunction :
ArrowParameters [no LineTerminator here] => ConciseBody
YieldExpression :
yield [no LineTerminator here] * AssignmentExpression
yield [no LineTerminator here] AssignmentExpression
The classic example, with the ReturnStatement:
return
"something";
is transformed to
return;
"something";
I could not understand those 3 rules in the specs too well -- hope to have something that is more plain English -- but here is what I gathered from JavaScript: The Definitive Guide, 6th Edition, David Flanagan, O'Reilly, 2011:
Quote:
JavaScript does not treat every line break as a semicolon: it usually treats line breaks as semicolons only if it can’t parse the code without the semicolons.
Another quote: for the code
var a
a
=
3 console.log(a)
JavaScript does not treat the second line break as a semicolon because it can continue parsing the longer statement a = 3;
and:
two exceptions to the general rule that JavaScript interprets line breaks as semicolons when it cannot parse the second line as a continuation of the statement on the first line. The first exception involves the return, break, and continue statements
... If a line break appears after any of these words ... JavaScript will always interpret that line break as a semicolon.
... The second exception involves the ++ and −− operators ... If you want to use either of these operators as postfix operators, they must appear on the same line as the expression they apply to. Otherwise, the line break will be treated as a semicolon, and the ++ or -- will be parsed as a prefix operator applied to the code that follows. Consider this code, for example:
x
++
y
It is parsed as x; ++y;, not as x++; y
So I think to simplify it, that means:
In general, JavaScript will treat it as continuation of code as long as it makes sense -- except 2 cases: (1) after some keywords like return, break, continue, and (2) if it sees ++ or -- on a new line, then it will add the ; at the end of the previous line.
The part about "treat it as continuation of code as long as it makes sense" makes it feel like regular expression's greedy matching.
With the above said, that means for return with a line break, the JavaScript interpreter will insert a ;
(quoted again: If a line break appears after any of these words [such as return] ... JavaScript will always interpret that line break as a semicolon)
and due to this reason, the classic example of
return
{
foo: 1
}
will not work as expected, because the JavaScript interpreter will treat it as:
return; // returning nothing
{
foo: 1
}
There has to be no line-break immediately after the return:
return {
foo: 1
}
for it to work properly. And you may insert a ; yourself if you were to follow the rule of using a ; after any statement:
return {
foo: 1
};
Straight from the ECMA-262, Fifth Edition ECMAScript Specification:
7.9.1 Rules of Automatic Semicolon Insertion
There are three basic rules of semicolon insertion:
When, as the program is parsed from left to right, a token (called the offending token) is encountered that is not allowed by any production of the grammar, then a semicolon is automatically inserted before the offending token if one or more of the following conditions is true:
The offending token is separated from the previous token by at least one LineTerminator.
The offending token is }.
When, as the program is parsed from left to right, the end of the input stream of tokens is encountered and the parser is unable to parse the input token stream as a single complete ECMAScript Program, then a semicolon is automatically inserted at the end of the input stream.
When, as the program is parsed from left to right, a token is encountered that is allowed by some production of the grammar, but the production is a restricted production and the token would be the first token for a terminal or nonterminal immediately following the annotation "[no LineTerminator here]" within the restricted production (and therefore such a token is called a restricted token), and the restricted token is separated from the previous token by at least one LineTerminator, then a semicolon is automatically inserted before the restricted token.
However, there is an additional overriding condition on the preceding rules: a semicolon is never inserted automatically if the semicolon would then be parsed as an empty statement or if that semicolon would become one of the two semicolons in the header of a for statement (see 12.6.3).
Regarding semicolon insertion and the var statement, beware forgetting the comma when using var but spanning multiple lines. Somebody found this in my code yesterday:
var srcRecords = src.records
srcIds = [];
It ran but the effect was that the srcIds declaration/assignment was global because the local declaration with var on the previous line no longer applied as that statement was considered finished due to automatic semi-colon insertion.
The most contextual description of JavaScript's Automatic Semicolon Insertion I have found comes from a book about Crafting Interpreters.
JavaScript’s “automatic semicolon insertion” rule is the odd one. Where other languages assume most newlines are meaningful and only a few should be ignored in multi-line statements, JS assumes the opposite. It treats all of your newlines as meaningless whitespace unless it encounters a parse error. If it does, it goes back and tries turning the previous newline into a semicolon to get something grammatically valid.
He goes on to describe it as you would code smell.
This design note would turn into a design diatribe if I went into complete detail about how that even works, much less all the various ways that that is a bad idea. It’s a mess. JavaScript is the only language I know where many style guides demand explicit semicolons after every statement even though the language theoretically lets you elide them.
Just to add,
const foo = function(){ return "foo" } //this doesn't add a semicolon here.
(function (){
console.log("aa");
})()
see this, using immediately invoked function expression(IIFE)
Most statements and declarations in JavaScript must be terminated with a semicolon, however, for the convenience of the programmer (less typing, stylistic preference, less code noise, lower barrier to entry), semicolons may be omitted in some source text locations, with the runtime automatically inserting semicolons according to a set of rules set-out in the spec.
Over-arching rules: a semicolon is never inserted automatically if the semicolon would then be parsed as an empty statement or if that semicolon would become one of the two semicolons in the header of a for statement.
Rule 1
A semicolon will be automatically inserted if a token is encountered by the JavaScript parser that both would not be allowed if a semicolon did not exist, and that token is separated from the previous by one or more line terminators (eg. newlines), a closing brace }, or the final parenthesis ()) of a do-while loop.
In other words: source text locations where statements would always need to be terminated anyway for a runnable program, will have the statement terminator (;) inserted automatically if it is omitted. This rule is the heart of ASI.
Rule 2
A semicolon will be inserted at the end of the program if the source text is not otherwise a valid script or module. In other words: programmers can omit the final semicolon in a program.
Rule 3
A semicolon will be automatically inserted if a token is encountered that would normally be allowed if a semicolon did not exist, but exists within one of several special source text locations (restricted productions) that explicitly disallow line terminators within them for reasons of avoiding ambiguity.
The restricted productions inside of which line terminators are prohibited are:
before postfix ++ and postfix -- (so the unary increment/decrement operators after a newline will bind to the following (not previous) statement, as a prefix operator)
after continue, break, throw, return, yield
after arrow function parameter lists, and
after the async keyword in async function declarations & expressions, generator function declarations & expressions & methods, and async arrow functions
The spec contains the full details, plus the following practical advice:
The resulting practical advice to ECMAScript programmers is:
A postfix ++ or -- operator should be on the same line as its operand.
An Expression in a return or throw statement or an
AssignmentExpression in a yield expression should start on the same
line as the return, throw, or yield token.
A LabelIdentifier in a break or continue statement should be on the same line as the break or continue token.
The end of an arrow function's parameter(s) and its => should be on the same line.
The async token preceding an asynchronous function or method should be on the same line as the immediately following token.
And this is the best article on the Web on this subject.
ASI Gotcha Examples
Starting a line with `(`
The opening parenthesis character has multiple meanings. It can delineate an expression, or it can indicate an invocation (when paired with a closing parenthesis).
For example, the following throws "Uncaught TypeError: console.log(...) is not a function" because the runtime attempts to invoke the return value of console.log('bar'):
let a = 'foo'
console.log('bar')
(a = 'bam')
One solution for this, if you are generally omitting semicolons, is to include a semicolon to make your intentions unambiguous:
let a = 'foo'
console.log('bar')
;(a = 'bam') // note semicolon at start of line
Starting a line with `[`
The opening bracket character ([) has multiple meanings. It can indicate an object property access, or it can indicate the literal declaration of an array (when paired with a closing bracket), or it can indicate an array destructuring.
For example, the following throws "Uncaught TypeError: Cannot set properties of undefined (setting 'foo')" because the runtime attempts to set the value of a property named 'foo' on the response of console.log('bar'):
let a = 'foo'
console.log('bar')
[a] = ['bam']
One solution for this, if you are generally omitting semicolons, is to include a semicolon to make your intentions unambiguous:
let a = 'foo'
console.log('bar')
;[a] = ['bam'] // note semicolon at start of line

Why do newlines effect eval return? [duplicate]

Well, first I should probably ask if this is browser dependent.
I've read that if an invalid token is found, but the section of code is valid until that invalid token, a semicolon is inserted before the token if it is preceded by a line break.
However, the common example cited for bugs caused by semicolon insertion is:
return
_a+b;
..which doesn't seem to follow this rule, since _a would be a valid token.
On the other hand, breaking up call chains works as expected:
$('#myButton')
.click(function(){alert("Hello!")});
Does anyone have a more in-depth description of the rules?
First of all you should know which statements are affected by the automatic semicolon insertion (also known as ASI for brevity):
empty statement
var statement
expression statement
do-while statement
continue statement
break statement
return statement
throw statement
The concrete rules of ASI, are described in the specification §11.9.1 Rules of Automatic Semicolon Insertion
Three cases are described:
When an offending token is encountered that is not allowed by the grammar, a semicolon is inserted before it if:
The token is separated from the previous token by at least one LineTerminator.
The token is }
e.g.:
{ 1
2 } 3
is transformed to
{ 1
;2 ;} 3;
The NumericLiteral 1 meets the first condition, the following token is a line terminator.
The 2 meets the second condition, the following token is }.
When the end of the input stream of tokens is encountered and the parser is unable to parse the input token stream as a single complete Program, then a semicolon is automatically inserted at the end of the input stream.
e.g.:
a = b
++c
is transformed to:
a = b;
++c;
This case occurs when a token is allowed by some production of the grammar, but the production is a restricted production, a semicolon is automatically inserted before the restricted token.
Restricted productions:
UpdateExpression :
LeftHandSideExpression [no LineTerminator here] ++
LeftHandSideExpression [no LineTerminator here] --
ContinueStatement :
continue ;
continue [no LineTerminator here] LabelIdentifier ;
BreakStatement :
break ;
break [no LineTerminator here] LabelIdentifier ;
ReturnStatement :
return ;
return [no LineTerminator here] Expression ;
ThrowStatement :
throw [no LineTerminator here] Expression ;
ArrowFunction :
ArrowParameters [no LineTerminator here] => ConciseBody
YieldExpression :
yield [no LineTerminator here] * AssignmentExpression
yield [no LineTerminator here] AssignmentExpression
The classic example, with the ReturnStatement:
return
"something";
is transformed to
return;
"something";
I could not understand those 3 rules in the specs too well -- hope to have something that is more plain English -- but here is what I gathered from JavaScript: The Definitive Guide, 6th Edition, David Flanagan, O'Reilly, 2011:
Quote:
JavaScript does not treat every line break as a semicolon: it usually treats line breaks as semicolons only if it can’t parse the code without the semicolons.
Another quote: for the code
var a
a
=
3 console.log(a)
JavaScript does not treat the second line break as a semicolon because it can continue parsing the longer statement a = 3;
and:
two exceptions to the general rule that JavaScript interprets line breaks as semicolons when it cannot parse the second line as a continuation of the statement on the first line. The first exception involves the return, break, and continue statements
... If a line break appears after any of these words ... JavaScript will always interpret that line break as a semicolon.
... The second exception involves the ++ and −− operators ... If you want to use either of these operators as postfix operators, they must appear on the same line as the expression they apply to. Otherwise, the line break will be treated as a semicolon, and the ++ or -- will be parsed as a prefix operator applied to the code that follows. Consider this code, for example:
x
++
y
It is parsed as x; ++y;, not as x++; y
So I think to simplify it, that means:
In general, JavaScript will treat it as continuation of code as long as it makes sense -- except 2 cases: (1) after some keywords like return, break, continue, and (2) if it sees ++ or -- on a new line, then it will add the ; at the end of the previous line.
The part about "treat it as continuation of code as long as it makes sense" makes it feel like regular expression's greedy matching.
With the above said, that means for return with a line break, the JavaScript interpreter will insert a ;
(quoted again: If a line break appears after any of these words [such as return] ... JavaScript will always interpret that line break as a semicolon)
and due to this reason, the classic example of
return
{
foo: 1
}
will not work as expected, because the JavaScript interpreter will treat it as:
return; // returning nothing
{
foo: 1
}
There has to be no line-break immediately after the return:
return {
foo: 1
}
for it to work properly. And you may insert a ; yourself if you were to follow the rule of using a ; after any statement:
return {
foo: 1
};
Straight from the ECMA-262, Fifth Edition ECMAScript Specification:
7.9.1 Rules of Automatic Semicolon Insertion
There are three basic rules of semicolon insertion:
When, as the program is parsed from left to right, a token (called the offending token) is encountered that is not allowed by any production of the grammar, then a semicolon is automatically inserted before the offending token if one or more of the following conditions is true:
The offending token is separated from the previous token by at least one LineTerminator.
The offending token is }.
When, as the program is parsed from left to right, the end of the input stream of tokens is encountered and the parser is unable to parse the input token stream as a single complete ECMAScript Program, then a semicolon is automatically inserted at the end of the input stream.
When, as the program is parsed from left to right, a token is encountered that is allowed by some production of the grammar, but the production is a restricted production and the token would be the first token for a terminal or nonterminal immediately following the annotation "[no LineTerminator here]" within the restricted production (and therefore such a token is called a restricted token), and the restricted token is separated from the previous token by at least one LineTerminator, then a semicolon is automatically inserted before the restricted token.
However, there is an additional overriding condition on the preceding rules: a semicolon is never inserted automatically if the semicolon would then be parsed as an empty statement or if that semicolon would become one of the two semicolons in the header of a for statement (see 12.6.3).
Regarding semicolon insertion and the var statement, beware forgetting the comma when using var but spanning multiple lines. Somebody found this in my code yesterday:
var srcRecords = src.records
srcIds = [];
It ran, but the srcIds assignment created a global: automatic semicolon insertion ended the var statement after the first line, so the second line was no longer part of the local declaration.
The most contextual description of JavaScript's Automatic Semicolon Insertion I have found comes from the book Crafting Interpreters.
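The leak can be reproduced in isolation. This is a sketch under a few assumptions: the snippet is wrapped in functions built with the `Function` constructor so that one copy runs in sloppy mode and one in strict mode, and the names `sloppyFn` and `strictFn` are hypothetical.

```javascript
// The two lines from above, with the comma missing after src.records.
const body = "var srcRecords = src.records\nsrcIds = [];\nreturn srcRecords.length;";

const strictFn = new Function("src", '"use strict";\n' + body);
const sloppyFn = new Function("src", body);

// In strict mode, assigning to the undeclared srcIds throws.
let strictError = null;
try {
  strictFn({ records: [] });
} catch (err) {
  strictError = err;
}
console.log(strictError instanceof ReferenceError); // true

// In sloppy mode the assignment silently creates a global variable.
sloppyFn({ records: [1, 2] });
const leaked = Array.isArray(globalThis.srcIds);
console.log(leaked); // true

delete globalThis.srcIds; // clean up the accidental global
```

Running such code in strict mode (or as a module) turns the silent leak into an error at the offending line.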
JavaScript’s “automatic semicolon insertion” rule is the odd one. Where other languages assume most newlines are meaningful and only a few should be ignored in multi-line statements, JS assumes the opposite. It treats all of your newlines as meaningless whitespace unless it encounters a parse error. If it does, it goes back and tries turning the previous newline into a semicolon to get something grammatically valid.
He goes on to describe it the way you would describe a code smell.
This design note would turn into a design diatribe if I went into complete detail about how that even works, much less all the various ways that that is a bad idea. It’s a mess. JavaScript is the only language I know where many style guides demand explicit semicolons after every statement even though the language theoretically lets you elide them.
Just to add,
const foo = function(){ return "foo" } // no semicolon is inserted here
(function (){
console.log("aa");
})()
See this case, which uses an immediately invoked function expression (IIFE).
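To make the effect concrete: because the second line starts with (, the parser continues the first statement, calls the function expression with the IIFE's function as its argument, and then tries to call the returned string. The sketch below compares both variants; the helper `run` and the `log` array are hypothetical and only there to record what happened.

```javascript
// Hypothetical helper: runs `source` as a function body and reports what happened.
function run(source) {
  const log = [];
  try {
    new Function("log", source)(log);
    return { log, error: null };
  } catch (err) {
    return { log, error: err.constructor.name };
  }
}

const withoutSemicolon = run(
  'const foo = function(){ return "foo" }\n' +
  '(function (){ log.push("aa") })()'
);

const withSemicolon = run(
  'const foo = function(){ return "foo" };\n' +
  '(function (){ log.push("aa") })()'
);

console.log(withoutSemicolon); // the IIFE never runs and a TypeError is thrown
console.log(withSemicolon);    // the IIFE runs and records "aa"
```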
Most statements and declarations in JavaScript must be terminated with a semicolon. However, for the convenience of the programmer (less typing, stylistic preference, less code noise, lower barrier to entry), semicolons may be omitted in some source text locations, and the parser inserts them automatically according to a set of rules set out in the spec.
Overarching rule: a semicolon is never inserted automatically if the semicolon would then be parsed as an empty statement or if that semicolon would become one of the two semicolons in the header of a for statement.
Rule 1
A semicolon will be automatically inserted if the JavaScript parser encounters a token that would not be allowed without a semicolon before it, and one of the following holds: the token is separated from the previous token by one or more line terminators (e.g. newlines), the token is a closing brace }, or the previous token is the closing parenthesis ) of a do-while loop.
In other words: source text locations where statements would always need to be terminated anyway for a runnable program, will have the statement terminator (;) inserted automatically if it is omitted. This rule is the heart of ASI.
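A short sketch of Rule 1 at work, written without any statement-terminating semicolons except the one inside the do-while body. The variable and function names are made up for illustration.

```javascript
let count = 0
let seen = null

// The previous token is the ")" that closes a do-while, so a semicolon is
// inserted before `seen` even though both are on the same line.
do count++; while (count < 3) seen = count

// The offending token "}" gets a semicolon inserted before it.
function current() { return count }

console.log(seen, current()) // 3 3
```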
Rule 2
A semicolon will be inserted at the end of the program if the source text is not otherwise a valid script or module. In other words: programmers can omit the final semicolon in a program.
Rule 3
A semicolon will be automatically inserted if a token is encountered that would normally be allowed without a semicolon, but it follows a line terminator at one of several special source text locations (restricted productions) that explicitly disallow line terminators in order to avoid ambiguity.
The restricted productions inside of which line terminators are prohibited are:
before postfix ++ and postfix -- (so the unary increment/decrement operators after a newline will bind to the following (not previous) statement, as a prefix operator)
after continue, break, throw, return, yield
after arrow function parameter lists, and
after the async keyword in async function declarations & expressions, async generator function declarations & expressions & methods, and async arrow functions
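The restricted productions listed above behave in two different ways when a line break appears in the forbidden spot: some silently change meaning, others fail to parse. The sketch below shows both; `parses` is a hypothetical helper and the snippets are illustrative only.

```javascript
// Hypothetical helper: returns true if `source` parses as a function body.
function parses(source) {
  try {
    new Function(source);
    return true;
  } catch (err) {
    if (err instanceof SyntaxError) return false;
    throw err;
  }
}

// A line break before ++ makes it a prefix operator on the next operand.
const split = new Function("let x = 1, y = 1\nx\n++\ny\nreturn [x, y]");
const [x, y] = split();
console.log(x, y); // 1 2

// A line break after return ends the statement; the value is never returned.
const returned = new Function("return\n42")();
console.log(returned); // undefined

// throw needs an expression and an arrow needs its =>, so these cannot be repaired.
const throwSplit = parses("throw\nnew Error('boom')");
const arrowSplit = parses("const id = (a)\n=> a");
console.log(throwSplit, arrowSplit); // false false
```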
The spec contains the full details, plus the following practical advice:
The resulting practical advice to ECMAScript programmers is:
A postfix ++ or -- operator should be on the same line as its operand.
An Expression in a return or throw statement or an AssignmentExpression in a yield expression should start on the same line as the return, throw, or yield token.
A LabelIdentifier in a break or continue statement should be on the same line as the break or continue token.
The end of an arrow function's parameter(s) and its => should be on the same line.
The async token preceding an asynchronous function or method should be on the same line as the immediately following token.
And this is the best article on the Web on this subject.
ASI Gotcha Examples
Starting a line with `(`
The opening parenthesis character has multiple meanings. It can delineate an expression, or it can indicate an invocation (when paired with a closing parenthesis).
For example, the following throws "Uncaught TypeError: console.log(...) is not a function" because the runtime attempts to invoke the return value of console.log('bar'):
let a = 'foo'
console.log('bar')
(a = 'bam')
One solution for this, if you are generally omitting semicolons, is to include a semicolon to make your intentions unambiguous:
let a = 'foo'
console.log('bar')
;(a = 'bam') // note semicolon at start of line
Starting a line with `[`
The opening bracket character ([) has multiple meanings. It can indicate an object property access, or it can indicate the literal declaration of an array (when paired with a closing bracket), or it can indicate an array destructuring.
For example, the following throws "Uncaught TypeError: Cannot set properties of undefined (setting 'foo')" because the runtime attempts to set the value of a property named 'foo' on the return value of console.log('bar'):
let a = 'foo'
console.log('bar')
[a] = ['bam']
One solution for this, if you are generally omitting semicolons, is to include a semicolon to make your intentions unambiguous:
let a = 'foo'
console.log('bar')
;[a] = ['bam'] // note semicolon at start of line
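Both gotchas, and the leading-semicolon fix, can be checked side by side. This is a sketch: `outcome` is a hypothetical helper, and a stub `log` function that returns undefined stands in for console.log so the snippets run quietly.

```javascript
// Hypothetical helper: runs `source` as a function body with a stub `log`
// and returns either the final value of `a` or the name of the thrown error.
function outcome(source) {
  const log = () => undefined;
  try {
    return new Function("log", source)(log);
  } catch (err) {
    return err.constructor.name;
  }
}

const parenBroken = outcome("let a = 'foo'\nlog('bar')\n(a = 'bam')\nreturn a");
const parenFixed = outcome("let a = 'foo'\nlog('bar')\n;(a = 'bam')\nreturn a");
const bracketBroken = outcome("let a = 'foo'\nlog('bar')\n[a] = ['bam']\nreturn a");
const bracketFixed = outcome("let a = 'foo'\nlog('bar')\n;[a] = ['bam']\nreturn a");

console.log(parenBroken, parenFixed);     // TypeError bam
console.log(bracketBroken, bracketFixed); // TypeError bam
```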

How Javascript decides when new line character separates statements? [duplicate]

Well, first I should probably ask if this is browser dependent.
I've read that if an invalid token is found, but the section of code is valid until that invalid token, a semicolon is inserted before the token if it is preceded by a line break.
However, the common example cited for bugs caused by semicolon insertion is:
return
_a+b;
..which doesn't seem to follow this rule, since _a would be a valid token.
On the other hand, breaking up call chains works as expected:
$('#myButton')
.click(function(){alert("Hello!")});
Does anyone have a more in-depth description of the rules?
First of all you should know which statements are affected by the automatic semicolon insertion (also known as ASI for brevity):
empty statement
var statement
expression statement
do-while statement
continue statement
break statement
return statement
throw statement
The concrete rules of ASI are described in the specification, §11.9.1 Rules of Automatic Semicolon Insertion.
Three cases are described:
When an offending token is encountered that is not allowed by the grammar, a semicolon is inserted before it if:
The token is separated from the previous token by at least one LineTerminator.
The token is }
e.g.:
{ 1
2 } 3
is transformed to
{ 1
;2 ;} 3;
The semicolon after 1 is inserted under the first condition: the offending token 2 is separated from 1 by a line terminator.
The semicolon after 2 is inserted under the second condition: the offending token is }.
When the end of the input stream of tokens is encountered and the parser is unable to parse the input token stream as a single complete Program, then a semicolon is automatically inserted at the end of the input stream.
e.g.:
a = b
++c
is transformed to:
a = b;
++c;
The third case occurs when a token is allowed by some production of the grammar, but the production is a restricted production. If the restricted token is separated from the previous token by a line terminator, a semicolon is automatically inserted before the restricted token.
Restricted productions:
UpdateExpression :
LeftHandSideExpression [no LineTerminator here] ++
LeftHandSideExpression [no LineTerminator here] --
ContinueStatement :
continue ;
continue [no LineTerminator here] LabelIdentifier ;
BreakStatement :
break ;
break [no LineTerminator here] LabelIdentifier ;
ReturnStatement :
return ;
return [no LineTerminator here] Expression ;
ThrowStatement :
throw [no LineTerminator here] Expression ;
ArrowFunction :
ArrowParameters [no LineTerminator here] => ConciseBody
YieldExpression :
yield [no LineTerminator here] * AssignmentExpression
yield [no LineTerminator here] AssignmentExpression
The classic example, with the ReturnStatement:
return
"something";
is transformed to
return;
"something";
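The a = b / ++c transformation from the second case can be checked at runtime. This is only a minimal sketch: the variable names come from the example, while the initial values are arbitrary assumptions chosen so that the two possible parses give different results.

```javascript
// Initial values are arbitrary; they only make the two readings distinguishable.
let a = 0;
let b = 1;
let c = 2;

// Parsed as `a = b; ++c;` and not as `a = b++; c;`,
// because a postfix ++ may not be separated from its operand by a line break.
a = b
++c

console.log(a, b, c); // 1 1 3
```

If the code had been read as a = b++, b would have ended up as 2 and c would have stayed at 2.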
I could not understand those 3 rules in the specs too well -- hope to have something that is more plain English -- but here is what I gathered from JavaScript: The Definitive Guide, 6th Edition, David Flanagan, O'Reilly, 2011:
Quote:
JavaScript does not treat every line break as a semicolon: it usually treats line breaks as semicolons only if it can’t parse the code without the semicolons.
Another quote: for the code
var a
a
=
3
console.log(a)
JavaScript does not treat the second line break as a semicolon because it can continue parsing the longer statement a = 3;
and:
two exceptions to the general rule that JavaScript interprets line breaks as semicolons when it cannot parse the second line as a continuation of the statement on the first line. The first exception involves the return, break, and continue statements
... If a line break appears after any of these words ... JavaScript will always interpret that line break as a semicolon.
... The second exception involves the ++ and -- operators ... If you want to use either of these operators as postfix operators, they must appear on the same line as the expression they apply to. Otherwise, the line break will be treated as a semicolon, and the ++ or -- will be parsed as a prefix operator applied to the code that follows. Consider this code, for example:
x
++
y
It is parsed as x; ++y;, not as x++; y
So I think to simplify it, that means:
In general, JavaScript will treat a line break as a continuation of the code as long as that makes sense, with two exceptions: (1) after certain keywords such as return, break and continue, and (2) when it sees ++ or -- at the start of a new line. In both cases it adds the ; at the end of the previous line.
The part about "treat it as continuation of code as long as it makes sense" makes it feel like regular expression's greedy matching.
With the above said, that means for return with a line break, the JavaScript interpreter will insert a ;
(quoted again: If a line break appears after any of these words [such as return] ... JavaScript will always interpret that line break as a semicolon)
and due to this reason, the classic example of
return
{
foo: 1
}
will not work as expected, because the JavaScript interpreter will treat it as:
return; // returning nothing
{
foo: 1
}
There has to be no line-break immediately after the return:
return {
foo: 1
}
for it to work properly. And you may insert a ; yourself if you were to follow the rule of using a ; after any statement:
return {
foo: 1
};
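A small sketch of the difference, for anyone who wants to try it. The function names brokenReturn and workingReturn are made up for illustration and are not from the answer above.

```javascript
// Hypothetical function names, used only for this illustration.
function brokenReturn() {
  // ASI turns this into `return;`. The braces that follow are then parsed
  // as a block containing the label `foo:` and the expression statement `1`.
  return
  {
    foo: 1
  }
}

function workingReturn() {
  // The opening brace is on the same line, so an object literal is returned.
  return {
    foo: 1
  }
}

console.log(brokenReturn());                  // undefined
console.log(JSON.stringify(workingReturn())); // {"foo":1}
```

Note that the broken version is not a syntax error, which is part of why this mistake is easy to miss.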
Straight from the ECMA-262, Fifth Edition ECMAScript Specification:
7.9.1 Rules of Automatic Semicolon Insertion
There are three basic rules of semicolon insertion:
When, as the program is parsed from left to right, a token (called the offending token) is encountered that is not allowed by any production of the grammar, then a semicolon is automatically inserted before the offending token if one or more of the following conditions is true:
The offending token is separated from the previous token by at least one LineTerminator.
The offending token is }.
When, as the program is parsed from left to right, the end of the input stream of tokens is encountered and the parser is unable to parse the input token stream as a single complete ECMAScript Program, then a semicolon is automatically inserted at the end of the input stream.
When, as the program is parsed from left to right, a token is encountered that is allowed by some production of the grammar, but the production is a restricted production and the token would be the first token for a terminal or nonterminal immediately following the annotation "[no LineTerminator here]" within the restricted production (and therefore such a token is called a restricted token), and the restricted token is separated from the previous token by at least one LineTerminator, then a semicolon is automatically inserted before the restricted token.
However, there is an additional overriding condition on the preceding rules: a semicolon is never inserted automatically if the semicolon would then be parsed as an empty statement or if that semicolon would become one of the two semicolons in the header of a for statement (see 12.6.3).
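The overriding condition can be observed by asking the engine whether a snippet parses at all. The sketch below assumes an environment where the Function constructor is available (for example Node.js, or a browser page without a restrictive Content Security Policy); parses is a hypothetical helper name, not something from the specification.

```javascript
// Hypothetical helper: compiles `source` as a function body without running it
// and reports whether compilation succeeded.
function parses(source) {
  try {
    new Function(source);
    return true;
  } catch (err) {
    if (err instanceof SyntaxError) return false;
    throw err; // anything else is unexpected, so surface it
  }
}

// A semicolon after `if (true)` would become an empty statement, so none is inserted.
console.log(parses("if (true)\nelse {}"));             // false
console.log(parses("if (true);\nelse {}"));            // true

// Line breaks never stand in for the two semicolons of a for header.
console.log(parses("for (let i = 0\ni < 3\ni++) {}")); // false
console.log(parses("for (let i = 0; i < 3; i++) {}")); // true
```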
Regarding semicolon insertion and the var statement, beware forgetting the comma when using var but spanning multiple lines. Somebody found this in my code yesterday:
var srcRecords = src.records
srcIds = [];
It ran but the effect was that the srcIds declaration/assignment was global because the local declaration with var on the previous line no longer applied as that statement was considered finished due to automatic semi-colon insertion.
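A rough way to reproduce this, assuming non-strict code (in strict mode or in an ES module the assignment would throw a ReferenceError instead). The Function constructor is used here only because it compiles its body as non-strict code regardless of the surrounding file, and the array literal is a stand-in for src.records, which is not shown in the original.

```javascript
// The body mirrors the two lines above; the array literal replaces `src.records`.
const run = new Function(
  "var srcRecords = ['r1', 'r2']\n" + // ASI ends the var statement here
  "srcIds = [];\n"                    // plain assignment to an undeclared name
);

run();

console.log("srcIds" in globalThis);     // true: it leaked to the global object
console.log("srcRecords" in globalThis); // false: it stayed local to the function
```

Ending the first line with a comma instead makes srcIds part of the same var statement and keeps it local.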
The most contextual description of JavaScript's Automatic Semicolon Insertion I have found comes from the book Crafting Interpreters.
JavaScript’s “automatic semicolon insertion” rule is the odd one. Where other languages assume most newlines are meaningful and only a few should be ignored in multi-line statements, JS assumes the opposite. It treats all of your newlines as meaningless whitespace unless it encounters a parse error. If it does, it goes back and tries turning the previous newline into a semicolon to get something grammatically valid.
He goes on to describe it as you would a code smell.
This design note would turn into a design diatribe if I went into complete detail about how that even works, much less all the various ways that that is a bad idea. It’s a mess. JavaScript is the only language I know where many style guides demand explicit semicolons after every statement even though the language theoretically lets you elide them.
Just to add,
const foo = function(){ return "foo" } //this doesn't add a semicolon here.
(function (){
console.log("aa");
})()
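Because no semicolon is inserted after the first line, the immediately invoked function expression (IIFE) that follows is parsed as a call on the preceding function expression.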
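Here is the same snippet wrapped in a try/catch so the failure can be observed rather than crashing the script. The caught variable is an addition for illustration only.

```javascript
let caught = null; // illustrative variable, not part of the original snippet

try {
  const foo = function () { return "foo" } // no semicolon is inserted here
  (function () {
    console.log("aa"); // never runs
  })()
} catch (err) {
  caught = err;
}

// The first function is called with the second one as its argument and
// returns "foo"; the trailing () then tries to call that string.
console.log(caught instanceof TypeError); // true
```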
Most statements and declarations in JavaScript must be terminated with a semicolon. However, for the convenience of the programmer (less typing, stylistic preference, less code noise, lower barrier to entry), semicolons may be omitted in some source text locations, and the runtime inserts them automatically according to a set of rules set out in the spec.
Over-arching rules: a semicolon is never inserted automatically if the semicolon would then be parsed as an empty statement or if that semicolon would become one of the two semicolons in the header of a for statement.
Rule 1
A semicolon will be automatically inserted if the JavaScript parser encounters a token that would not be allowed unless a semicolon preceded it, and that token is separated from the previous token by one or more line terminators (e.g. newlines), is a closing brace }, or follows the final parenthesis, ), of a do-while loop.
In other words: source text locations where statements would always need to be terminated anyway for a runnable program, will have the statement terminator (;) inserted automatically if it is omitted. This rule is the heart of ASI.
Rule 2
A semicolon will be inserted at the end of the program if the source text is not otherwise a valid script or module. In other words: programmers can omit the final semicolon in a program.
Rule 3
A semicolon will be automatically inserted if a token is encountered that would normally be allowed if a semicolon did not exist, but exists within one of several special source text locations (restricted productions) that explicitly disallow line terminators within them for reasons of avoiding ambiguity.
The restricted productions inside of which line terminators are prohibited are:
before postfix ++ and postfix -- (so the unary increment/decrement operators after a newline will bind to the following (not previous) statement, as a prefix operator)
after continue, break, throw, return, yield
after arrow function parameter lists, and
after the async keyword in async function declarations & expressions, generator function declarations & expressions & methods, and async arrow functions
The spec contains the full details, plus the following practical advice:
The resulting practical advice to ECMAScript programmers is:
A postfix ++ or -- operator should be on the same line as its operand.
An Expression in a return or throw statement or an AssignmentExpression in a yield expression should start on the same line as the return, throw, or yield token.
A LabelIdentifier in a break or continue statement should be on the same line as the break or continue token.
The end of an arrow function's parameter(s) and its => should be on the same line.
The async token preceding an asynchronous function or method should be on the same line as the immediately following token.
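Two of these points lead to outright syntax errors rather than silent misbehaviour, which is easy to confirm. This is a sketch under the assumption that the Function constructor is available; isSyntaxError is a hypothetical helper name.

```javascript
// Hypothetical helper: true if `source` fails to compile as a function body.
function isSyntaxError(source) {
  try {
    new Function(source); // compiles only, the body is never executed
    return false;
  } catch (err) {
    if (err instanceof SyntaxError) return true;
    throw err;
  }
}

// throw must be followed by its expression on the same line.
console.log(isSyntaxError('throw\nnew Error("x")')); // true
console.log(isSyntaxError('throw new Error("x")'));  // false

// The => must be on the same line as the end of the parameter list;
// a line break after the => is fine.
console.log(isSyntaxError("const f = (a)\n=> a"));   // true
console.log(isSyntaxError("const f = (a) =>\na"));   // false
```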
And this is the best article on the Web on this subject.
ASI Gotcha Examples
Starting a line with `(`
The opening parenthesis character has multiple meanings. It can delineate an expression, or it can indicate an invocation (when paired with a closing parenthesis).
For example, the following throws "Uncaught TypeError: console.log(...) is not a function" because the runtime attempts to invoke the return value of console.log('bar'):
let a = 'foo'
console.log('bar')
(a = 'bam')
One solution for this, if you are generally omitting semicolons, is to include a semicolon to make your intentions unambiguous:
let a = 'foo'
console.log('bar')
;(a = 'bam') // note semicolon at start of line
Starting a line with `[`
The opening bracket character ([) has multiple meanings. It can indicate an object property access, or it can indicate the literal declaration of an array (when paired with a closing bracket), or it can indicate an array destructuring.
For example, the following throws "Uncaught TypeError: Cannot set properties of undefined (setting 'foo')" because the runtime attempts to set the value of a property named 'foo' on the return value of console.log('bar'):
let a = 'foo'
console.log('bar')
[a] = ['bam']
One solution for this, if you are generally omitting semicolons, is to include a semicolon to make your intentions unambiguous:
let a = 'foo'
console.log('bar')
;[a] = ['bam'] // note semicolon at start of line
