This is probably JS 101 but...
Can someone with a better understanding of JS engines explain to me why string literals, integers and so forth are 'ignored' by JS or treated as otherwise valid code?
JS Hint does give 'unexpected expression' reports however the code remains valid and runs.
I've created the following pen to, hopefully, explain what I mean here.
http://codepen.io/billythekid/pen/zyGbi/
var span = document.getElementsByTagName('SPAN')[0];
// let's show what I'm trying to say here by way of an expanded example.
function foo()
{
var bar = "something";
}
foo(); // does nothing useful, but returns nothing either - valid and understandable;
function baz()
{
return "nothing"; // a string
}
function zoo()
{
return 250; // an integer
}
var a = baz(); // the variable holds the return value. The function is evaluated and the return value is assigned to the variable.
span.innerHTML += a+"<br>";
baz(); // produces no error despite the function returning a value. Why doesn't the JS engine see this as the evaluated string "nothing" and try to run the string as a JS command?
span.innerHTML += "this code has run, so the JS didn't break above. Why wasn't the returned string parsed and invalid?<br>";
"something"; // the string literal
span.innerHTML += "this code has run, so the JS didn't break above. So why not? How is a string literal valid JS? Why no errors?<br>";
var b = zoo();
span.innerHTML += b+"<br>";
zoo();// produces no error despite the function returning a value. Why doesn't the JS engine see this as the evaluated integer 250 and try to run the string as a JS command?
span.innerHTML += "this code has run, so the JS didn't break above. So why not? How is an evaluated integer valid JS? Why no errors?<br>";
250; // the integer literal
span.innerHTML += "this code has run, so the JS didn't break above. So why not? How is an integer literal valid JS? Why no errors?<br>";
eval(250); // the integer literal
span.innerHTML += "this code has run, so the JS didn't break above. So why not? How is an evaluated integer literal valid JS? Why no errors?<br>";
eval("something"); // the string literal
span.innerHTML += "this code broke, it can't run a string that's not been defined as a function or whatever.<br>";
// and, had the previous code not broken...
non_func(); // doesn't exist!
span.innerHTML += "this code doesn't run! So it'll error out with a call to a function/variable that doesn't exist but not eval'd code that isn't JS, such as a string, or even the literals of these objects!<br>";
// It appears that anything not explicitly wrapped in an eval function is ignored by JS rather than throwing any errors. Why is this?
Simply running a string literal such as "foo"; as a line in the console seems to return itself.
Is the JS engine internally wrapping simple cases like these in some sort of 'noop' method, or internally garbage-collecting such things, or does it simply see the code as "run" once it's gone past it and has nothing more to do (such as assigning a value to a variable, or some other thing)?
I got to thinking about this when using a setInterval() call. If I assign its return value (well, its ID) to a var for use in a clearInterval later, it's valid, but it's also valid when we ignore the returned ID. The ID isn't "parsed" as JS.
Using strict mode seems to have no effect on this behaviour either.
Hopefully I've not made this more confusing than it needs to be. :oD
One big culprit behind your confusion is the C programming language. In it, many things that you think are statements, such as assignments, are actually expressions.
//This is valid C and Javascript code
x = (y += 1);
//We all have been bitten by this one once haven't we?
while(x = y){
}
In order to let such expressions stand on a line of their own, there is a rule in the language grammar that lets an expression followed by a semicolon count as a statement:
stmt :=
<if_then_else> |
<while loop> |
... |
<expression> ';'
The rule for evaluating these single-expression statements is that the expression is evaluated for side-effects and its value gets ignored.
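A minimal JavaScript sketch of that rule. The `calls` counter and the `next()` helper are hypothetical names, introduced only to make the side effect observable:

```javascript
// Hypothetical counter, used only to make side effects visible.
let calls = 0;

function next() {
  calls += 1;        // side effect
  return calls * 10; // value the caller may keep or discard
}

"just a string";     // expression statement: evaluated, value discarded
250;                 // same for a number literal
next();              // the call runs; its return value (10) is discarded
const kept = next(); // the call runs again; this time the value (20) is kept

console.log(calls, kept); // 2 20
```

Both calls ran, so the side effect happened twice, but only the second return value was kept.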
JavaScript (and lots of other languages) adopted lots of things from the C syntax, including this particular treatment of expressions. Using a string as a statement does nothing with its value (unless the string is "use strict", in which case it's a useful hack that does something in new browsers but nothing in the old ones). Using a function call as a statement runs it for side effects and ignores its return value. Some stricter languages complain here: Haskell can warn when a do block discards a non-unit result, and Go rejects a bare value such as 4 used as a statement (although it does let you ignore a call's return value). C and JavaScript will just throw the value away.
-- In Haskell you get a compiler warning for ignoring return values
do
functionThatReturnsNothing()
a <- functionThatReturnsAValue()
_ <- functionThatReturnsAValue() -- unless you ignore it explicitly
It appears that anything not explicitly wrapped in an eval function is ignored by JS rather than throwing any errors.
Thank god it's that way! It would be crazy error-prone if things got implicitly eval-ed like that, and strings should never be run as code unless you are explicit about it!
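To make the difference concrete, here is a small sketch of what eval does with the kinds of arguments used in the question. The `tryEval` helper and the identifier `notDefinedAnywhere` are made up for illustration, and the sketch assumes no variable with that name exists:

```javascript
// Hypothetical helper: runs eval and reports either the value or the error name.
function tryEval(source) {
  try {
    return { ok: true, value: eval(source) };
  } catch (e) {
    return { ok: false, error: e.name };
  }
}

const fromNumber = tryEval(250);                // not a string: returned unchanged
const fromLiteral = tryEval('"something"');     // source text of a string literal
const fromName = tryEval('notDefinedAnywhere'); // source text of an identifier

console.log(fromNumber);  // { ok: true, value: 250 }
console.log(fromLiteral); // { ok: true, value: 'something' }
console.log(fromName);    // { ok: false, error: 'ReferenceError' }
```

So eval("something") in the question fails because the string's contents are treated as source code that names an undefined variable, while a bare "something"; is only a value that gets discarded.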
This is a pretty common and useful practice:
// default via value
var un = undefined
var v1 = un || 1
// default via a function call
var myval = () => 1
var v2 = un || myval()
But it doesn't work (SyntaxError) when throwing an error:
var v3 = un || throw new Error('un is not set!')
Is there a way to achieve the same effect in a similarly elegant way?
This is IMHO a lot of boilerplate code:
if (!un) {
throw new Error('un is not set!')
}
var v3 = un
Or is there any theoretical obstruction, why this is not, and never will be, possible?
throw is a statement only; it may not exist in a position where an expression is required. For similar reasons, you can't put an if statement there, for example
var something = false || if (cond) { /* something */ }
is invalid syntax as well.
Only expressions (things that evaluate to a value) are permitted to be assigned to variables. If you want to throw, you have to throw as a statement, which means you can't put it on the right-hand side of an assignment.
I suppose one way would be to use an IIFE on the right-hand side of the ||, allowing you to use a statement on the first line of that function:
var un = undefined
var v2 = un || (() => { throw new Error('nope') })();
But that's pretty weird. I'd prefer the explicit if - throw.
Your problem is that an assignment expects an expression but you give it a statement
The Syntax for initializing/assigning a variable is:
var|let|const <variableName> = <expression>
but you use
var|let|const <variableName> = <statement>
which is invalid Syntax.
Expressions
An expression is something that produces a value.
What is a "value"?
A value is anything that has a type in JavaScript, for example:
Numbers
Strings
Booleans
Objects
Arrays (which are themselves objects)
Symbols
Examples for Expressions:
Literals
var x = 5;
x is assigned the value 5
A function call
var x = myFunc();
myFunc() produces a value that is assigned to x
The produced value of a function is its return value - A function always returns, and if it doesn't explicitly, it returns undefined.
Functions have the added benefit of being able to contain statements in their body - Which will be the solution to your question - But more on that later.
Statements
A statement is something that performs an action. For Example:
A loop
for (var i = 0; i < 10; i++) { /* loop body */ }
This loop performs the action of executing the loop body 10 times
Throwing an error
throw new Error()
Unwinds the stack and stops the execution of the current frame
So why can't we mix both?
When you want to assign to a variable, you want an expression because you want the variable to have a value.
If you think about it, it should be clear that it will never work with a statement. Giving a variable an "action" is nonsense. What is that even supposed to mean?
Therefore you cannot use the throw statement since it does not produce a value.
You can only have one or the other.
Either you are (expression) something or you do (statement) something.
A fix
You can convert any statement into an expression by wrapping it in a function. I suggest using an IIFE (immediately invoked function expression), basically a function that is called as soon as it is defined, to do just that:
var x = 5 || (() => { throw new Error() })()
This works because the right side is now a function call, and a function call is an expression that produces a value. The value would be undefined in this case, but since throwing stops execution, it doesn't matter anyway.
Future Possibilities
Technically there is nothing that prevents this from working.
Many languages (C++, ...) actually already treat throw as an expression. Some (Kotlin, ...) go further and also treat constructs such as if and try as expressions.
Others (C#, PHP, ...) provide operators like the ?? null-coalescing or ?. null-conditional (sometimes called "Elvis") operator to solve this very use case.
Maybe in the future one of those features will make it into the ECMAScript standard (there is even an open proposal to include this). Until then, your best bet is to use a function like:
function assertPresent(value, message)
{
if(!value) {
throw new Error(message);
} else {
return value;
}
}
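A short usage sketch for a helper of this shape. The variable names and messages are illustrative only, and the function is repeated (with the redundant else dropped) so the example runs on its own:

```javascript
function assertPresent(value, message) {
  if (!value) {
    throw new Error(message);
  }
  return value;
}

const un = undefined;

// A truthy value passes straight through, so the call works inside an assignment.
const v1 = assertPresent('configured', 'value is not set!');

// A missing value throws instead of being assigned.
let caught = null;
try {
  assertPresent(un, 'un is not set!');
} catch (e) {
  caught = e.message;
}

// Caveat: the !value test also rejects legitimate falsy values such as 0 or ''.
let zeroRejected = false;
try {
  assertPresent(0, 'zero counts as missing');
} catch (e) {
  zeroRejected = true;
}

console.log(v1, caught, zeroRejected); // configured un is not set! true
```

If 0, '' or false are valid inputs in your code, a check such as value === undefined || value === null may be a better fit than !value.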
You could move the throwing of the exception into a function, because throw is a statement of control flow, and not an expression:
An expression is any valid unit of code that resolves to a value.
const throwError = function (e) { throw new Error(e); };
var un = undefined,
v3 = un || throwError('un is not set!');
As other answers have stated, it is because throw is a statement, which can't be used in contexts which expect expressions, such as on the right side of a ||. As stated by others, you can get around that by wrapping the exception in a function and immediately calling it, but I'm going to make the case that doing so is a bad idea because it makes your intent less clear. Three extra lines of code is not a big deal for making the intent of your code very clear and explicit. I personally think that throw being statement-only is a good thing because it encourages writing more straightforward code that is less likely to cause other developers to scratch their heads when encountering your code.
The || defaulting idiom is useful when you want to provide default or alternative values for undefined, null, and other falsy values, but I think it loses a lot of its clarity when used in a branching sense. By "branching sense", I mean that if your intent is to do something if a condition holds (the doing something in this case being throwing an exception), then condition || do_something() is really not a clear way to express that intent even though it is functionally identical to if (!condition) {do_something()}. Short-circuit evaluation isn't immediately obvious to every developer and || defaulting is only understood because it's a commonly-used idiom in Javascript.
My general rule of thumb is that if a function has side effects (and yes, exceptions count as side effects, especially since they're basically non-local goto statements), you should use an if statement for its condition rather than || or &&. You're not golfing.
Bottom line: which is going to cause less confusion?
return value || (() => {throw new Error('an error occurred')})()
or
if (!value) {
throw new Error('an error occurred')
}
return value
It's usually worth it to sacrifice terseness for clarity.
Like others have said the problem is that throw is a statement and not an expression.
There is however really no need for this dichotomy. There are languages where everything is an expression (no statements) and they're not "inferior" because of this; it simplifies both syntax and semantics (e.g. you don't need separate if statements and the ternary operator ?:).
Actually this is just one of the many reasons for which Javascript (the language) kind of sucks, despite Javascript (the runtime environment) being amazing.
A simple work-around (that can be used also in other languages with a similar limitation like Python) is:
function error(x) { throw Error(x); }
then you can simply write
let x = y.parent || error("No parent");
There is some complexity in having throw as an expression in statically typed languages: what should the static type of x() ? y() : throw(z) be? For example, C++ has a very special rule for handling a throw expression in the ternary operator (the type is taken from the other branch, even if formally throw x is considered an expression of type void).
I found a typo in one of my javascript files where there was an integer just sitting on a line on its own with nothing else. The file never errored out because of it so I was wondering, how does javascript handle or evaluate a line with just an integer on it and nothing else 'behind the scenes'?
The following is a valid statement:
4;
Of course, it would make more sense to do something like var num = 4;, but what you're doing above evaluates the same expression; you're just not saving its value to a variable.
You can even have completely empty statements. The following is a valid, empty statement:
;
So you could have a program that looks like this:
var num = 4;
4;
;
Each of those lines would be valid.
Well, "5" is code that evaluates to 5. You can try it out by opening Chrome Dev Tools (press F12), then opening the console and entering a number. It will return the number, so it is a valid piece of code.
You can write any expression on a line by itself; in fact, the grammar calls this an ExpressionStatement. So the line 4; is a statement that is evaluated but never stores its value. You can even leave out the semicolon in most cases because of Automatic Semicolon Insertion.
These statements get evaluated, but then the resulting value is ignored, meaning that lone number or string literals will not have any side effect (like 1+2+3;). However, calling functions or accessing variables or fields can have side effects (like 1+a()+b;, where a gets accessed and called, and b gets accessed).
But there are special cases, like "use strict";, which old engines just skip over as it is just a StringLiteral inside an ExpressionStatement, but modern browsers notice this line, and switch to strict mode.
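The difference between a discarded value and a side effect can be sketched as follows. The `log` array, the function `a` and the `holder` object with its getter `b` are hypothetical stand-ins for "calling functions or accessing fields":

```javascript
// Hypothetical log, used only to record which side effects happened.
const log = [];

function a() {
  log.push('a called');
  return 2;
}

const holder = {
  // A getter makes a plain property read observable.
  get b() {
    log.push('b read');
    return 3;
  }
};

1 + 2 + 3;          // evaluated to 6, value discarded, nothing else happens
1 + a() + holder.b; // also evaluates to 6 and is discarded, but a() ran and b was read

console.log(log); // [ 'a called', 'b read' ]
```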
Why do I get the following error...
Uncaught TypeError: object is not a function
...at the following line, of a certain JS script?
(function($){
And why do I get that error only when JS are concatenated? (I'm using Gulp)
And why does it work if I add ; before that line, like that:
;(function($){
?
update
The preceding line - that is, the object which is not a function, according to the runtime error - in the concatenated script was a }, as in:
storage = {
//...
}
I'm used to always putting a semicolon, but not after curly braces.
It turns out that a closing curly brace can also end a statement, as in this case, and then it's recommended to add the semicolon to avoid this error. Here's a good explanation.
JavaScript tolerates a missing semicolon and tries to work out where the statement ends. If you leave the semicolon out, it looks at the next line to decide whether to end the statement there or to continue it.
That allow you to use thing like this :
String
.split();
and it will be interpreted like that :
String.split();
But, this would also work :
String
.split
();
Now, If you have something like this :
var a = 'a';
var b = a
(function(){})
JavaScript has no way to know what you really want to do, so it will interpret it like that :
var a = 'a';
var b = a(function(){});
Giving you the error [place object type here] is not a function
Bottom line, always put your semi-colon.
Edit
After seeing your code, here is how it is interpreted:
storage = {/**/}(function($){})(jQuery);
So an Object (the value of the {...} literal) is not a function.
When concatenated, the engine believes you're trying to call whatever precedes the (function($) {...}).
If you put () after a reference it tries to call whatever the reference is. This is why you'll see a lot of JavaScript libraries precede their code with a lone ;
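The concatenation problem can be reproduced in a few lines. This is only a sketch: `run`, `fileA` and `fileB` are hypothetical names, and new Function stands in for the build tool's concatenated output:

```javascript
// Hypothetical helper: compiles and runs source text, returning 'ok' or the error name.
function run(source) {
  try {
    new Function(source)();
    return 'ok';
  } catch (e) {
    return e.name;
  }
}

const fileA = 'var storage = {}';   // ends with } and no semicolon
const fileB = '(function () {})()'; // starts with (

// Parsed as: var storage = {}(function () {})()
const plain = run(fileA + '\n' + fileB);

// The leading ; ends the previous statement before the IIFE starts.
const guarded = run(fileA + '\n;' + fileB);

console.log(plain, guarded); // TypeError ok
```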
Looking at these articles from Mozilla's JavaScript guide:
Expressions
Statements
assignments are also considered expressions. In fact, in the second article one can read "any expression is also a statement". Being acquainted with other programming languages, I thought that expressions are always values, but they never cause side effects like statements would do. In other words, 7, 7 + 8, "string", etc., are expressions, because they don't change the state, but a = 7 is a statement, since a variable has now been defined (i.e. a state has changed).
Why would Mozilla not differentiate between the two in JS?
I believe you are taking the terms "expression" and "statement" too literally. "Expressions not changing any state" is a very tough requirement for a programming language.
A thought experiment: In 7 + 8 substitute 8 with a function call to
var globalVar = 0;
function my8() {
globalVar = globalVar + 1;
return 8;
}
Is 7 + my8() a statement or an expression? There is no obvious state change happening here, but still my8 performs a state change. Using the "no side-effects" definition it would be impossible to decide if 7 + my8() is a statement or an expression without analyzing the code of the my8 function. Of course it would be possible to simply prohibit any state change as part of a function call, but that is not the way of JavaScript.
In my experience most languages define "everything which returns a value" as an expression and a statement, everything else as just a statement.
To answer your question "Why would Mozilla not differentiate between the two in JS?":
I think they do, but not in the manner you expected. To consider "everything which returns a value" an expression seems to be the most practical approach.
Also there is no contradiction between a chunk of code being a statement and an expression at the same time. That is simply how Javascript and many other languages work. Of course it is always possible to draw a more strict line between those two.
Examples:
Assignments return values, so this is possible:
a = b = c = 1;
It can be written in the more obvious form:
a = (b = (c = 1));
Because of that an assignment is considered an expression (and also a statement).
On the other hand:
if (true) { };
does not return a value (in Javascript!) and therefore is no expression (but still a statement).
An expression is a code fragment that returns some value, Expression (Computer Science):
3; // 3
({}); // Object (parenthesized, because a bare {} at the start of a statement is parsed as an empty block)
func(); // whatever func returns, or undefined if not specified
You can combine expressions into one compound expression:
3 + 7; // 10
({}, []); // Array. Comma operator always returns result of right-most expression
A statement is the smallest valid code fragment that can be compiled or interpreted, Statement (Computer Science):
5; // valid js
You can also combine statements into compound statements:
check || func(); // valid js
{
4 + 9;
"block statement";
}
In the Mozilla documentation, a statement refers to any (compound) statement that is explicitly or implicitly terminated by semi-colon (;).
[,,[],[,[,,],,]]; // Array declaration whose reference is returned (and ignored)
// Multi-dimensional array with empty (undefined) elements
In some programming languages the above example doesn't compile or doesn't get interpreted. Other languages might not allow the result of an expression to go uncaptured.
Javascript is very expressive, which is why every expression counts as a valid statement. Some statements are not expressions, like break, return, while, etc. They don't return any value, but they control the program execution flow.
Mozilla does differentiate between the two, or rather the Javascript syntax does.
The only slightly "special" thing about JavaScript is the following:
"any expression is also a statement",
which means that at places where a statement is required in the syntax, an expression can be used directly (but not the other way around). E.g. the following is valid Javascript but invalid in many other similar languages:
if (true) "asfd"
or
foo = function(){
if (5) {
"some text here that won't do anything";
return true;
42; // always good to have that one here!
}
}
whereas statements cannot be used as expressions:
a = (if (true) 5) // does not work "unexpected token 'if'"
They used that "feature" for the strict mode specification without introducing a new keyword or syntax - if you add the expression "use strict" as the first statement in a function body, Javascript is executed in strict mode in supporting browsers.
While expressions evaluate to a value, usually, statements do not. Most statements alter control flow, expressions usually don't (although one could argue that an expression that results in an exception being thrown alters control flow, too).
In Javascript expressions form a subset of all statements.
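The one-way relationship (an expression may stand where a statement is expected, but not the reverse) can be checked without running anything. In this sketch, `parses` is a hypothetical helper that only compiles the given source text with new Function:

```javascript
// Hypothetical helper: true if the source text parses as a function body.
function parses(source) {
  try {
    new Function(source); // compiles only; the code is never run
    return true;
  } catch (e) {
    if (e instanceof SyntaxError) return false;
    throw e;
  }
}

const exprAsStatement = parses('if (true) "asfd"');      // expression used as a statement
const stmtAsExpression = parses('var a = (if (true) 5)'); // statement used as an expression
const wrappedInFunction = parses('var a = (function () { if (true) return 5; })()');

console.log(exprAsStatement, stmtAsExpression, wrappedInFunction); // true false true
```

The last line shows the usual workaround: wrapping the statement in a function makes the call an expression.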
var a = [1, 2, 3, 4];
var b = [10, 20, 30, 40];
console.log([a, b].length)
[a, b].some(function(x) {
x.push(x.shift())
});
I was extremely surprised today when this code caused
[a,b].some(function(x){ x.push(x.shift()) });
^
TypeError: Cannot call method 'some' of undefined
Obviously the JavaScript 'auto semicolon insertion' is not working as expected here. But why?
I know you might recommend to use ; everywhere to avoid something like that, but the question is not about whether it is better to use ; or not. I would love to know what exactly happens here?
When I'm worried about semicolon insertion, I think about what the lines in question would look like without any whitespace between them. In your case, that would be:
console.log([a,b].length)[a,b].some(function(x){ etc });
Here you're telling the Javascript engine to call console.log with the length of [a,b], then to look at index [a,b] of the result of that call.
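In the asker's environment console.log evidently returned a value on which the property named by the value of b is undefined, so the call to undefined.some() fails. (In current browsers and Node.js, console.log returns undefined, so the property lookup itself would throw a TypeError first.)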
It's interesting to note that str[a,b] will resolve to str[b] assuming str is a string. As Kamil points out, a,b is a valid Javascript expression, and the result of that expression is simply b.
In general, implicit semicolons can easily fail when a line starts with an array literal, because the brackets are interpreted as a property access on the value of the expression on the previous line.
JavaScript only treats a new line as the end of a statement if continuing the statement past that new line would cause a parse error. See What are the rules for JavaScript's automatic semicolon insertion (ASI)? and EcmaScript 5 spec for the exact rules. (Thanks to Rob W and limelights)
What happens is the following:
The code gets interpreted as
console.log([a,b].length)[a,b].some(function(x){ x.push(x.shift()) });
i.e. all as one statement.
Now parse the statement:
some is called on the value of console.log([a,b].length)[a,b]
the value of console.log([a,b].length)[a,b] is computed by taking the returned value of console.log([a,b].length) and then trying to access the property with the name of the value of a,b.
a,b evaluates to the value of b (try it in your console). The returned value has no property with that name, so the resulting value is undefined. (Where console.log itself returns undefined, as in current engines, this property access throws a TypeError instead.)
There's no method some on undefined, hence the error.
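The two mechanisms at work here (the comma operator inside brackets, and a line starting with [ continuing the previous one) can be seen in isolation. This sketch deliberately uses String(2) instead of console.log so that the result does not depend on what console.log returns; the variable names are illustrative:

```javascript
// The comma operator yields its right-most operand, so this is 'xyz'[2].
const viaComma = 'xyz'[0, 2];

// No semicolon after String(2): the [ on the next line continues the
// expression, so this is String(2)[1, 0].length, i.e. '2'[0].length.
const joined = String(2)
[1, 0].length;

// With the semicolon, the second line is a separate (discarded) expression.
const separate = String(2);
[1, 0].length;

console.log(viaComma, joined, separate); // z 1 2
```

Here joined is the number 1 and separate is the string '2', which is the same kind of silent reinterpretation that produced the error in the question.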
JavaScript doesn't treat every line break as a semicolon. It usually treats line breaks as semicolons only if it can't parse the code without the semicolons. Basically, JavaScript treats a line break as a semicolon if the next non-space character cannot be interpreted as a continuation of the current statement. JavaScript - The Definitive Guide: 6th Ed. section 2.4
So, in your case, it is interpreting the line as something like
console.log([a,b].length)[a,b].some(function(x){ x.push(x.shift()) });
And that is the reason for error. JavaScript is trying to perform array-access on the results of console.log([a,b].length). Depending on the JavaScript engine and the return value of console.log, you might get different errors.
If it is the last statement of the function or flow, you can avoid ';' but it is recommended to put ';' at the end of the each statement to avoid such error.