I found a typo in one of my javascript files where there was an integer just sitting on a line on its own with nothing else. The file never errored out because of it so I was wondering, how does javascript handle or evaluate a line with just an integer on it and nothing else 'behind the scenes'?
The following is a valid statement:
4;
Of course, it would make more sense to do something like var num = 4;, but the line above is evaluated in the same way; you're just not saving its value to a variable.
You can even have completely empty statements. The following is a valid, empty statement:
;
So you could have a program that looks like this:
var num = 4;
4;
;
Each of those lines would be valid.
Well, "5" is a piece of code that evaluates to 5. You can try it out by opening Chrome DevTools (press F12), switching to the console, and entering a number. The console echoes the number back, so it is a valid piece of code.
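The console behaviour can be approximated in a script with eval, which hands back the completion value of the last statement it ran. This is only a rough sketch: the alias run and the variable names are made up for illustration, and a real console does more than this.

```javascript
// Calling eval through an alias makes it an "indirect" eval: the text is run
// as a global script and the completion value of its last statement is returned.
const run = (0, eval);

const bareNumber = run("4;");            // an expression statement
const emptyStatement = run(";");         // an empty statement produces no value
const declaration = run("var num = 4;"); // a declaration produces no value either

console.log(bareNumber);     // 4
console.log(emptyStatement); // undefined
console.log(declaration);    // undefined
```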
You can write almost any expression on a line; in the grammar this is called an ExpressionStatement. So the line 4; is a statement that is evaluated, but its value is never stored. You can even leave out the semicolon in most cases because of Automatic Semicolon Insertion.
These statements get evaluated, but the resulting value is ignored, so statements made only of number or string literals have no side effects (like 1+2+3;). However, calling functions or accessing variables or fields can have side effects (like 1+a()+b;, where a gets accessed and called, and b gets accessed).
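To make the difference visible, here is a small sketch. The counters calls and reads, and the object named scope with its getter, are invented for the example; the getter is just a way to make a plain property read observable.

```javascript
let calls = 0;
let reads = 0;

function a() {
  calls += 1; // side effect: the call leaves a trace
  return 10;
}

const scope = {
  get b() {   // a getter makes a plain property read observable
    reads += 1;
    return 20;
  },
};

1 + 2 + 3;         // evaluated, result thrown away, nothing changes
1 + a() + scope.b; // result is thrown away too, but a() ran and b was read

console.log(calls, reads); // 1 1
```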
But there are special cases, like "use strict";, which old engines just skip over as it is just a StringLiteral inside an ExpressionStatement, but modern engines notice this line when it appears at the very start of a script or function body, and switch to strict mode.
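A rough way to watch the directive take effect is shown below. It assumes the Function constructor is acceptable for a demo: functions built that way are non-strict unless their own body opts in, which keeps the comparison fair no matter how the surrounding file is loaded. The names plain, strict and tooLate are made up.

```javascript
// Each body checks whether `this` is undefined in a plain call,
// which is only the case in strict mode.
const plain = new Function("return this === undefined;");
const strict = new Function('"use strict"; return this === undefined;');
const tooLate = new Function('0; "use strict"; return this === undefined;');

console.log(plain());   // false: `this` falls back to the global object
console.log(strict());  // true: the directive switched the body to strict mode
console.log(tooLate()); // false: the string comes after another statement,
                        // so it is an ordinary, ignored expression statement
```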
var test = 1
console.log(test)
Try running this simple code. It gives error: ReferenceError: test is not defined, despite the fact that I defined that variable. Why is this happening?
In the variable declaration, the variable name contains a zero-width non-joiner (ZWNJ) character (between e and s), which is invisible, because its width is equal to zero. However, the ECMAScript specification allows this character as a part of a variable name.
However, in the console.log() call, there is just test, without any special characters. Therefore, it throws a ReferenceError, because the variable name is te<ZWNJ>st, not test.
Fortunately, there's an easy way to check if a variable name contains such characters. You can paste your code into JS Bin or JS Fiddle; they denote these characters with a white dot on a red background.
I think there are also similar features in some IDEs.
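If no such tool is at hand, a few lines of code can do a similar check. This is only a sketch: findInvisibleCharacters is a hypothetical helper, and the character list it scans for is a small hand-picked set of zero-width code points, not an exhaustive one.

```javascript
// Hypothetical helper: reports zero-width characters hiding in source text.
function findInvisibleCharacters(source) {
  const invisible = /[\u200B\u200C\u200D\u2060\uFEFF]/g;
  const hits = [];
  for (const match of source.matchAll(invisible)) {
    const hex = match[0].codePointAt(0).toString(16).toUpperCase().padStart(4, "0");
    hits.push({ index: match.index, codePoint: "U+" + hex });
  }
  return hits;
}

const suspicious = "var te\u200Cst = 1"; // ZWNJ between "e" and "s"
const clean = "var test = 1";

console.log(findInvisibleCharacters(suspicious)); // one hit: index 6, U+200C
console.log(findInvisibleCharacters(clean));      // no hits
```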
Side note: this is an interesting way to prevent people from copy pasting the code snippets you use in answers into their own code. Consider the following code snippet:
// Warning: non-copy-pastable, it won't work if you copy it into your code.
function add(a, b) {
  return a + b
}
console.log(add(2, 3))
There's a ZWNJ character in the function name and the function call, so it works here. However, if someone copied the function into their code and then manually typed console.log(add(3, 4)), it would throw ReferenceError: add is not defined.
Please don't take the above seriously, it's rather a joke than a practical use.
Related
What characters are valid for JavaScript variable names?
No visible cause for “Unexpected token ILLEGAL”
This is probably JS 101 but...
Can someone with a better understanding of JS engines explain to me why string literals, integers and so forth are 'ignored' by JS or treated as otherwise valid code?
JS Hint does give 'unexpected expression' reports however the code remains valid and runs.
I've created the following pen to, hopefully, explain what I mean here.
http://codepen.io/billythekid/pen/zyGbi/
var span = document.getElementsByTagName('SPAN')[0];
// let's show what I'm trying to say here by way of an expanded example.
function foo()
{
  var bar = "something";
}
foo(); // does nothing useful, but returns nothing either - valid and understandable;
function baz()
{
  return "nothing"; // a string
}
function zoo()
{
  return 250; // an integer
}
var a = baz(); // the variable holds the return value. The function is evaluated and the return value is assigned to the variable.
span.innerHTML += a+"<br>";
baz(); // produces no error despite the function returning a value. Why doesn't the JS engine see this as the evaluated string "nothing" and try to run the string as a JS command?
span.innerHTML += "this code has run, so the JS didn't break above. Why wasn't the returned string parsed and invalid?<br>";
"something"; // the string literal
span.innerHTML += "this code has run, so the JS didn't break above. So why not? How is a string literal valid JS? Why no errors?<br>";
var b = zoo();
span.innerHTML += b+"<br>";
zoo();// produces no error despite the function returning a value. Why doesn't the JS engine see this as the evaluated integer 250 and try to run the string as a JS command?
span.innerHTML += "this code has run, so the JS didn't break above. So why not? How is an evaluated integer valid JS? Why no errors?<br>";
250; // the integer literal
span.innerHTML += "this code has run, so the JS didn't break above. So why not? How is an integer literal valid JS? Why no errors?<br>";
eval(250); // the integer literal
span.innerHTML += "this code has run, so the JS didn't break above. So why not? How is an evaluated integer literal valid JS? Why no errors?<br>";
eval("something"); // the string literal
span.innerHTML += "this code broke, it can't run a string that's not been defined as a function or whatever.<br>";
// and, had the previous code not broken...
non_func(); // doesn't exist!
span.innerHTML += "this code doesn't run! So it'll error out with a call to a function/variable that doesn't exist but not eval'd code that isn't JS, such as a string, or even the literals of these objects!<br>";
// It appears that anything not explicitly wrapped in an eval function is ignored by JS rather than throwing any errors. Why is this?
Simply running a string literal such as "foo"; as a line in the console seems to return itself.
Is the JS internally wrapping simple cases like these in some sort of 'noop' method or internally garbage-collecting such things, or does it simply see the code as "run" once it's gone past it and has nothing more to do (such as assign values to a variable, or some other thing)?
I got to thinking about this when using a setInterval() call. If I assign its return value (well, its ID) to a var for use in a clearInterval later, it's valid, but it's also valid when we ignore the returned ID. The ID isn't "parsed" as JS.
Using strict mode seems to have no effect on this behaviour either.
Hopefully I've not made this more confusing than it needs to be. :oD
One big culprit behind your confusion is the C programming language. In it, many things that you think are statements, such as assignments, are actually expressions.
//This is valid C and Javascript code
x = (y += 1);
//We all have been bitten by this one once haven't we?
while(x = y){
}
In order to let these expressions be used on lines of their own, there is a rule in the language grammar that lets an expression followed by a semicolon count as a statement:
stmt :=
<if_then_else> |
<while loop> |
... |
<expression> ';'
The rule for evaluating these single-expression statements is that the expression is evaluated for side-effects and its value gets ignored.
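Here is how the "assignment is an expression" point plays out in JavaScript. It is a minimal sketch with made-up variable names, showing the classic slip of writing = where a comparison was intended.

```javascript
let x = 5;
let y = 0;
let branchTaken = false;

// Meant to be a comparison (x === y), but this assigns y to x and then
// tests the assigned value, which is 0 and therefore falsy.
if (x = y) {
  branchTaken = true;
}

const z = (y += 1); // a compound assignment also yields the new value

console.log(x, branchTaken, z); // 0 false 1
```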
JavaScript (and lots of other languages) adopted lots of things from the C syntax, including this particular treatment of expressions. Using a string as a statement does nothing with its value (unless the string is "use strict", in which case it's a useful hack that does something in new browsers but nothing in the old ones). Using a function call as a statement runs it for side effects and ignores its return value. Some stricter languages, such as Haskell or Go, will complain about at least some discarded values, but C and JavaScript will just throw them away.
-- In Haskell you get a compiler warning for ignoring return values
do
functionThatReturnsNothing()
a <- functionThatReturnsAValue()
_ <- functionThatReturnsAValue() -- unless you ignore it explicitly
It appears that anything not explicitly wrapped in an eval function is ignored by JS rather than throwing any errors.
Thank god it's that way! It would be crazy error prone if things got implicitly eval-ed like that, and strings should never be run as code unless you are explicit about it!
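A tiny sketch of that distinction, with invented names (ran, command): a string that happens to look like code stays a plain value until something explicitly evaluates it.

```javascript
let ran = false;

function command() {
  return "ran = true"; // looks like code, but it is only a string value
}

command();                  // the returned string is discarded, nothing is executed
const afterPlainCall = ran;

eval(command());            // only an explicit eval treats the string as source text
const afterEval = ran;

console.log(afterPlainCall, afterEval); // false true
```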
Looking at these articles from Mozilla's JavaScript guide:
Expressions
Statements
expressions are also considered assignment statements. In fact, in the second article one can read "any expression is also a statement". Being acquainted with other programming languages, I thought that expressions are always values, but they never cause side effects like statements would do. In other words, 7, 7 + 8, "string", etc., are expressions, because they don't change the state, but a = 7 is a statement, since a variable has now been defined (i.e. a state has changed).
Why would Mozilla not differentiate between the two in JS?
I believe you are taking the terms "expression" and "statement" too literally. "Expressions not changing any state" is a very tough requirement for a programming language.
A thought experiment: In 7 + 8 substitute 8 with a function call to
var globalVar = 0;
function my8() {
  globalVar = globalVar + 1;
  return 8;
}
Is 7 + my8() a statement or an expression? There is no obvious state change happening here, but still my8 performs a state change. Using the "no side-effects" definition it would be impossible to decide if 7 + my8() is a statement or an expression without analyzing the code of the my8 function. Of course it would be possible to simply prohibit any state change as part of a function call, but that is not the way of JavaScript.
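Running the thought experiment makes the point concrete. The sketch below reuses my8 and globalVar from above; the name sum is added only for the demonstration.

```javascript
let globalVar = 0;

function my8() {
  globalVar = globalVar + 1; // hidden state change
  return 8;
}

const sum = 7 + my8(); // used as an expression: the value is kept
7 + my8();             // used as a statement: the value is dropped

console.log(sum, globalVar); // 15 2
```

Both uses change the state, so "has a side effect" cannot be what separates the two categories.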
In my experience most languages define "everything which returns a value" as an expression and a statement, everything else as just a statement.
To answer your question "Why would Mozilla not differentiate between the two in JS?":
I think they do, but not in the manner you expected. To consider "everything which returns a value" an expression seems to be the most practical approach.
Also there is no contradiction between a chunk of code being a statement and an expression at the same time. That is simply how Javascript and many other languages work. Of course it is always possible to draw a more strict line between those two.
Examples:
Assignments return values, so this is possible:
a = b = c = 1;
It can be written in the more obvious form:
a = (b = (c = 1));
Because of that an assignment is considered an expression (and also a statement).
On the other hand:
if (true) { };
does not return a value (in JavaScript!) and is therefore not an expression (but still a statement).
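Because an assignment has a value, it can be used wherever a value is expected. A short sketch (the names chained and describe are made up):

```javascript
let a, b, c;

// The whole chain is a single expression; each assignment passes its
// value on to the one on its left.
const chained = (a = b = c = 1);

// An assignment can even be handed to a function as an argument.
function describe(value) {
  return "got " + value;
}
const message = describe(a = 5);

console.log(chained, b, c); // 1 1 1
console.log(a, message);    // 5 got 5
```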
An expression is a code fragment that returns some value, Expression (Computer Science):
3; // 3
({}); // Object (parenthesized, because a statement starting with { is parsed as a block)
func(); // whatever func returns, or undefined if not specified
You can combine expressions into one compound expression:
3 + 7; // 10
({}, []); // Array. Comma operator always returns result of right-most expression
A statement is the smallest valid code fragment that can be compiled or interpreted, Statement (Computer Science):
5; // valid js
You can also combine statements into compound statements:
check || func(); // valid js
{
4 + 9;
"block statement";
}
In the Mozilla documentation, a statement refers to any (compound) statement that is explicitly or implicitly terminated by semi-colon (;).
[,,[],[,[,,],,]]; // Array declaration whose reference is returned (and ignored)
// Multi-dimensional array with empty (undefined) elements
In some programming languages the above example doesn't compile or doesn't get interpreted. Other languages might not allow the result of an expression to go unused.
Javascript is very expressive, which is why every expression counts as a valid statement. Some statements are not expressions, like break, return, while, etc. They don't return any value, but they control the program execution flow.
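One caveat when trying fragments like these: at the start of a statement, { opens a block rather than an object literal, so an object expression needs parentheses. The sketch below uses an indirect eval (aliased as run, a name chosen for the example) purely to parse small snippets and look at what they produce.

```javascript
const run = (0, eval); // indirect eval, used only to evaluate small snippets

const asBlock = run("{}");           // statement position: an empty block, no value
const asObject = run("({})");        // parentheses force an expression: an object
const commaResult = run("({}, [])"); // comma operator keeps the right operand
const sparse = [,,[],[,[,,],,]];     // the nested array from the text above

let bareCommaError = null;
try {
  run("{}, [];"); // an empty block followed by a stray comma
} catch (error) {
  bareCommaError = error.name;
}

console.log(asBlock);                    // undefined
console.log(typeof asObject);            // object
console.log(Array.isArray(commaResult)); // true
console.log(bareCommaError);             // SyntaxError
console.log(sparse.length);              // 4
```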
Mozilla does differentiate between the two, or rather the Javascript syntax does.
The only slightly "special" thing about JavaScript is the following:
"any expression is also a statement",
which means that at places where a statement is required in the syntax, an expression can be used directly (but not the other way around). E.g. the following is valid Javascript but invalid in many other similar languages:
if (true) "asfd"
or
foo = function(){
  if (5) {
    "some text here that won't do anything";
    return true;
    42; // always good to have that one here!
  }
}
whereas statements cannot be used as expressions:
a = (if (true) 5) // does not work "unexpected token 'if'"
They used that "feature" for the strict mode specification without introducing a new keyword or syntax - if you add the expression "use strict" as the first statement in a function body, Javascript is executed in strict mode in supporting browsers.
While expressions evaluate to a value, usually, statements do not. Most statements alter control flow, expressions usually don't (although one could argue that an expression that results in an exception being thrown alters control flow, too).
In Javascript expressions form a subset of all statements.
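The one-way relationship can be checked mechanically. In this sketch, parses is a hypothetical helper that asks the engine whether a snippet is syntactically valid as a function body, without ever running it.

```javascript
// Hypothetical helper: true if the source parses as a function body.
function parses(source) {
  try {
    new Function(source); // only parses the body; the function is never called
    return true;
  } catch (error) {
    if (error instanceof SyntaxError) return false;
    throw error;
  }
}

console.log(parses('if (true) "asfd"'));           // true: expression used as a statement
console.log(parses("a = (if (true) 5)"));          // false: statement used as an expression
console.log(parses("a = (true ? 5 : undefined)")); // true: the conditional operator is an expression
```

When a value is needed from a condition, the conditional operator is the expression-level counterpart of if.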
var a = [1, 2, 3, 4];
var b = [10, 20, 30, 40];
console.log([a, b].length)
[a, b].some(function(x) {
  x.push(x.shift())
});
I was extremely surprised today when this code caused
[a,b].some(function(x){ x.push(x.shift()) });
^
TypeError: Cannot call method 'some' of undefined
Obviously the JavaScript 'auto semicolon insertion' is not working as expected here. But why?
I know you might recommend to use ; everywhere to avoid something like that, but the question is not about whether it is better to use ; or not. I would love to know what exactly happens here?
When I'm worried about semicolon insertion, I think about what the lines in question would look like without any whitespace between them. In your case, that would be:
console.log([a,b].length)[a,b].some(function(x){ etc });
Here you're telling the Javascript engine to call console.log with the length of [a,b], then to look at index [a,b] of the result of that call.
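In most environments console.log returns undefined, in which case the property lookup itself throws a TypeError. The error message in the question suggests that there the call returned some other value, so the code looked up the property named by b on that value, got undefined, and the call to undefined.some() failed.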
It's interesting to note that str[a,b] will resolve to str[b] assuming str is a string. As Kamil points out, a,b is a valid Javascript expression, and the result of that expression is simply b.
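A quick check of the comma behaviour inside brackets, using throwaway values for str, a and b:

```javascript
const str = "hello";
const a = 0;
const b = 1;

// Inside the brackets, `a, b` is a comma expression whose value is b,
// so both lookups read the character at index 1.
console.log(str[a, b]); // e
console.log(str[b]);    // e
```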
In general, one could say that implicit semicolons can easily fail when a line starts with an array literal, because the opening [ is interpreted as a property access on the value of the expression on the previous line.
In general, JavaScript only treats a line break as the end of a statement if continuing the statement past that line break would cause a parse error. See What are the rules for JavaScript's automatic semicolon insertion (ASI)? and the ECMAScript 5 spec for the exact rules. (Thanks to Rob W and limelights)
What happens is the following:
The code gets interpreted as
console.log([a,b].length)[a,b].some(function(x){ x.push(x.shift()) });
i.e. all as one statement.
Now parse the statement:
some is called on the value of console.log([a,b].length)[a,b]
the value of console.log([a,b].length)[a,b] is computed by taking the returned value of console.log([a,b].length) (undefined) and then trying to access the property with the name of the value of a,b.
a,b evaluates to the value of b (try it in your console). undefined has no properties at all, so in a modern engine this lookup already throws a TypeError; if console.log returns some other value instead, the lookup simply yields undefined.
There's no method some on undefined, hence the error.
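The steps above can be observed without a crash by swapping console.log for a stand-in that returns an object. This is only a sketch: probe is a hypothetical function, and its return value is rigged so the accidental property access finds something.

```javascript
const a = [1, 2, 3, 4];
const b = [10, 20, 30, 40];

// Hypothetical stand-in for console.log. The key is what the array b turns
// into when it is used as a property name.
function probe() {
  return { "10,20,30,40": "read as a property" };
}

// No semicolon after the call, so the next line continues the same statement:
// probe(...)[a, b]  ->  probe(...)[b]  ->  probe(...)["10,20,30,40"]
const result = probe([a, b].length)
[a, b];

console.log(result); // read as a property
```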
JavaScript doesn't treat every line break as a semicolon. It usually treats line breaks as semicolons only if it can’t parse the code without the semicolons. Basically, JavaScript treats a line break as a semicolon if the next non-space character cannot be interpreted as a continuation of the current statement. (JavaScript: The Definitive Guide, 6th Ed., section 2.4)
So, in your case, it is interpreting the line as something like
console.log([a,b].length)[a,b].some(function(x){ x.push(x.shift()) });
And that is the reason for error. JavaScript is trying to perform array-access on the results of console.log([a,b].length). Depending on the JavaScript engine and the return value of console.log, you might get different errors.
If a statement is the last one in a function or block, you can omit the ';', but it is recommended to put a ';' at the end of each statement to avoid errors like this one.
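For comparison, here is a sketch of the same kind of code with an explicit terminator before the line that starts with [. The name pairCount and the use of forEach instead of some are choices made for the example.

```javascript
const a = [1, 2, 3, 4];
const b = [10, 20, 30, 40];

const pairCount = [a, b].length; // explicit semicolon ends the statement here
[a, b].forEach(function (x) {
  x.push(x.shift());             // rotate each array by one position
});

console.log(pairCount, a, b); // 2 [ 2, 3, 4, 1 ] [ 20, 30, 40, 10 ]
```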
It's almost midnight and a question just popped into my head: is a "for loop" a statement or a function?
I always thought it is a statement, but I did a google search on it being a function and there are indeed results for that. So what is it? And in that case what is the difference between function and statement?
A for loop is not usually a function; it is a special kind of statement called a flow control structure.
A statement is a command. It does something. In most languages, statements do not return values. Example:
print "Hello World"
A function is a subroutine that can be called elsewhere in the program. Functions often (but not necessarily) return values. Example:
function(a) { return a * 2 }
A control structure, also known as a compound statement, is a statement that is used to direct the flow of execution. Examples:
if (condition) then { branch_1 } else { branch_2 }
for (i = 0; i < 10; i += 1) { ... }
Also worth noting is that an expression is a piece of code that evaluates to a value. Example:
2 + 2
All examples are in pseudocode, not tied to any particular language. Also note that these are not exclusive categories, they can overlap.
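To tie the categories to a concrete language, here is a JavaScript sketch (variable names invented) that puts the for statement next to the built-in forEach function on arrays.

```javascript
const items = [1, 2, 3];

// `for` is a statement: it is syntax, it has no value, and it cannot be
// stored in a variable or passed to another function.
let total = 0;
for (let i = 0; i < items.length; i += 1) {
  total += items[i];
}

// `forEach` is a function: an ordinary value that gets called and returns
// something (here, undefined).
let totalViaFunction = 0;
const returned = items.forEach(function (item) {
  totalViaFunction += item;
});

console.log(total, totalViaFunction);        // 6 6
console.log(typeof items.forEach, returned); // function undefined
```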
Out of the three language tags you've chosen, I'm only very familiar with Python, but I believe many other languages have a similar view of these concepts. All the example code here is Python.
A statement is a thing that is executed; an "instruction to do something" that the language implementation understands. e.g.
print "Hello World"
pass
def foo(n):
    return n + 1
if condition:
    print 'yay'
else:
    print 'doh'
The above block contains a print statement, a pass statement, a function definition statement, and an if/else statement. Note that the function definition and the if/else statement are compound statements; they contain other statements (possibly many of them, and possibly other compound statements).
An expression is something that can be evaluated to produce a value. e.g.
1
"foo"
2 * 6
function(argument)
None
The above contains a numeric literal expression, a string literal expression, an expression involving numeric operators, a function call expression, and the literal None expression. Other than literals and variables, expressions are made up of other expressions. In function(argument), function and argument are also both expressions.
The key difference is that statements are instructions that tell the language implementation to "go do something". Expressions are evaluated to a value (which possibly requires the language implementation to "go do something" on the way).
A consequence of this is that anywhere you see a value (including an expression), you could substitute any other expression and you would still get something that makes some sort of sense. It may fail to compile, or throw exceptions at runtime, or whatever, but on at least some level you can understand what's going on.
A statement can never appear inside an expression (I believe this is not true in Ruby and Javascript in some sense, as they allow literal code blocks and functions which are then used as a value as a whole, and functions and code blocks contain statements; but that's kind of different from what I'm talking about). An expression must have a value (even if it's an uninteresting one like None). A statement is a command; it doesn't make sense for it to appear as part of an expression, because it has no value.
Many languages also allow expressions to be used as statements. The usual meaning of this is "evaluate this expression to get a value, then throw it away". In Python, functions that always return None are usually used this way:
write_to_disk(result)
It's used as a "command", so it looks like a statement, but technically it's an expression, we just don't use the value it evaluates to for anything. You can argue that a "bare expression" is one of the possible statements in a language (and they're often parsed that way).
Some languages though distinguish between functions that must be used like statements with no return value (often called procedures) and functions that are used like an expression, and give you errors or warnings for using a function like a statement, and definitely give you an error for using a procedure as an expression.
So, if foo is an expression, I can write 1 + foo and while it may result in a type error, it at least makes that much sense. If foo is a statement, then 1 + foo is usually a parse error; the language implementation won't even be able to understand what you're trying to say.
A function on the other hand, is a thing you can call. It's not really either an expression or a statement in itself. In Python, you use a def statement to create a function, and a function call is an expression. The name bound to the function after you create it is also an expression. But the function itself is a value, which isn't exactly an expression when you get technical, but certainly isn't a statement.
So, for loops. This is a for loop in Python:
for thing in collection:
    do_stuff(thing)
Looks like a statement (a compound statement, like an if statement). And to prove it, this is utterly meaningless (and a parse error):
1 + for thing in collection:
    do_stuff(thing)
In some languages though, the equivalent of a for loop is an expression, and has a value, to which you can attempt to add 1. In some it's even a function, not special syntax baked into the language.
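As an illustration of the "loop as a function" idea, here is a JavaScript sketch rather than Python. forEachItem is a hypothetical helper, and returning the number of visited items is an arbitrary choice made so that the call has a value worth adding 1 to.

```javascript
// Hypothetical helper: a loop written as a function that returns how many
// items it visited, so the "loop" can sit inside a larger expression.
function forEachItem(collection, body) {
  let visited = 0;
  for (const thing of collection) {
    body(thing);
    visited += 1;
  }
  return visited;
}

const seen = [];
const result = 1 + forEachItem(["a", "b", "c"], (thing) => seen.push(thing));

console.log(result, seen); // 4 [ 'a', 'b', 'c' ]
```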
This answer is relevant to Python 2.7.2. Taken from the python tutorial:
"4. More Control Flow Tools
4.2. for Statements:
The for statement in Python differs a bit from what you may be used to in C or Pascal. Rather than always iterating over an arithmetic progression of numbers (like in Pascal), or giving the user the ability to define both the iteration step and halting condition (as C), Python’s for statement iterates over the items of any sequence (a list or a string), in the order that they appear in the sequence."