Accessing object key with dot notation and space - javascript

I came across the following:
var object = {};
object.name = 'ABC';
console.log(object. name); // this is still valid
Notice the space after object.
Why is this valid? Is there any ECMA specification for this?
Same is true for all the inherited properties for different data types.
I've tested this on a node terminal.
Thanks!

Why is this valid?
Because whitespace is largely (though not entirely) irrelevant in the JavaScript syntax. You can safely insert any whitespace other than a line break between two tokens (and in most, but not all cases, you can insert line breaks as well; the "most" is due to ASI). You can't insert spaces within tokens (because it breaks them up into two tokens), but you can between tokens.
As Federico klez Culloca points out, . is an operator, just like + or *. That we generally put spaces around those but not around . is simply convention.
These are all valid:
console.log(object.name);
console.log(
object.name
);
console.log(
object . name
);
console.log(
object
.
name
);
Is there any ECMA specification for this?
Of course, the specification itself. Specifically here and here. From that last link:
Input elements other than white space and comments form the terminal symbols for the syntactic grammar for ECMAScript and are called ECMAScript tokens. These tokens are the reserved words, identifiers, literals, and punctuators of the ECMAScript language. Moreover, line terminators, although not considered to be tokens, also become part of the stream of input elements and guide the process of automatic semicolon insertion (11.9). Simple white space and single-line comments are discarded and do not appear in the stream of input elements for the syntactic grammar.
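The rule quoted above can be checked with a small sketch. The helper name parses is hypothetical (it is not part of any library); it only asks the engine whether a snippet compiles, using the standard Function constructor, and never runs the snippet.

```javascript
// Hypothetical helper: true if `source` parses as a function body.
function parses(source) {
  try {
    new Function(source); // compiles the body only; it is never executed
    return true;
  } catch (err) {
    if (err instanceof SyntaxError) return false;
    throw err; // anything else is unexpected, so let it surface
  }
}

const results = {
  // White space between the tokens `object`, `.` and `name` is discarded.
  spaceAfterDot: parses("var object = {}; object. name;"),
  // Line terminators between these particular tokens are fine as well.
  newlinesAroundDot: parses("var object = {}; object\n.\nname;"),
  // A comment between tokens behaves like white space.
  commentBetweenTokens: parses("var object = {}; object /* gap */ . name;"),
  // A space inside an identifier splits it into two tokens: `obj ect`.
  spaceInsideIdentifier: parses("var object = {}; obj ect.name;"),
};

console.log(results);
```

Only the last entry should come out false, since the space there falls inside a token rather than between two tokens.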

Related

[Nearley]: how to parse matching opening and closing tag

I'm trying to parse a very simple language with nearley: you can put a string between matching opening and closing tags, and you can chain some tags. It looks like a kind of XML, but with [ instead of <, with tags always 2 chars long, and without nesting.
[aa]My text[/aa][ab]Another Text[/ab]
But I can't seem to parse this correctly: as soon as I have more than one tag, I get ambiguous results, although the grammar should be unambiguous.
The grammar that I have right now:
#builtin "string.ne"
#builtin "whitespace.ne"
openAndCloseTag[X] -> "[" $X "]" string "[/" $X "]"
languages -> openAndCloseTag[[a-zA-Z] [a-zA-Z]] (_ openAndCloseTag[[a-zA-Z] [a-zA-Z]]):*
string -> sstrchar:* {% (d) => d[0].join("") %}
Relatedly, I would ideally like the tags to be case insensitive (e.g. [bc]TESt[/BC] would be valid).
Does anyone have any idea how we can do that? I wasn't able to find a nearley XML parser example.
Your language is almost too simple to need a parser generator. And at the same time, it is not context-free, which makes it difficult to use a parser generator. So it is quite possible that the Nearley parser is not the best tool for you, although it is probably possible to make it work with a bit of hackery.
First things first. You have not actually provided an unambiguous definition of your language, which is why your parser reports an ambiguity. To see the ambiguity, consider the input
[aa]My text[/ab][ab]Another Text[/aa]
That's very similar to your test input; all I did was swap a pair of letters. Now, here's the question: Is that a valid input consisting of a single aa tag? Or is it a syntax error? (That's a serious question. Some definitions of tagging systems like this consider a tag to only be closed by a matching close tag, so that things which look like different tags are considered to be plain text. Such systems would accept the input as a single tagged value.)
The problem is that you define string as sstrchar:*, and if we look at the definition of sstrchar in string.ne, we see (leaving out the postprocessing actions, which are irrelevant):
sstrchar -> [^\\'\n]
| "\\" strescape
| "\\'"
Now, the first possibility is "any character other than a backslash, a single quote or a newline", and it's easy to see that all of the characters in [/ab] are in sstrchar. (It's not clear to me why you chose sstrchar; single quotes don't appear to be special in your language. Or perhaps you just didn't mention their significance.) So a string could extend up to the end of the input. Of course, the syntax requires a closing tag, and the Nearley parser is determined to find a match if there is one. But, in fact, there are two of them. So the parser declares an ambiguity, since it doesn't have any criterion to choose between the two close tags.
And here's where we come up against the issue that your language is not context-free. (Actually, it is context-free in some technical sense, because there are "only" 676 two-letter case-insensitive tags, and it would theoretically be possible to list all 676 possibilities. But I'm guessing you don't want to do that.)
A context-free grammar cannot express a language that insists that two non-terminals expand to the same string. That's the very definition of context-free: if one non-terminal can only match the same input as a previous non-terminal, then the second non-terminal's match is dependent on the context, specifically on the match produced by the first non-terminal. In a context-free grammar, a non-terminal expands to the same thing, regardless of the rest of the text. The context in which the non-terminal appears is not allowed to influence the expansion.
Now, you quite possibly expected that your macro definition:
openAndCloseTag[X] -> "[" $X "]" string "[/" $X "]"
is expressing a context-sensitive match by repeating the $X macro parameter. But it is not by accident that the Nearley documentation describes this construct as a macro. X here refers exactly to the string used in the macro invocation. So when you say:
openAndCloseTag[[a-zA-Z] [a-zA-Z]]
Nearley macro-expands that to
"[" [a-zA-Z] [a-zA-Z] "]" string "[/" [a-zA-Z] [a-zA-Z] "]"
and that's what it will use as the grammar production. Observe that the two $X macro parameters were expanded to the same argument, but that doesn't mean that they will match the same input text. Each of those subpatterns will independently match any two alphabetic characters. Context-freely.
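To make the point about textual substitution concrete, here is a rough JavaScript sketch. The function expandMacro is a hypothetical illustration of plain text replacement, not Nearley's actual implementation, and the regular expression is only an analogue of the expanded rule, not something Nearley generates.

```javascript
// Hypothetical illustration: replace every "$X" in the macro body
// with the literal argument text. Not Nearley's real code.
function expandMacro(body, paramName, argument) {
  return body.split("$" + paramName).join(argument);
}

const body = '"[" $X "]" string "[/" $X "]"';
const expanded = expandMacro(body, "X", "[a-zA-Z] [a-zA-Z]");
console.log(expanded);

// Regex analogue of the expanded rule: the opening and closing tag
// names are two independent character-class pairs, so nothing forces
// them to match the same letters.
const independent = /^\[[a-zA-Z]{2}\].*\[\/[a-zA-Z]{2}\]$/;
const acceptsMismatch = independent.test("[aa]My text[/ab]");
console.log(acceptsMismatch);
```

The mismatched tags are accepted because each copy of the argument matches on its own, which is the behaviour described above.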
As I alluded to earlier, you could use this macro to write out the 676 possible tag patterns:
tag -> openAndCloseTag["aa"i]
| openAndCloseTag["ab"i]
| openAndCloseTag["ac"i]
| ...
| openAndCloseTag["zz"i]
If you did that (and you managed to correctly list all of the possibilities) then the parser would not complain about ambiguity as long as you never use the same tag twice in the same input. So it would be ok with both your original input and my altered input (as long as you accept the interpretation that my input is a single tagged object). But it would still report the following as ambiguous:
[aa]My text[/aa][aa]Another Text[/aa]
That's ambiguous because the grammar allows it to be either a single aa tagged string (whose text includes characters which look like close and open tags) or two consecutive aa tagged strings.
To eliminate the ambiguity you would have to write the string pattern in a way which does not permit internal tags, in the same way that sstrchar doesn't allow internal single quotes. Except, of course, that matching a string which doesn't contain a pattern is not nearly as simple as matching a string which doesn't contain a single character. It could be done using Nearley, but I really don't think that it's what you want.
Probably your best bet is to use native Javascript regular expressions to match tagged strings. This will prove simpler because Javascript regular expressions are much more powerful than mathematical regular expressions, even allowing the possibility of matching (certain) context-sensitive constructions. You could, for example, use Javascript regular expressions with the Moo lexer, which integrates well into Nearley. Or you could just use the regular expressions directly, since once you match the tagged text, there isn't much else you need to do.
To get you started, here's a simple Javascript regular expression which matches tagged strings with matching case-insensitive labels (the i flag at the end):
/\[([a-zA-Z]{2})\].*?\[\/\1\]/gmi
You can play with it online using Regex 101
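As a sketch of using the regular expression directly, the following function collects each tagged string from an input. The name parseTagged and the extra capture group around the text are my own additions for illustration, and I dropped the m flag because the pattern has no anchors. Note that this sketch does not check that the whole input is consumed, so stray text between tags is silently skipped.

```javascript
// Hypothetical helper: returns one { tag, text } record per tagged string.
function parseTagged(input) {
  // \1 must repeat the opening tag name; the i flag makes that
  // comparison case-insensitive, and .*? stops at the first close tag.
  const pattern = /\[([a-zA-Z]{2})\](.*?)\[\/\1\]/gi;
  return Array.from(input.matchAll(pattern), (match) => ({
    tag: match[1].toLowerCase(), // normalise so [bc] and [BC] compare equal
    text: match[2],
  }));
}

const chained = parseTagged("[aa]My text[/aa][ab]Another Text[/ab]");
const mixedCase = parseTagged("[bc]TESt[/BC]");
const swapped = parseTagged("[aa]My text[/ab][ab]Another Text[/aa]");

console.log(chained);
console.log(mixedCase);
console.log(swapped);
```

With the swapped input, this sketch takes the interpretation in which a tag is only closed by its matching close tag, so it yields a single aa record whose text contains the inner [/ab][ab] characters.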

Why does a number inside parentheses have methods, but a number outside parentheses does not? [duplicate]

If I try to write
3.toFixed(5)
there is a syntax error. Using double dots, putting in a space, putting the three in parentheses or using bracket notation allows it to work properly.
3..toFixed(5)
3 .toFixed(5)
(3).toFixed(5)
3["toFixed"](5)
Why doesn't the single dot notation work and which one of these alternatives should I use instead?
The period is part of the number, so the code will be interpreted the same as:
(3.)toFixed(5)
This will naturally give a syntax error, as you can't immediately follow the number with an identifier.
Any method that keeps the period from being interpreted as part of the number would work. I think that the clearest way is to put parentheses around the number:
(3).toFixed(5)
You can't access it because of a flaw in JavaScript's tokenizer. Javascript tries to parse the dot notation on a number as a floating point literal, so you can't follow it with a property or method:
2.toString(); // raises SyntaxError
As you mentioned, there are a couple of workarounds which can be used in order to make number literals act as objects too. Any of these is equally valid.
2..toString(); // the second point is correctly recognized
2 .toString(); // note the space to the left of the dot
(2).toString(); // 2 is evaluated first
To understand more behind object usage and properties, check out the Javascript Garden.
It doesn't work because JavaScript interprets the 3. as being either the start of a floating-point constant (such as 3.5) or else an entire floating-point constant (with 3. == 3.0), so you can't follow it by an identifier (in your case, a property-name). It fails to recognize that you intended the 3 and the . to be two separate tokens.
Any of your workarounds looks fine to me.
This is an ambiguity in the Javascript grammar. When the parser has got some digits and then encounters a dot, it has a choice between "NumberLiteral" (like 3.5) or "MemberExpression" (like 3.foo). I guess this ambiguity cannot be resolved by lookahead because of scientific notation: should 3.e2 be interpreted as 300 or as a property e2 of 3? Therefore they deliberately decided to prefer NumberLiterals here, just because there's actually not very much demand for things like 3.foo.
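The 3.e2 case can be tried directly. This is just a small sketch; the variable names are mine.

```javascript
// `3.` followed by the exponent part `e2` is a single numeric literal.
const viaExponent = 3.e2;

// With two dots, the first belongs to the literal `3.` and the second
// is a property access, so this looks up a property named e2 on 3.
const viaProperty = 3..e2;

console.log(viaExponent); // 300
console.log(viaProperty); // undefined
```

So the engine reads 3.e2 as a number, which is the preference for numeric literals described above.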
As others have mentioned, the JavaScript parser interprets the dot after an integer literal as a decimal point, and hence it won't invoke methods or properties on the Number object.
To explicitly tell the JS parser to invoke properties or methods on integer literals, you can use any of the options below:
Two Dot Notation
3..toFixed()
Separating with a space
3 .toFixed()
Write integer as a decimal
3.0.toFixed()
Enclose in parentheses
(3).toFixed()
Assign to a constant or variable
const nbr = 3;
nbr.toFixed()
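The alternatives listed in these answers can be compared side by side. In this sketch, parses is a hypothetical helper that only asks the engine whether a snippet compiles (via the standard Function constructor), so that the failing form can be shown without breaking the script itself.

```javascript
// Hypothetical helper: true if `source` parses as a function body.
function parses(source) {
  try {
    new Function(source); // compile only, never run
    return true;
  } catch (err) {
    if (err instanceof SyntaxError) return false;
    throw err;
  }
}

// The single-dot form does not even parse.
const singleDotParses = parses("3.toFixed(5)");

// Every workaround calls the same method on the same number.
const nbr = 3;
const outputs = [
  3..toFixed(5),    // two dots
  3 .toFixed(5),    // space before the dot
  3.0.toFixed(5),   // written as a decimal
  (3).toFixed(5),   // parentheses
  3["toFixed"](5),  // bracket notation
  nbr.toFixed(5),   // via a constant
];

console.log(singleDotParses); // false
console.log(outputs);         // six copies of "3.00000"
```

Since all of the forms behave identically at run time, the choice between them is a matter of readability.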

In What Standard is it Made Official That JSON Object Property Names Must Be Double-Quoted?

Checked here:
http://www.json.org/
and here:
http://www.ietf.org/rfc/rfc4627.txt?number=4627
All I'm seeing is that names must be strings, not that they can't use single-quotes. Don't get me wrong, I'm on board with quoted names for JSON. It protects devs from using property names that aren't legit JS variable names and also powerful use of JSON frequently puts traditional values in property-names for things like map-reduction of 2D arrays modeling tables.
I also think it would make sense for all names to consistently use one or the other quote-type in order to avoid assumptions one might make while trying to parse JSON in some language that doesn't have convenient JSON parsing libraries/native-methods coming out of its pores, but I don't see anything in these specs that insists it must be double or single.
All I see in the second link is that they must be strings. Where is it established that they must be double-quoted, as a lot of the JSON validators seem to think? Is there another source? If so, who owns JSON town? I'm feeling like a raggedy-man who lost his way.
From page 4 of the RFC that you link to:
string = quotation-mark *char quotation-mark
...
quotation-mark = %x22 ; "
Property names must be strings, but strings must be delimited with quotation marks (not apostrophes).
See also the diagram of a string on json.org. Note that it starts and ends with " and has no branches that would allow ' as an alternative.
Taken from the http://www.json.org site:
Strings are defined as a character sequence enclosed in double quotes.
In the RFC:
string = quotation-mark *char quotation-mark
where
quotation-mark = %x22 ; "
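The built-in JSON.parse follows this grammar, which gives a quick way to see the rule in action. The wrapper tryParse below is a hypothetical helper that just turns the thrown SyntaxError into a value.

```javascript
// Hypothetical helper: report whether `text` is accepted by JSON.parse.
function tryParse(text) {
  try {
    return { ok: true, value: JSON.parse(text) };
  } catch (err) {
    if (err instanceof SyntaxError) return { ok: false };
    throw err;
  }
}

const doubleQuoted = tryParse('{"name": "ABC"}'); // names delimited by %x22
const singleQuoted = tryParse("{'name': 'ABC'}"); // apostrophes are not quotation-marks
const unquoted = tryParse('{name: "ABC"}');       // a bare name is not a string at all

console.log(doubleQuoted.ok, singleQuoted.ok, unquoted.ok); // true false false
```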

Why are some object-literal properties quoted and others not? [duplicate]

I see this all the time: object literals declared such that some keys are surrounded with quotes and others are not. An example from jQuery 1.4.2:
jQuery.props = {
"for": "htmlFor",
"class": "className",
readonly: "readOnly",
maxlength: "maxLength",
cellspacing: "cellSpacing",
rowspan: "rowSpan",
colspan: "colSpan",
tabindex: "tabIndex",
usemap: "useMap",
frameborder: "frameBorder"
};
What is the significance of wrapping the first two property keys (for and class) with quotes, while leaving the others quote-less? Are there any differences at all?
I've been poking around the ECMAScript 5 specification; all I've been able to find is [Note 6 of Section 15.12.3, emphasis mine]:
NOTE 6 An object is rendered as an opening left brace followed by zero or more properties, separated with commas, closed with a right brace. A property is a quoted String representing the key or property name, a colon, and then the stringified property value. An array is rendered as an opening left bracket followed by zero or more values, separated with commas, closed with a right bracket.
However, this refers only to the stringification of JSON.
Those are Javascript reserved words, and (though not really necessary) the syntax of the language, as of ECMAScript 3, requires that they be quoted. ECMAScript 5 and later also accept reserved words as unquoted property names.
Strictly speaking, pure "JSON" notation requires that all of the "key" strings be quoted. Javascript itself however is OK with keys that are valid identifiers (but not reserved words) being unquoted.
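A short sketch of the difference, assuming an ECMAScript 5 or later engine (older ECMAScript 3 engines reject the unquoted reserved words in the last object). The variable names are mine.

```javascript
// Quoted and unquoted keys create exactly the same property.
const quoted = { "rowspan": "rowSpan" };
const unquoted = { rowspan: "rowSpan" };
const sameShape = JSON.stringify(quoted) === JSON.stringify(unquoted);

// Keys that are not valid identifiers must be quoted.
const needsQuotes = { "data-id": 1, "two words": 2 };

// JSON output always quotes the key, whatever the source looked like.
const asJson = JSON.stringify(unquoted);

// Assumption: ES5+ engine. Reserved words are allowed unquoted here.
const reserved = { for: "htmlFor", class: "className" };

console.log(sameShape);                      // true
console.log(asJson);                         // {"rowspan":"rowSpan"}
console.log(needsQuotes["data-id"]);         // 1
console.log(reserved.for, reserved["class"]); // htmlFor className
```

Quoting for and class, as jQuery 1.4.2 did, keeps the literal valid on engines that predate ES5.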
There is a reason at this point (two plus years later) to quote object literal properties. If one wants to minify their code using the Closure Compiler they may need to make the properties accessible to other source files. In that case, they will want to avoid having symbols renamed by the compiler. By quoting the property name, the Closure Compiler will not minify (rename) them.
See: Removal of code you want to keep
(This applies to at least the ADVANCED_OPTIMIZATIONS setting.)
Javascript language keywords or reserved keywords are always surrounded by quotes in there.
for and class are language keywords. Your interpreter would throw a SyntaxError when those are unquoted.
See section 7.6.1.1 in the Spec you linked to.
Javascript has a lot of reserved words that are not actually used by the language, which I think were reserved for possible future use. class was one of these at the time of writing, when Javascript did not yet have classes (class syntax arrived in ECMAScript 2015). Another is goto, and there's absolutely no chance of that ever being used. The result, however, is that if you want to use these as an object-literal key then it has to be quoted. Strictly speaking you should probably always quote your keys just to avoid the possibility of falling foul of the Javascript unused reserved word trap (mind you, I never do).
