Does this grammar define right-associativity by right recursion - javascript

I'm reading this article about adding the ** operator to the ECMAScript spec, where the author states the following:
Exponentiation must be evaluated before multiplication and more
importantly, the BNF grammar must be written such that the operator’s
right-associativity is clearly defined (unlike
MultiplicativeExpression, which is left-associative).
And he defines the new non-terminal ExponentiationExpression symbol in the grammar as:
ExponentiationExpression :
UnaryExpression[?Yield]
UnaryExpression[?Yield] ** ExponentiationExpression[?Yield]
MultiplicativeExpression[Yield] :
ExponentiationExpression[?Yield]
MultiplicativeExpression[?Yield] MultiplicativeOperator ExponentiationExpression[?Yield]
This article states that:
To write a grammar that correctly expresses operator associativity:
For left associativity, use left recursion.
For right associativity,
use right recursion.
It seems that he follows that rule and defines the associativity by using right recursion for the ExponentiationExpression here:
ExponentiationExpression -> UnaryExpression[?Yield] ** ExponentiationExpression[?Yield]
Am I right?

Am I right?
Yes.
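To see why right recursion gives right associativity, here is a minimal recursive-descent sketch. It is not the spec's algorithm: the function names are made up for illustration, and the only tokens assumed are numbers and "**". Because the rule recurses on its right-hand side, everything after the first ** is parsed as one operand, so the tree nests to the right.

```javascript
// Minimal sketch: a right-recursive rule yields a right-nested tree.
// Assumption: tokens are numeric strings and "**" only (the real
// UnaryExpression is far richer than a bare number).
function parseExponentiation(tokens, pos = 0) {
  // ExponentiationExpression : UnaryExpression
  const left = { type: "num", value: Number(tokens[pos]) };
  if (tokens[pos + 1] !== "**") {
    return { node: left, next: pos + 1 };
  }
  // ExponentiationExpression : UnaryExpression ** ExponentiationExpression
  // The recursive call consumes the whole rest of the chain first.
  const right = parseExponentiation(tokens, pos + 2);
  return { node: { type: "pow", left, right: right.node }, next: right.next };
}

// Evaluate the tree bottom-up.
function evaluate(node) {
  return node.type === "num"
    ? node.value
    : evaluate(node.left) ** evaluate(node.right);
}

const tree = parseExponentiation(["2", "**", "3", "**", "2"]).node;
console.log(evaluate(tree));   // 512, i.e. 2 ** (3 ** 2)
console.log(2 ** 3 ** 2);      // 512, the built-in operator groups the same way
console.log((2 ** 3) ** 2);    // 64, what left associativity would give
```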

Related

How to make a chained comparison in Jison (or Bison)

I'm working on an expression parser made in Jison, which supports basic things like arithmetic, comparisons etc. I want to allow chained comparisons like 1 < a < 10 and x == y != z. I've already implemented the logic needed to compare multiple values, but I'm struggling with the grammar: Jison keeps grouping the comparisons like (1 < a) < 10 or x == (y != z) and I can't make it recognize the whole thing as one relation.
This is roughly the grammar I have:
expressions = e EOF
e = Number
| e + e
| e - e
| Relation %prec '=='
| ...
Relation = e RelationalOperator Relation %prec 'CHAINED'
| e RelationalOperator Relation %prec 'NONCHAINED'
RelationalOperator = '==' | '!=' | ...
(Sorry, I don't know the actual Bison syntax, I use JSON. Here's the entire source.)
The operator precedence is roughly: NONCHAINED, ==, CHAINED, + and -.
I have an action set up on e → Relation, so I need that Relation to match the whole chained comparison, not only a part of it. I tried many things, including tweaking the precedence and changing the right-recursive e RelationalOperator Relation to a left-recursive Relation RelationalOperator e, but nothing worked so far. Either the parser matches only the smallest Relation possible, or it warns me that the grammar is ambiguous.
If you decided to experiment with the program, cloning it and running these commands will get you started:
git checkout develop
yarn
yarn test
There are basically two relatively easy solutions to this problem:
Use a cascading grammar instead of precedence declarations.
This makes it relatively easy to write a grammar for chained comparison, and does not really complicate the grammar for binary operators nor for tight-binding unary operators.
You'll find examples of cascading grammars all over the place, including most programming languages. A reasonably complete example is seen in this grammar for C expressions (just look at the grammar up to constant_expression:).
One of the advantages of cascading grammars is that they let you group operators at the same precedence level into a single non-terminal, as you try to do with comparison operators and as the linked C grammar does with assignment operators. That doesn't work with precedence declarations because precedence can't "see through" a unit production; the actual token has to be visibly part of the rule with declared precedence.
Another advantage is that if you have specific parsing needs for chained operators, you can just write the rule for the chained operators accordingly; you don't have to worry about it interfering with the rest of the grammar.
However, cascading grammars don't really get unary operators right, unless the unary operators are all at the top of the precedence hierarchy. This can be seen in Python, which uses a cascading grammar and has several unary operators low in the precedence hierarchy, such as the not operator, leading to the following oddity:
>>> if False == not True: print("All is well")
File "<stdin>", line 1
if False == not True: print("All is well")
^
SyntaxError: invalid syntax
That's a syntax error because == has higher precedence than not. The cascading grammar only allows an expression to appear as the operand of an operator with lower precedence than any operator in the expression, which means that the expression not True cannot be the operand of ==. (The precedence ordering allows not a == b to be grouped as not (a == b).) That prohibition is arguably ridiculous, since False == not True has no possible interpretation other than False == (not True), and by forbidding that interpretation the precedence ordering turns the only possible reading into a syntax error. This doesn't happen with precedence declarations, because the precedence declaration is only used if there is more than one possible parse (that is, if there is really an ambiguity).
Your grammar puts not at the top of the precedence hierarchy, although it should really share that level with unary minus rather than being above unary minus [Note 1]. So that's not an impediment to using a cascading grammar. However, I see that you also want to implement an if … then … else operator, which is syntactically a low-precedence prefix operator. So if you wanted 4 + if x then 0 else 1 to have the value 5 when x is false (rather than being a syntax error), the cascading grammar would be problematic. You might not care about this, and if you don't, that's probably the way to go.
Stick with precedence declarations and handle the chained comparison as an exception in the semantic action.
This will allow the simplest possible grammar, but it will complicate your actions a bit. To implement it, you'll want to implement the comparison operators as left-associative, and then you'll need to be able to distinguish in the semantic actions between a comparison (which is a list of expressions and comparison operators) and any other expression (which is a string). The semantic action for a comparison operator needs to either extend or create the list, depending on whether the left-hand operand is a list or a string. The semantic action for any other operator (including parenthetic grouping) and for the right-hand operand in a comparison needs to check if it has received a list, and if so compile it into a string.
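The second option could be sketched roughly as follows. All helper names here (isChain, compileChain, comparisonAction, binaryAction) are hypothetical and do not come from your grammar, and the output format is only one plausible way to compile a chain. The sketch assumes, as described above, that ordinary expressions are strings and a chain is an array of operands and operators.

```javascript
// Hypothetical helpers; none of these names come from the asker's grammar.
// A "chain" is an array like ["1", "<", "a", "<", "10"]; anything else is a string.
const isChain = (v) => Array.isArray(v);

// Compile a chain into a conjunction of pairwise comparisons.
// Note: this simple version repeats each middle operand in the output,
// so the generated code would evaluate it twice.
function compileChain(v) {
  if (!isChain(v)) return v;
  const parts = [];
  for (let i = 0; i + 2 < v.length; i += 2) {
    parts.push(`(${v[i]} ${v[i + 1]} ${v[i + 2]})`);
  }
  return parts.length === 1 ? parts[0] : `(${parts.join(" && ")})`;
}

// Action for a left-associative comparison rule:
// extend the chain if the left operand already is one, otherwise create one.
function comparisonAction(left, op, right) {
  const rhs = compileChain(right); // the right operand must end up a string
  return isChain(left) ? [...left, op, rhs] : [left, op, rhs];
}

// Action for any other binary operator: compile chains before using them.
function binaryAction(left, op, right) {
  return `(${compileChain(left)} ${op} ${compileChain(right)})`;
}

// 1 < a < 10 is reduced left to right: first 1 < a, then (chain) < 10.
const chain = comparisonAction(comparisonAction("1", "<", "a"), "<", "10");
console.log(compileChain(chain)); // ((1 < a) && (a < 10))
```

The action for parenthetic grouping would call compileChain as well, so that (1 < a) < 10 starts a new comparison instead of extending the chain.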
Whichever of those two options you choose, you'll probably want to fix the various precedence errors in the existing grammar, some of which were already present in your upstream source (like the unary minus / not confusion mentioned above). These include:
Exponentiation is configured as left-associative, whereas it is almost universally considered a right-associative operator. Many languages also give it higher precedence than unary minus, since -a² is pretty well always read as the negative of a squared rather than the square of minus a (which would just be a squared).
I suppose you are going to ditch the ternary operator ?: in favour of your if … then … else operator. But if you leave ?: in the grammar, you should make it right associative, as it is in every language other than PHP. (And the associativity in PHP is generally recognised as a design error. See this summary.)
The not in operator is actually two tokens, not and in, and not has quite high precedence. And that's how it will be parsed by your grammar, with the result that 4 + 3 in (7, 8) evaluates to true (because it was grouped as (4 + 3) in (7, 8)), while 4 + 3 not in (7, 8) evaluates rather surprisingly to 5, having been grouped as 4 + (3 not in (7, 8)).
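As a point of comparison for the associativity items above, this is how JavaScript itself groups the ternary and exponentiation operators. It only illustrates the conventional behaviour; it says nothing about how your generated parser currently behaves.

```javascript
// Right-associative ternary, as JavaScript parses it:
//   true ? "x" : (true ? "y" : "z")
const rightGrouped = true ? "x" : true ? "y" : "z";

// What a left-associative ternary would compute instead:
//   (true ? "x" : true) ? "y" : "z"
const leftGrouped = (true ? "x" : true) ? "y" : "z";

console.log(rightGrouped); // x
console.log(leftGrouped);  // y

// Exponentiation groups to the right as well.
console.log(2 ** 3 ** 2);   // 512, i.e. 2 ** (3 ** 2)
console.log((2 ** 3) ** 2); // 64
```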
Notes
If you used a cascading precedence grammar, you'd see that only one of - not 0 and not - 0 is parseable. Of course, both are probably type violations, but that's not something the syntax should concern itself with.

Order of calculation in JS

The interpreter built by my university, codeboot.org, offers step-by-step execution for an expression. As a result, I was able to see how the program reads an arithmetic expression. And this is where I start to get confused.
For example, this expression: 10-5+(7+2)/3
We always say that the expression in the parentheses should be calculated first, so this is the order I expected:
7+2=9, 9/3=3, 10-5=5, 5+3=8
However, what the interpreter executes is completely different.
10-5=5, 7+2=9, 9/3=3, 5+3=8
The result is the same, but why would it calculate 10-5 first? And what happened to "we have to calculate whatever is in the parentheses first"? This really confuses me.
I would like to know whether this is the right behavior: does the interpreter always go from left to right and calculate whatever it can calculate first, instead of jumping right into the () as we would expect?
"Do the parentheses first" is not a rule in JS. And "go from left to right" isn't really a rule either. E.g. consider 1 + 4 * 6. Strict left-to-right would result in
1+4 = 5, 5*6 = 30
and that's not what JS does.
Instead, JS parses your expression into an expression tree, and then evaluates it starting at the root of the tree. (Strictly speaking, a JS implementation isn't required to build a tree, but it's required to give the same results as if it did.)
For instance, your example expression 10-5+(7+2)/3 would result in a tree roughly like this:
AdditiveExpression:
    AdditiveExpression
        AdditiveExpression
            MultiplicativeExpression
                ... NumericLiteral 10
        - -
        MultiplicativeExpression
            ... NumericLiteral 5
    + +
    MultiplicativeExpression
        MultiplicativeExpression
            ... ParenthesizedExpression
                ( (
                Expression
                    ... AdditiveExpression 7+2
                ) )
        MultiplicativeOperator /
        ExponentiationExpression
            ... NumericLiteral 3
where:
I've used indentation to convey nesting;
I've used "..." when I've left out lots of intermediate derivations; and
I haven't bothered to give the full sub-tree for "7+2".
(I couldn't find a way to get codeboot.org to show its parse tree. If there is some way, or if you use some other tool to show an expression's parse tree, note that it may not look exactly as above, but it should be similar enough that it will give the same behavior.)
To evaluate the expression, it starts at the root, an AdditiveExpression whose children are:
another AdditiveExpression (for 10-5),
the + token, and
a MultiplicativeExpression (for (7+2)/3).
The rule is to
(a) evaluate the left operand, then
(b) evaluate the right, then
(c) perform the addition on the results.
So that's why (a) 10-5 => 5 is the first thing your interpreter calculates.
Next is to (b) evaluate the MultiplicativeExpression for (7+2)/3. The rule here is similar, so we need to:
(b1) evaluate the left operand (the MultiplicativeExpression for (7+2)), then
(b2) evaluate the right operand (the ExponentiationExpression for 3), then
(b3) perform the operation indicated by the MultiplicativeOperator /.
So (b1) 7+2 => 9 is the next thing,
then (b2) 3 => 3,
then (b3) 9/3 => 3.
We're now finished step (b), so we proceed to (c) 5+3 => 8.
This matches the series of calculations that your interpreter performs.
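The walk above can be reproduced with a small sketch. The tree is written by hand as nested [left, operator, right] arrays, which is a big simplification of the real parse tree, and the helper names (evaluate, steps, v, seen) are made up for illustration. The second half checks the operand order in plain JavaScript by logging each number as it is read.

```javascript
// Hand-written tree for 10-5+(7+2)/3 as nested [left, op, right] arrays
// (a simplification of the real parse tree).
const tree = [[10, "-", 5], "+", [[7, "+", 2], "/", 3]];

const apply = {
  "+": (a, b) => a + b,
  "-": (a, b) => a - b,
  "*": (a, b) => a * b,
  "/": (a, b) => a / b,
};
const steps = [];

function evaluate(node) {
  if (typeof node === "number") return node;
  const [left, op, right] = node;
  const l = evaluate(left);        // (a) evaluate the left operand
  const r = evaluate(right);       // (b) then the right operand
  const result = apply[op](l, r);  // (c) then perform the operation
  steps.push(`${l}${op}${r}=${result}`);
  return result;
}

console.log(evaluate(tree));    // 8
console.log(steps.join(", "));  // 10-5=5, 7+2=9, 9/3=3, 5+3=8

// The same order shows up in plain JavaScript if each operand logs itself.
const seen = [];
const v = (n) => { seen.push(n); return n; };
const total = v(10) - v(5) + (v(7) + v(2)) / v(3);
console.log(total, seen);       // 8 [ 10, 5, 7, 2, 3 ]
```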

Why is -1**2 a syntax error in JavaScript?

Executing it in the browser console it says SyntaxError: Unexpected token **.
Trying it in node:
> -1**2
...
...
...
...^C
I thought this is an arithmetic expression where ** is the power operator. There is no such issue with other operators.
Strangely, typing */ on the second line triggers the execution:
> -1**2
... */
-1**2
^^
SyntaxError: Unexpected token **
What is happening here?
Executing it in the browser console says SyntaxError: Unexpected token **.
Because that's the spec. Designed that way to avoid confusion about whether it's the square of the negation of one (i.e. (-1) ** 2), or the negation of the square of one (i.e. -(1 ** 2)). This design was the result of extensive discussion of operator precedence, and examination of how this is handled in other languages, and finally the decision was made to avoid unexpected behavior by making this a syntax error.
From the documentation on MDN:
In JavaScript, it is impossible to write an ambiguous exponentiation expression, i.e. you cannot put a unary operator (+/-/~/!/delete/void/typeof) immediately before the base number.
The reason is also explained in that same text:
In most languages like PHP and Python and others that have an exponentiation operator (typically ^ or **), the exponentiation operator is defined to have a higher precedence than unary operators such as unary + and unary -, but there are a few exceptions. For example, in Bash the ** operator is defined to have a lower precedence than unary operators.
So to avoid confusion it was decided that the code must remove the ambiguity and explicitly put the parentheses:
(-1)**2
or:
-(1**2)
As a side note, the binary - is not treated that way -- having lower precedence -- and so the last expression has the same result as this valid expression:
0-1**2
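A quick way to check these cases without crashing the script is to compile the text with new Function, which throws a SyntaxError at parse time. The helper name isSyntaxError is made up for this sketch, and it assumes an environment where new Function is allowed (for example Node.js, or a page without a restrictive Content Security Policy).

```javascript
// Returns true if `source` fails to parse as an expression.
// new Function compiles the text without running it.
function isSyntaxError(source) {
  try {
    new Function(`return (${source});`);
    return false;
  } catch (err) {
    return err instanceof SyntaxError;
  }
}

console.log(isSyntaxError("-1 ** 2"));   // true: unary minus directly before the base
console.log(isSyntaxError("(-1) ** 2")); // false
console.log(isSyntaxError("-(1 ** 2)")); // false

console.log((-1) ** 2);  // 1
console.log(-(1 ** 2));  // -1
console.log(0 - 1 ** 2); // -1, binary minus has lower precedence than **
```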
Exponentiation Precedence in Other Programming Languages
As the quote above already states, most programming languages that have an infix exponentiation operator give that operator a higher precedence than unary minus.
Here are some other examples of programming languages that give a higher precedence to the unary minus operator:
bc
VBScript
AppleScript
COBOL
Rexx
Orc

Clarification of Operand in Variable Assignment

I am asking this for the benefit of JavaScript, but I would imagine the knowledge and terms cross all languages. This is why I included Java and C, as programmers' knowledge of the subject in those fields is generally at a higher level.
If the question has been posed and answered, kindly just let me know.
I understand the basics of operators and operands.
1 + 2 = 3
1 and 2 are the operands and + is the operator. Solutions to the expression are not considered operands as they are the returned value.
If I am wrong with this summary please let me know
My question is that in assigning a value to variable
var x = 1
Is the variable considered to be the operand in this instance? My guess would be yes, as x is being assigned the value 1 via an operator. Or is it not an operand? Or are both x and 1 the operands, with = being the assignment operator and the result being that x is now 1?
= is a simple assignment operator that assigns values from right side operands to the variable on the left side.
Example: x = y + z will assign value of y + z into x
So it is clear that = is an operator having left and right sides as operands.
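In JavaScript this can be observed directly, because an assignment is itself an expression that produces a value. The variable names below are only for illustration.

```javascript
let x, y;

// The assignment expression takes x (left operand) and 1 (right operand)
// and produces the assigned value.
const produced = (x = 1);
console.log(x, produced); // 1 1

// The right operand can be any expression, including another assignment:
// this is parsed as x = (y = 5).
x = y = 5;
console.log(x, y); // 5 5
```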
The Java spec tells us the following about the assignment operator:
The result of the first operand of an assignment operator must be a variable
So yes, the left hand side of the assignment operator is an operand.
A little further on we can read:
Next, the right hand operand is evaluated.
So the right hand side is an operand too!
Although I don't know why it would be important to know whether the Java developers call the left/right hand side of an assignment an 'operand' or not!

Meaning and function of ^ (caret) sign in javascript [duplicate]

In a web page, I see following script snippet.
(d.charCodeAt(i)^k.charCodeAt(i)).toString()
It was part of a for-loop, and I know what charCodeAt(i) is, but I really wondered what the ^ sign does... I searched but failed to find anything...
What is ^ and what function or operator exists in Python or other programming languages that do the same job?
It is the bitwise XOR operator. From the MDN docs:
[Bitwise XOR] returns a one in each bit position for which the corresponding bits of
either but not both operands are ones.
Where the operands are whatever is on the left or right of the operator.
For example, if we have two bytes:
A 11001100
B 10101010
We end up with
Q 01100110
If a bit in A is set OR a bit in B is set, but NOT both, then the result is a 1, otherwise it is 0.
In the example you give, it will take the character codes (UTF-16 code units, which match ASCII for ASCII characters) returned by d.charCodeAt(i) and k.charCodeAt(i) and XOR their binary representations. It does the same in Python, C++ and most other languages. It is not to be confused with the exponential operator in maths-related contexts; languages will provide a pow() function or similar. JavaScript for one has Math.pow(base, exponent).
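The byte example above, and the difference from exponentiation, can be checked with a few lines. The strings d and k are made-up stand-ins for the ones in the question's snippet.

```javascript
// The two bytes from the example above.
const a = 0b11001100;
const b = 0b10101010;
const q = a ^ b;
console.log(q.toString(2).padStart(8, "0")); // 01100110

// XOR of two character codes, as in the snippet from the question
// (d and k are made up for illustration).
const d = "A"; // char code 65
const k = "B"; // char code 66
console.log((d.charCodeAt(0) ^ k.charCodeAt(0)).toString()); // 3

// ^ is not exponentiation.
console.log(2 ^ 3);          // 1
console.log(Math.pow(2, 3)); // 8
```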
In JavaScript it's the bitwise XOR operator (exclusive or). For each bit position, the result is 1 only if exactly one of the corresponding operand bits is 1; if both bits are 1 or both are 0, the result is 0.
In Python, it does the same thing.
Wikipedia's XOR page.
It's a bitwise XOR. It performs an "exclusive or" in each bit of the operands.
