I have a Python grammar and a textbox where users can write simple python expressions, such as this one:
(lambda x: str(x))(wanted_input)
The expression is then injected with wanted_input (by using an artificial context created when we execute the python expr).
I am interested if it is possible to use Parse Tree Listeners to extract wanted_input which are basically undefined global variables.
Yes, that's possible.
I believe the expressions that can introduce new variable names are lambdas, comprehensions (list, set and dict) and generator expressions. So in your listener or visitor, you'll want to add the newly introduced variables to the current scope when entering one of these expressions and then remove them again when exiting (making sure to handle the case where a variable is shadowed; some kind of stack is usually the tool of choice here). Then when you see a variable, you simply check whether it's defined in the current scope. If not, it's one of the variables you want to extract.
It's a well known fact that neither Javascript's eval keyword nor Function objects created from strings should ever for any reason be used to run untrusted code.
However, I'm wondering if ES6 proxies change that. Consider:
let env = {eval: eval};
let proxy = new Proxy(env, { has: () => true });
with(proxy) {eval('...')}
The proxy object pretends to have all possible properties, which means that it blocks the search of higher scopes. Within the with block, any properties not set on env appear undefined, and any global properties set inside the with block are actually set on env.
This seems to allow me to set up a completely controlled and isolated environment for the evaled code to run in. What are the risks?
Here are a few concerns I can see:
Don't put anything that references window, or document, or localStorage, or anything else sensitive, into env.
Don't put any mutable object into env unless you're ok with untrusted code mutating it.
Solution: make deep copies if necessary.
Code inside the with block has no access to anything outside it. If it needs things like Math, Object, or String, they have to be put in env - which means these can be modified by malicious code. Even the eval function in my minimal example above can be modified.
Solution: Create proxies for these objects to white-list read-only access to specific properties.
As long as you follow these guidelines, is this actually safe? Are there other concerns?
It is quite easy to break out of this environment, via a number of different ways, some or all of which might possibly be mitigated:
Object, Array, and RegExp literals ({ }, [ ], and /.../) are unimpeded by the Proxy and allow access to (and mutation of) Object.prototype, Array.prototype, and RegExp.prototype. You can, however, lock these with Object.freeze before running your eval.
You must delete env.eval within your evaled string, or else the script can execute global code by renaming the eval function like globalEval = eval;
You cannot prevent the creation of new functions, which may use a global this object: (function () { this.globalFunc(); })(). Enforcing strict mode by prepending "use strict"; to your evaled input might eliminate this escape vector.
Any access to the Function constructor (via (a=>a).__proto__.constructor) allows execution of global code. You can delete Function.prototype.constructor to prevent this, but there may be other ways to access Function.
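A minimal sketch of the setup from the question together with two of the escapes listed above. The helper name runSandboxed is hypothetical, and the with block is built through new Function only so that the snippet also parses where with is otherwise disallowed (strict mode, ES modules). It is meant to illustrate the leaks, not to serve as a sandbox.

```javascript
// Hypothetical helper: runs `code` inside with (proxy) { eval(...) }.
function runSandboxed(code, env) {
  // `has` always answers true, so identifier lookups stop at the proxy
  // instead of continuing to outer scopes.
  const proxy = new Proxy(env, { has: () => true });
  // Function bodies built from strings are sloppy mode, so `with` is allowed.
  // The code is embedded as a string literal because, inside the with block,
  // even a parameter name would be resolved through the proxy.
  const runner = new Function(
    'proxy',
    'with (proxy) { return eval(' + JSON.stringify(code) + '); }'
  );
  return runner(proxy);
}

const env = { eval: eval }; // eval must be reachable, as in the question

// Globals look undefined from inside, and "global" writes land on env.
const seenGlobal = runSandboxed('typeof globalThis', env); // 'undefined'
runSandboxed('leaked = 42', env);                          // env.leaked === 42

// Escape 1: a function literal still reaches the real Function constructor,
// whose bodies are compiled in the global scope, outside the with block.
const viaFunction = runSandboxed(
  '(a => a).constructor("return typeof globalThis")()', env); // 'object'

// Escape 2: a sloppy-mode function called without a receiver gets the
// global object as `this`; a "use strict" prefix makes `this` undefined.
const sloppyThis = runSandboxed(
  '(function () { return typeof this; })()', env);            // 'object'
const strictThis = runSandboxed(
  '"use strict"; (function () { return typeof this; })()', env); // 'undefined'
```

The strict-mode prefix closes only the second hole; the Function constructor route still works unless it is blocked separately.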
It took me a while but I finally figured out what the purpose of symbols in ECMAScript 6 is: avoiding name collision when attaching properties to shared objects - HTML elements e.g. (In case you're stuck on the same question, I recommend this article.)
But then I stumbled upon Symbol.for(). Apparently ECMAScript 6 will maintain a global symbol registry which you can query with this function by providing the symbol description. Come again? If I use symbols to avoid name collisions, why would I want code other than my own to use them? (*) And how would I avoid name collisions in that global registry? Sharing of symbols seems to completely subvert the concept and a global registry doubly so.
(*) Yes, I know symbols aren't truly private, but that's beside the point.
If you don't want your symbols to be available in GlobalSymbolRegistry, just don't use Symbol.for.
Only use it if you want to allow other code to use your symbol.
In the following example, I create a symbol to store data in DOM elements, and I want any other code (e.g. inline, uncompiled event handlers) to be able to read that data. So I make the symbol globally available.
var sym = Symbol.for('storeDataInDOM');
document.querySelector('button')[sym] = 'Hello, world!';
<button onclick="alert(this[Symbol.for('storeDataInDOM')])">Click me</button>
It's like creating global variables: should be avoided in general, but has its advantages. But with symbols instead of strings.
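A short sketch of the difference between Symbol() and Symbol.for(), reusing the 'storeDataInDOM' key from the example above. The fakeButton object is a stand-in for the DOM element (an assumption, so that the snippet also runs outside a browser).

```javascript
// Symbol() always mints a fresh symbol, even for the same description.
const privateA = Symbol('storeDataInDOM');
const privateB = Symbol('storeDataInDOM');
const distinct = privateA !== privateB; // true

// Symbol.for() looks the key up in the global registry first and only
// creates a new symbol when the key is not registered yet.
const sharedA = Symbol.for('storeDataInDOM');
const sharedB = Symbol.for('storeDataInDOM');
const same = sharedA === sharedB; // true

// Symbol.keyFor() maps a registered symbol back to its key.
const key = Symbol.keyFor(sharedA);           // 'storeDataInDOM'
const unregistered = Symbol.keyFor(privateA); // undefined

// Stand-in for document.querySelector('button') (hypothetical object).
const fakeButton = {};
fakeButton[sharedA] = 'Hello, world!';
// Any code that knows the key can read the data back.
const readBack = fakeButton[Symbol.for('storeDataInDOM')]; // 'Hello, world!'
```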
If I use symbols to avoid name collisions, why would I want code other than my own to use them?
That's not the only use case of symbols. The two most important other ones are:
they don't collide with string-keyed properties
they are not enumerated by the usual mechanics
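Both points can be checked in a few lines. This is only an illustration; the names tag and record are made up for the example.

```javascript
const tag = Symbol('id');
// A string key 'id' and a symbol whose description is also 'id'
// live side by side on the same object.
const record = { id: 'string-keyed', [tag]: 'symbol-keyed' };
const noCollision = record.id !== record[tag]; // true

// The usual enumeration mechanisms skip symbol keys.
const forInKeys = [];
for (const k in record) forInKeys.push(k);  // ['id']
const objectKeys = Object.keys(record);     // ['id']
const json = JSON.stringify(record);        // '{"id":"string-keyed"}'

// They are not hidden, though: this call lists them explicitly.
const symbols = Object.getOwnPropertySymbols(record); // [tag]
```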
Sharing of symbols seems to completely subvert the concept and a global registry doubly so.
Not necessarily. Right from that article you read: "The registry is useful when multiple web pages, or multiple modules within the same web page, need to share a symbol." The best example for these are the intrinsic symbols - they guarantee interoperability across realms, that's why the global symbol registry is more global than your global scope.
For example you might have a library that is loaded in a web page, an iframe and a web worker. If you share data between those environments (realms), all of the three instances of your library would want to use the same symbol.
There is also a real need for interoperability between different libraries, which might not even know about each other. Good examples are transducers, algebraic structures or promises. Had ES6 already been in use, all of these would have agreed on common names in the global symbol registry, instead of relying on strings like these or the then method.
Another good example would be custom hooks defined by your engine, e.g. a Symbol.inspect = Symbol.for("inspect") that you can use to define custom stringification behavior to be used by console.log. Admittedly, that symbol does not necessarily need to be made available through the global symbol registry; it could just as well be put on that specific library object (e.g. console.inspect = Symbol("console.inspect")).
And how would I avoid name collisions in that global registry?
Just like you previously did with properties, or global module objects: by using very long, very descriptive names - or by good faith. Also there are some naming conventions.
I found what is, for me, the most useful application of Symbol.for(). If your code uses symbols, it is sometimes difficult to set conditional breakpoints while debugging. For example, you may need to break when a variable equals a symbol value that is created in a different module. The first, harder way is to store this value in a constant and export it from that module. In this case, the breakpoint condition will look like this:
catchedVariable === exportedSymbolConst
But the easiest way is to temporarily change the code inside the module, adding .for to Symbol. Then you can write the condition:
catchedVariable === Symbol.for('string_key')
After debugging successfully, change the code back by removing the .for part.
This is a two part question: General, and Specific.
For the general: I often find myself wondering what constitutes a viable variable name in JavaScript? I know there are certain 'words' that can not be used as variables in JavaScript; But I have yet to come across either a list of non-viable variable names, or a rule to apply when creating a variable name. I usually err on the side of caution and use obscure names if I am unsure.
It would be nice to know, with certainty, what can be used as a JavaScript variable, and what can not be used.
Any advice?
For the specific: I am wondering if I can use href as a variable name in my JavaScript? Is it viable, or is it reserved?
Afterthought: Perhaps I can extend this question to encompass JavaScript function names as well. What names are viable, and which are reserved? If the two questions are related, I will edit to ask both.
Note: I am not asking which characters can be used in a JavaScript variable; That question is already answered here.
Uhm, actually, you can use any kind of name as a variable name.
Instead of referring to the variable by name, refer to it with bracket notation, as you would an array index, since all object properties in JS can be accessed this way* and global variables are simply properties of the window object.
*a string index can contain literally any kind of character sequence
So the question might instead be along the lines of "should I use reserved words as variable names?"
Common sense would say you shouldn't, except when such a name is actually related to the construct and you can't find a suitable replacement.
window['function'] = 2;
window['if'] = 4;
window['var'] = 8;
alert(window['function'] + window['if'] + window['var']);
Warning!
Reserved words are different from native functionality.
Although in many cases you can use names used as reserved keywords as variables, native functionality can actually be overwritten.
For instance, Mr Sarris above mentioned Node (which is a native function, not a reserved keyword); you can actually overwrite it by doing window['Node'] = myNewThing;. This has been used in some cases to achieve "wrapper" or "hotfix" functionality, but it is not guaranteed to work in a cross-browser manner (e.g. MSIE's console object).
You can find lists of reserved words in JavaScript.
href is certainly fine as a variable name because href is an attribute of an a tag and in no way conflicts with JavaScript naming.
If you are ever in doubt as to whether or not a variable is already in use you can always open the developer tools (F12 in most browsers), go to the console, and type in the name. In this case you'll get:
> href
x ReferenceError: href is not defined
Nothing is using it, so it is yours to use without problem.
Just for kicks if you did enter a reserved word it would look like:
> finally
x SyntaxError: Unexpected token finally
Or if it was a native but already taken word it might look like:
> Node
function Node() { [native code] }
(Node is already defined, and it's a native function)
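The reserved-word part of the console check above can be approximated in code. A rough sketch, assuming non-strict parsing rules; canDeclare is a hypothetical helper name. It only answers whether a name is syntactically allowed in a var declaration, not whether something already uses it.

```javascript
// Hypothetical helper: true if `name` may appear in `var name;`.
function canDeclare(name) {
  // Accept only something shaped like a single identifier, so that no
  // other code can be smuggled into the string compiled below.
  if (!/^[$_\p{ID_Start}][$\p{ID_Continue}\u200c\u200d]*$/u.test(name)) {
    return false;
  }
  try {
    // The function is only parsed, never called. Reserved words make
    // the parser throw a SyntaxError.
    new Function('var ' + name + ';');
    return true;
  } catch (e) {
    if (e instanceof SyntaxError) return false;
    throw e;
  }
}

const hrefOk = canDeclare('href');       // true
const finallyOk = canDeclare('finally'); // false
const nodeOk = canDeclare('Node');       // true: taken in browsers, but not reserved
```

Strict mode reserves a few more words, so a name accepted here may still be rejected in strict code.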
Every programming language has a list of reserved words.
These reserved words consist of the parts that constitute that programming language.
For JavaScript, things like for or function or if are reserved words, since these have special meaning in the language itself. As a rule of thumb you cannot re-use words as identifiers (names) that already have a meaning in that particular language.
The official language specification is a good place to look that up. For JavaScript see the ECMAScript specification, section 7.6.1 (section 7.6. clarifies the other rules that apply to identifier naming).
Your question whether href is okay to use in JS is easily answered by looking there.
rules for variable declaration (on codelifter)
href is ok to use
for ( var i in this ) { console.log(i); }
With this loop, I iterate over all properties of an object. Is it possible to find what local/closure variables exist?
No, there's no way to examine the contents of a scope, because there's no way to get a handle to it. (The global scope is excepted, because there are ways of getting a handle to it.)
What I mean by that is that there's no way to get the runtime to give you a reference to the scope as if it were a JavaScript object. Thus, there's no way to explore the properties; there's nothing for the right-hand side of a "for ... in" loop, in other words.
edit — if one could do this, it would allow for some interesting coding techniques. One could write utility functions, like the new-ish ".bind()" method on the Function prototype, so that the function returned would be able to check for certain special variables in the closure scope, for debugging or logging or other purposes. Thus services that manufacture functions could do some more "powerful" things based on the nature of the client environment. (I don't know of a language that would allow that.)
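A small illustration of the difference between properties, which can be enumerated, and closure variables, which cannot. The names makeCounter, count and label are made up for the example.

```javascript
function makeCounter() {
  let count = 0; // closure variable: lives in the scope, not on any object
  function increment() {
    count += 1;
    return count;
  }
  increment.label = 'counter'; // ordinary property on the function object
  return increment;
}

const inc = makeCounter();
inc();

// Enumerating the function object finds its property...
const seen = [];
for (const key in inc) seen.push(key); // ['label']

// ...but nothing exposes `count`, not even the non-enumerable own names.
const ownNames = Object.getOwnPropertyNames(inc);
const countVisible = seen.includes('count') || ownNames.includes('count'); // false
```

If code needs to inspect such state, a common workaround (not part of the answer above) is to keep it in an explicit object and pass that object around.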
In JavaScript we don't have to declare a variable with the var keyword before using it. We can straight away do things like myCount = 12; or myStr = "Hello"; (where myCount and myStr have not been declared before). Any such usage declares and initializes the variable in the 'global' scope.
What could be the reasons for providing this feature? And is it a good standard at all?
UPDATE: My question is not what the differences are between 'using a variable without declaring' and 'declaring and then using' and how it affects scope.
My question is 'why it is allowed in javascript to use a variable directly without declaring it' as most of the programming languages have a strict check on this.
UPDATE: I see the following scenario as a bad effect of this feature. So, why have this feature at all?
Suppose there is already a globally declared variable x (var x = "comparison string") which I am unaware of. Intending to create my own global 'x', I initialize one inside one of my functions (x = -1), and I end up breaking other functionality.
So, is there any good reason at all for having this feature?
Javascript was intended for very simple scripts in the browser. Requiring variable declarations seemed unnecessary.
Of course, it's an error in the language. And the makers of javascript know that. They wanted to change it. But they couldn't. Why?
Because Microsoft had already reverse engineered JavaScript and created their duplicate JScript, with bugs and all. Microsoft vetoed any changes, even bugfixes, since they were adamant about not breaking anyone's scripts. So even if they changed JavaScript, JScript in IE would stay the same.
It's not a good reason. But it's the one we have.
Source: I got my javascript history class from Douglas Crockford: "The JavaScript Programming Language", http://video.yahoo.com/watch/111593/1710507 This part of the story is between 9 and 11 minutes into the video.
Good reasons? Honestly can't think of one, it's one of the few things I really dislike about JS.
It's possible because everything happens within the global scope unless otherwise controlled, and JS allows implicit variable creation like this. The cost is enormous potential for scoping bugs and global pollution. Given that "this" exists to refer explicitly to the current object and "window" and "document" exist for global referencing, the only benefit is saving a few characters, which is no reason at all.
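A minimal sketch of the behavior under discussion, next to ES5 strict mode, which turns the same assignment into an error. The variable names are made up, and new Function is used only so that each body runs in a known mode regardless of how the snippet itself is loaded.

```javascript
// Sloppy-mode body: assigning to an undeclared name creates a global.
const sloppyAssign = new Function('implicitCount = 12;');
// Strict-mode body: the same kind of assignment throws instead.
const strictAssign = new Function('"use strict"; otherCount = 12;');

sloppyAssign();
const leakedValue = globalThis.implicitCount; // 12, now a global property

let strictError = null;
try {
  strictAssign();
} catch (e) {
  strictError = e; // a ReferenceError
}

// Implicit globals are plain configurable properties, so they can be removed.
delete globalThis.implicitCount;
```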
My question is 'why it is allowed in javascript to use a variable directly without declaring it' as most of the programming languages have a strict check on this.
That's the difference between statically and dynamically typed languages. JavaScript is dynamically typed, so there is no need to declare first. As was pointed out in other answers, the var keyword has more to do with a variable's scope than with its declaration.
And I don't think that most programming languages have a check on that.
Lua has a similar issue: any non-declared variable is global by default. On the mailing list it's a recurring theme to ask why it shouldn't be 'local by default'. Unfortunately, that would introduce very nasty ambiguities in the language, so the conclusion typically is "global by default is bad, local by default is worse".
Because it is a scripting language. Fact kids. And that is how it was designed!