Does JSON.parse() use eval() internally? [duplicate] - javascript

This question already has answers here:
What is JSON.parse written in / Is it open source?
(4 answers)
Closed 9 years ago.
Does JSON.parse in modern browsers use eval() internally for evaluating and executing the dynamic code?
I ask because I have been looking through Douglas Crockford's JSON library. Its parse() also uses eval(), but only after preprocessing the input prior to the actual evaluation, such as:
Guarding against problematic Unicode characters in the input.
Checking whether the code shows malicious intent.
Do modern browsers that support JSON.parse natively perform these steps, or do they follow a different approach?

No, JSON.parse() doesn't use eval()
This is by design: since eval() can execute any arbitrary JavaScript code you feed it, it could execute things you wouldn't want it to. So JSON.parse() does what it says on the tin: it actually parses the whole string and reconstructs an entire object tree.
JSON.parse is usually delegated to an internal function implemented with "native" code, where "native" means whatever is considered "native" in the context of your browser's javascript engine (could be compiled machine code, could be bytecode for a VM, etc...). I don't think there's any strong requirement on that.
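A quick way to see the difference in behaviour (a sketch; it assumes a browser, since it uses alert):
JSON.parse('{"a": 1}');          // { a: 1 } -- pure data, nothing is executed
JSON.parse('{"a": alert(1)}');   // throws SyntaxError: this is not valid JSON
eval('({"a": alert(1)})');       // pops the alert -- eval treats the string as code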
Differences in the Implementations?
JSON (the notation) itself is codified by RFC 4627.
Regarding the implementation of the JSON object and its methods, all modern browsers implementing it should behave the same, as they should all follow the specification of ECMAScript 5's JSON object. However, there's always the chance of defects. For instance, V8 originally contained this nasty bug.
Also, note that the implementations listed in the comments above are there to let you add JSON.parse() support to browsers that do not support it natively (also known as "these damn old browsers you sometimes need to support"). But that doesn't mean it's necessarily how native engines implement it.
For instance, for Google's V8 implementation used in Chrome, see json.js which invokes native code from json_parser.h.

It would be a very funny thing to do, if you think about it.
To understand why, see if this analogy helps: you're traveling with your boss to a country where you speak the language but she doesn't. Since you're fluent, you will serve two roles: as her assistant (doing tasks for her) as well as her translator (telling her what things mean).
So you have these two jobs, which are complementary. Your boss could tell you to do something--in any language you both understand (say, English)--as well as ask you to tell her what something says, like a sign or a document. She could even do both: hand you a set of instructions written in this other language and say, "This was given to me by someone I trust. Please do everything it says here."
In this analogy, reading signs or documents to your boss is like JSON.parse. Your boss handing you instructions and telling you to do everything they say is like eval.
If JavaScript engines used eval internally for JSON.parse, that would be analogous to your boss asking you what a document says, and you choosing to act out everything written in the document in order to explain it to her. Instead of just reading it.


JavaScript: why are some functions not methods?

I was once asked by a student why we write:
parseInt(something)
something.toLowerCase()
that is, why one has the variable as a parameter, while the other is applied to the variable.
I explained that while toLowerCase is a method of string objects, parseInt wasn’t designed that way. OK, so it’s window.parseInt, but that just makes it a method of a different object.
But it struck me as an inconsistency — why are some string or other functions not methods of their corresponding objects?
The question is why? Is there a technical reason why parseInt and other functions are not methods, or is that just a historical quirk?
In general, Javascript was designed in a hurry, so questioning each individual design decision isn't always a productive use of your time.
Having said that, for parseInt in particular, the reason is simple to explain: it accepts pretty much any arbitrary type, like:
parseInt(undefined) // NaN
Since you cannot implement undefined.parseInt(), the only way to do it is to implement it as a static function.
As of ECMAScript 2015, parseInt has been mirrored in Number.parseInt, where it arguably makes more sense than on window. For backwards compatibility window.parseInt continues to exist though.
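A couple of concrete data points (behaviour I'd expect from any ES5/ES2015-compliant engine):
parseInt("42px", 10);          // 42  -- accepts and coerces just about anything
parseInt(undefined);           // NaN -- there is no receiver type to hang a method on
Number.parseInt === parseInt;  // true -- the ES2015 "mirror" is the very same function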
In this specific case it makes sense with respect to encapsulation.
Consider parseInt() - it is taking a value of an unknown type from an unknown location and extracting an integer value from it. Which object are you going to have it a method of? All of them?
String.prototype.toUpperCase() should only take a string as input (or at least something that can be coerced to a string) and will return a string. This is well encapsulated within a small subset of cases, and since values are not strongly typed, it seems logical not to have it as a global function.
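To make the contrast concrete (just illustrative values):
"abc".toUpperCase();   // "ABC"     -- the receiver is known to be a string
(123).toString(2);     // "1111011" -- likewise a method of Number values
parseInt("0x1A");      // 26        -- parseInt must cope with strings, numbers,
parseInt(null);        // NaN          null, undefined, objects... anything at all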
As for the rest of JavaScript I have no idea nor do I have insight into the real reason it was done this way, but for these specific examples it appears to me to be a reasonable design decision.
The JavaScript language has been developing quite fast in recent years. With that in mind, a lot of things are still in the API for backward compatibility - historical reasons, as you said. Although I can't say that's the only reason.
In JavaScript, you can approach a problem not just with the object-oriented paradigm (where methods of an object usually share common state). A functional approach can also be applied quite easily, without getting into too much trouble with the language.
JavaScript gives great power to its users with many possibilities of approaching a problem. There is a saying: "With Great Power Comes Great Responsibility".

Why is toString of JavaScript function implementation-dependent?

From the EcmaScript 5 specification
15.3.4.2 Function.prototype.toString( )
An implementation-dependent representation of the function is returned. This representation has the syntax of a FunctionDeclaration. Note in particular that the use and placement of white space, line terminators, and semicolons within the representation String is implementation-dependent.
Why is it implementation-dependent? It shouldn't be too hard to make it output a standardized string consisting of the original code of the function. Also, the reasons I can come up with, such as optimization, don't seem to be relied on too heavily, as pretty much all browsers give the original code as the result of toString.
If toString weren't implementation-dependent, and were instead standardized to return the original code of the function (with newlines etc. handled in a standard way), wouldn't that make it possible to include functions in JSON?
I do realize that the JSON, despite its name, is independent of JavaScript and thus the functions shouldn't be part of it. But this way the functions could in theory be passed with it as strings, without losing the cross-browser support.
Internally, Function.prototype.toString() has to get the function declaration code for the function, which it may or may not have. According to the MDN page, FF used to decompile the function, and now it stores the declaration with the function, so it doesn't have to decompile it.
Since Gecko 17.0 (Firefox 17 / Thunderbird 17 / SeaMonkey 2.14),
Function.prototype.toString() has been implemented by saving the
function's source. The decompiler was removed
*https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Function/toString
Decompiling it requires extra work. Storing it requires extra memory. A given ECMAscript implementation may have different resource requirements.
Further, if it is decompiled, that is dependent on how it was stored in the first place. An engine may be unable to return comments in the original, because it didn't store them when the function was evaluated. Or whitespace/newlines might be different if the engine collapsed them. Or the engine may have optimized the code, such as by ignoring unreachable code, making it not possible to return that code back in the toString() call.
...some engines omit newlines. And others omit comments. And others
omit "dead code". And others include comments around (!) function. And
others hide source completely...
*http://perfectionkills.com/state-of-function-decompilation-in-javascript/
These are just a few reasons why Function.prototype.toString() is implementation dependent.
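For example (a sketch; the exact string returned varies by engine, which is the point):
function add(a, b) {
    // adds two numbers
    return a + b;
}
add.toString();
// Engines that store the source return it verbatim, comment and all.
// Engines that decompile may return a normalized form instead, e.g.
// "function add(a, b) {\n    return a + b;\n}"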

Is it acceptable style for Node.js libraries to rely on object key order?

Enumerating the keys of JavaScript objects yields the keys in insertion order:
> for (key in {'z':1,'a':1,'b':1}) { console.log(key); }
z
a
b
This is not part of the standard, but is widely implemented (as discussed here):
ECMA-262 does not specify enumeration order. The de facto standard is to match
insertion order, which V8 also does, but with one exception:
V8 gives no guarantees on the enumeration order for array indices (i.e., a property
name that can be parsed as a 32-bit unsigned integer).
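You can see that exception directly (observed V8 behaviour, sketched here; not something older specs guarantee):
var obj = {};
obj.z = 1;
obj.a = 2;
obj['2'] = 3;
obj['1'] = 4;
for (var key in obj) { console.log(key); }
// V8 logs: 1, 2, z, a -- the integer-like keys come out in ascending numeric
// order, and only the remaining string keys keep their insertion order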
Is it acceptable practice to rely on this behavior when constructing Node.js libraries?
Absolutely not! It's not a matter of style so much as a matter of correctness.
If you depend on this "de facto" standard, your code might fail on an ECMA-262 5th Ed. compliant interpreter, because that spec does not specify the enumeration order. Moreover, the V8 engine might change its behavior in the future, say in the interest of performance.
Definitely do not rely on the order of the keys. If the standard doesn't specify an order, then implementations are free to do as they please. Hash tables often underlie objects like these, and you have no way of knowing when one might be used. Javascript has many implementations, and they are all competing to be the fastest. Key order will vary between implementations, if not now, then in the future.
No. Rely on the ECMAScript standard, or you'll have to argue with the developers about whether a "de facto standard" exists like the people on that bug.
It's not advised to rely on it naively.
You should also do your best to stick to the spec/standard.
However, there are often cases where the spec or standard limits what you can do. In programming, I'm not sure I've encountered many implementations that don't deviate from or extend their specification, often for reasons such as the specification not catering to everything.
Sometimes people who rely on the specifics of an implementation write test cases for them, though it's hard to make a reliable test case for keys being in order: it would mostly succeed by accident, or rather it's difficult behaviour to reliably exercise.
If you do rely on an implementation-specific behaviour, then you must document that. If your project requires portability (code that runs on other people's setups outside your control, where you want maximum compatibility), then in that case it's not a good idea to rely on something implementation-specific such as key order.
Where you do have full control of the implementation being used, it's entirely up to you which implementation specifics you use, keeping in mind you may later be forced to cater for portability due to the common need or desire to upgrade the implementation.
The best form of documentation for cases like this is inline, in the code itself, often with the intention of at least making it easy to identify areas to be changed should you switch from an implementation guaranteeing order to one not doing so.
You can make up whatever format you like, but it can be something like...
/** #portability: insertion_ordered_keys */
for (let key in object) console.log(key);
You might even wrap such cases up in code:
forEachKeyInOrderOfInsertion(object, console.log)
Again, likely something less overly verbose but enough to identify cases dependent on that.
Where your implementation guarantees key order, you'd just translate that back to the same plain for loop as the original.
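A minimal sketch of that wrapper (forEachKeyInOrderOfInsertion is the made-up name from above; this version simply assumes the engine already preserves insertion order):
/** #portability: insertion_ordered_keys */
function forEachKeyInOrderOfInsertion(object, callback) {
    // On an engine that preserves insertion order this is just the plain loop;
    // swap the body out if you ever target an engine that does not.
    for (var key in object) {
        if (Object.prototype.hasOwnProperty.call(object, key)) {
            callback(key, object[key]);
        }
    }
}
forEachKeyInOrderOfInsertion({z: 1, a: 2}, console.log); // logs "z 1", then "a 2"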
You can implement that with a JS function using platform detection, templating like CPP, transpiling, etc. You might also want to wrap object creation, and be very careful about things crossing boundaries. If something loses order before reaching you (like a JSON decode of input from a client over the network), then you'll likely not have a solution to that solely within your library; this can happen even when someone else is just calling your library.
You'll likely not need all of that, though; as a minimum, mark the cases where you do something that might break later, and document that the potential exists.
An obvious exception is when the implementation itself guarantees consistency. In that case you would probably be wasting your time decorating everything, since it isn't really variable and is already documented via the implementation. The implementation often is a spec, or has its own; you can choose to stick to that rather than a more generalised spec.
Ultimately, in each case you'll need to make a judgement call, and you may also choose to take a chance. As long as you're fully aware of the potential problems, including the potential of wasting time avoiding problems you won't necessarily have - that is, you know all the stakes and have considered your circumstances - then it's up to you what to do. There's no "should" or "shouldn't"; it's case specific.
If you're making public Node.js libraries, or libraries to be widely distributed beyond the scope of your control, then I'd say it's not good to rely on implementation specifics. At the very least, add a disclaimer to the release notes that the library only caters to your stack, and that if people want to use it elsewhere they can fix it and submit a pull request. Otherwise, if it's not documented, it should be fixed.

Using JavaScript eval to parse JSON

Question: I'm using eval to parse a JSON return value from one of my WebMethods.
I prefer not to add jquery-json because the transfer volume is already quite large.
So I parse the JSON return value with eval.
Now rumors go that this is insecure. Why ?
Nobody can modify the JSON return value unless they hack my server, in which case I would have a much larger problem anyway.
And if they do it locally, JavaScript only executes in their browser.
So I fail to see where the problem is.
Can anybody shed some light on this, using this concrete example?
function OnWebMethodSucceeded(JSONstrWebMethodReturnValue)
{
    var result = eval('(' + JSONstrWebMethodReturnValue + ')');
    ... // Adding result.xy to a table
}
The fundamental issue is that eval can run any JavaScript, not just deserialize JSON-formatted data. That's the risk when using it to process JSON from an untrusted or semi-trusted source. The frequent trick of wrapping the JSON in parentheses is not sufficient to ensure that arbitrary JavaScript isn't executed. Consider this "JSON" which really isn't:
function(){alert('Hi')})(
If you had that in a variable x and did this:
var result = eval("(" + x + ")");
...you'd see an alert -- the JavaScript ran. Security issue.
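(For contrast, feeding the same string to a native JSON.parse would simply fail instead of running anything:)
JSON.parse("function(){alert('Hi')})("); // throws SyntaxError -- no alert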
If your data is coming from a trusted source (and it sounds like it is), I wouldn't worry about it too much. That said, you might be interested in Crockford's discussion here (Crockford being the inventor of JSON and a generally-knowledgeable JavaScript person). Crockford also provides at least three public domain parsers on this page you might consider using: His json2.js parser and stringifier, which when minified is only 2.5k in size, but which still uses eval (it just takes several precautions first); his json_parse.js, which is a recursive-descent parser not using eval; and his json_parse_state.js, a state machine parser (again not using eval). So you get to pick your poison. (Shout out to Camilo Martin for pointing out those last two alternatives.)
Increasingly, JSON parsing and encoding is available natively in modern browsers [wikipedia reference]. This gives your application secure JSON functionality without needing to load an additional library.
You can test for native JSON support by doing something like this:
var native_JSON_exists = typeof window.JSON === 'object';
You should load up a JSON parsing library like Douglas Crockford's one (linked by T.J. Crowder, above), or functionality available via a framework, for browsers that don't have native support. (But you should at least use native JSON in browsers that support it, to protect users lucky enough to have modern browsers.)
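Putting that together, a common shape is something like this (parseJSONSafely is a made-up name; the fallback assumes you've loaded Crockford's json_parse.js, which exposes a global json_parse function):
function parseJSONSafely(text) {
    if (window.JSON && typeof JSON.parse === 'function') {
        return JSON.parse(text);   // native parser: no code execution, no extra library
    }
    return json_parse(text);       // fallback parser, loaded only for old browsers
}
var result = parseJSONSafely(JSONstrWebMethodReturnValue);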
Bear in mind, JSON is a subset of JavaScript's syntax so strings that work in an JavaScript eval statement may not work in proper JSON parsing. You can test your JSON strings for errors using JSLint (http://www.jslint.com/).

Why is JSON important?

I've only recently heard about JSON (Javascript Object Notation).
Can anybody explain why it is considered (by some websites/blogs/etc) to be important?
We already have XML, why is JSON better (apart from being 'native to Javascript')?
Edit: Hmm, the main answer theme seems to be 'it is smaller'. However, the fact that it allows data fetching across domains, seems important to me. Or is this in practice not (yet) much used?
XML has several drawbacks:
It's heavy!
It provides a hierarchical representation of content which is not exactly the same as (but pretty similar to) the JavaScript object model.
JavaScript is available everywhere. Without any external parsers, you can process JSON directly with the JS interpreter.
Clearly it's not meant to replace XML completely. For JS based Web apps, its advantages can be useful.
JSON is generally much smaller than its XML equivalent. Smaller transfer means faster transfer, which results in a better user experience.
JSON is much more concise. XML:
<person>
<name>John Doe</name>
<tags>
<tag>friend</tag>
<tag>male</tag>
</tags>
</person>
JSON:
{"name": "John Doe", "tags": ["friend", "male"]}
There's fewer overlapping features, too. For example, in XML there's tension between choosing to use elements (as above), versus attributes (<person name="John Doe">).
JSON came into popular use primarily because it offers a way to circumvent the same-origin policy used in web browsers and thereby allow mashups.
Let's say you're writing a web service on domain A. You can't load XML data from domain B and parse it because the only way to do that would be XMLHttpRequest, and XMLHttpRequest was originally limited by the same-origin policy to talking to only URLs at the same domain as the containing page.
It turns out that for a variety of reasons, you are allowed to request <script> tags across origins. Clever people realized this was a good way to work around the limitation with XMLHttpRequest. Instead of the server returning XML, it can return a series of JavaScript object and array literals.
(bonus question left as an exercise to the reader: why is <script src="..."> allowed across domains without server opt-in but XHR isn't?)
Of course, returning a <script> which consists of nothing more than object literals is not useful because without assigning the values to some variable, you can't do anything with it. Thus, most services use a variant of JSON, called JSONP (http://bob.pythonmac.org/archives/2005/12/05/remote-json-jsonp/).
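The mechanics look roughly like this (the callback name and URL are made up for illustration):
// 1. The page defines a global function the server will "call back":
function handleData(data) {
    console.log(data.name);
}
// 2. It then injects a cross-domain script tag; the server replies with
//    JavaScript such as: handleData({"name": "John Doe"});
var s = document.createElement('script');
s.src = 'http://other-domain.example/people/1?callback=handleData';
document.getElementsByTagName('head')[0].appendChild(s);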
With the rise in popularity of mashups, people realized that JSON was a convenient data interchange format in general, especially when JavaScript is one end of the channel. For example, JSON is used extensively in Chromium, even in cases where C++ is on both sides. It's just a nice lightweight way to represent simple data, that good parsers exist for in many languages.
Amusingly, using <script> tags to do mashups is incredibly insecure because it is essentially XSS'ing yourself on purpose. So native JSON (http://ejohn.org/blog/native-json-support-is-required/) had to be introduced, which obviates the original benefits of the format. But by that time, it was already super popular :)
If you are working in JavaScript, it is much easier to use JSON. This is because JSON can be directly evaluated into a JavaScript object, which is much easier to work with than the DOM.
Borrowing and slightly altering the XML and JSON from above
XML:
<person>
<name>John Doe</name>
<tag>friend</tag>
<tag>male</tag>
</person>
JSON:
{ "person": {"name": "John Doe", "tag": ["friend", "male"]} }
If you wanted to get the second tag object with XML, you'd need to use the powerful but verbose DOM apis:
var tag2=xmlObj.getElementsByTagName("person")[0].getElementsByTagName("tag")[1];
Whereas with a Javascript object that came in via JSON, you could simply use:
var tag2=jsonObj.person.tag[1];
Of course, Jquery makes the DOM example much simpler:
var tag2=$("person tag",xmlObj).get(1);
However, JSON just "fits" in a Javascript world. If you work with it for a while, you will find that you have much less mental overhead than involving XML based data.
All the above examples ignore the possibility that one or more nodes aren't available or are duplicated, or that a node has just one child or none. However, to illustrate the native-ness of JSON, to do this with the jsonObj you'd just have to:
var tag2=(jsonObj.person && jsonObj.person.tag && jsonObj.person.tag.sort && jsonObj.person.tag.length==2 ? jsonObj.person.tag[1] : null);
(some people might not like that long a ternary, but it works). But XML would be (in my opinion) nastier (I don't think you'd want to go the ternary approach, because you'd keep calling the DOM methods, which may have to do the work over again depending on the implementation):
var tag2=null;
var persons=xmlObj.getElementsByTagName("person");
if(persons.length==1) {
var tags=persons[0].getElementsByTagName("tag");
if(tags.length==2) { tag2=tags[1]; }
}
Jquery (untested):
var tag2=$("person:only-child tag:nth-child(1)",xmlObj).get(0);
These web pages may help:
JSON - The Fat Free alternative to xml
Why JSON is Important to You!
It depends on what you are going to do. There are a lot of answers here that prefer JSON over XML. If you take a deeper look, there isn't a big difference.
If you have a tree of objects, you only get a tree of plain JavaScript objects back. That's where the tension with OOP-style access turns back on you. Assume you have objects of types A, B and C arranged in a tree. You can easily have them serialized to JSON. If you read them back in, you only get a tree of plain JavaScript objects. To reconstruct your A, B and C instances you have to stuff the values manually into manually created objects, or resort to some hacks. Sounds like parsing XML and creating objects? Well, yes :)
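To make that concrete (Person here is an illustrative type, not something from the original data):
function Person(name, tags) {           // your "real" type, with behaviour
    this.name = name;
    this.tags = tags;
}
Person.prototype.greet = function () { return 'Hi, ' + this.name; };
var plain = JSON.parse('{"name": "John Doe", "tags": ["friend", "male"]}');
typeof plain.greet;                     // "undefined" -- just a plain object
// You have to re-hydrate the typed object yourself:
var person = new Person(plain.name, plain.tags);
person.greet();                         // "Hi, John Doe"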
These days only the newest browsers come with native support for JSON. To support more browsers you have two options: a) you load a JSON parser written in JavaScript that does the parsing for you (so how fat does that sound, with regard to fat-freeness?); or b) you use eval, as I often see. You can just call eval() on a JSON string to get the objects back, but that introduces a whole new set of security problems. JSON is specified so that it can't contain functions; if you are not checking the input for functions, someone can easily send you code that then gets executed.
So it might depend on what you like more: JSON or XML. The biggest difference is probably the way of accessing things, be it script tags or XMLHttpRequest; I would base the decision on that. In my opinion, if there were proper support for XPath in browsers, I would often decide to use XML. But the fashion is directed towards JSON and loading additional JSON parsers in JavaScript.
If you can't decide and you know you need something really powerful, you might have to take a look at YAML. Reading about YAML is very interesting to get more insight into the topic. But it really depends on what you are trying to do.
JSON is a way to serialize data in Javascript objects. The syntax is taken from the language, so it should be familiar to the developer dealing with Javascript, and -- being the stringification of an object -- it's a more-natural serialization method for interaction within the browser than a full-fledged XML derivative (with all the arbitrary design decisions that implies).
It's light and intuitive.
JSON's a text-based object serialization format that's more lightweight than XML and that directly integrates with JavaScript's object model. That's most of its advantages right there.
Its disadvantages (compared to XML) are, roughly: fewer available tools (forget about standard validation and/or transformation, to say nothing of syntax highlighting or well-formedness checking in most editors), less likely to be human-readable (there's huge variations in the readability of both JSON and XML, so that's a necessarily fuzzy statement), tight integration with JavaScript makes for not-so-tight integration with other environments.
It's not that it is better, but that it can tie many things together to allow seamless data transfer without manual parsing!
For example javascript -> C# web service -> javascript
