Using JavaScript eval to parse JSON

Question: I'm using eval to parse a JSON return value from one of my WebMethods.
I prefer not to add jquery-json because the transfer volume is already quite large.
So I parse the JSON return value with eval.
Now rumors go that this is insecure. Why?
Nobody can modify the JSON return value unless they hack my server, in which case I would have a much larger problem anyway.
And if they do it locally, JavaScript only executes in their browser.
So I fail to see where the problem is.
Can anybody shed some light on this, using this concrete example?
function OnWebMethodSucceeded(JSONstrWebMethodReturnValue)
{
    var result = eval('(' + JSONstrWebMethodReturnValue + ')');
    ... // Adding result.xy to a table
}

The fundamental issue is that eval can run any JavaScript, not just deserialize JSON-formatted data. That's the risk when using it to process JSON from an untrusted or semi-trusted source. The frequent trick of wrapping the JSON in parentheses is not sufficient to ensure that arbitrary JavaScript isn't executed. Consider this "JSON" which really isn't:
function(){alert('Hi')})(
If you had that in a variable x and did this:
var result = eval("(" + x + ")");
...you'd see an alert -- the JavaScript ran. Security issue.
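To see the difference in behavior, here is a small illustration using the native JSON.parse (or json2.js) against that same string:

// The non-JSON string from above.
var payload = "function(){alert('Hi')})(";

// eval runs it as code -- wrapping it in parentheses turns it into an
// immediately-invoked function, so the alert fires:
//   eval("(" + payload + ")");

// A real JSON parser rejects it instead of executing anything:
try {
    var result = JSON.parse(payload);
} catch (e) {
    // SyntaxError -- the string isn't valid JSON, so nothing ran
}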
If your data is coming from a trusted source (and it sounds like it is), I wouldn't worry about it too much. That said, you might be interested in Crockford's discussion here (Crockford being the inventor of JSON and a generally knowledgeable JavaScript person). Crockford also provides at least three public-domain parsers on this page you might consider using:
His json2.js parser and stringifier, which when minified is only 2.5k in size, but which still uses eval (it just takes several precautions first);
his json_parse.js, which is a recursive-descent parser not using eval;
and his json_parse_state.js, a state-machine parser (again not using eval).
So you get to pick your poison. (Shout out to Camilo Martin for pointing out those last two alternatives.)

Increasingly, JSON parsing and encoding are available natively in modern browsers. This gives your application secure JSON functionality without needing to load an additional library.
You can test for native JSON support by doing something like this:
var native_JSON_exists = typeof window.JSON === 'object';
You should load a JSON parsing library like Douglas Crockford's (linked by T.J. Crowder, above), or use the functionality available via a framework, for browsers that don't have native support. (But you should at least use native JSON in browsers that support it, to protect users lucky enough to have modern browsers.)
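A rough sketch of that detect-and-fall-back approach (loadScript is a hypothetical helper for injecting a script tag, and the json2.js path is only an example location):

// Hand a working JSON object to the callback, loading json2.js only if needed.
function withJsonParser(callback) {
    if (window.JSON && typeof window.JSON.parse === 'function') {
        callback(window.JSON);                     // native support
    } else {
        // loadScript is a placeholder for however you inject scripts
        loadScript('/js/json2.js', function () {
            callback(window.JSON);                 // json2.js defines window.JSON
        });
    }
}

withJsonParser(function (JSON) {
    var result = JSON.parse(jsonStringFromServer); // e.g. your WebMethod return value
    // ... use result
});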
Bear in mind that JSON is a subset of JavaScript's syntax, so strings that work in a JavaScript eval statement may not work in proper JSON parsing. You can test your JSON strings for errors using JSLint (http://www.jslint.com/).

Does JSON.parse() use eval() internally?

Does JSON.parse in modern browsers use eval() internally to evaluate and execute the dynamic code?
I ask because I have been looking through Douglas Crockford's JSON library. It also uses eval() in parse(), but only after preprocessing steps prior to the actual evaluation, such as:
A guard against problematic Unicode characters in the code.
A check for code that shows malicious intent.
Do modern browsers that support JSON.parse natively perform these steps, or do they follow another approach?
No, JSON.parse() doesn't use eval()
This is by design: since eval() can execute any arbitrary JavaScript code you feed it, it could execute things you wouldn't want it to. So JSON.parse() does what it says on the tin: it actually parses the whole string and reconstructs an entire object tree.
JSON.parse is usually delegated to an internal function implemented in "native" code, where "native" means whatever is considered native in the context of your browser's JavaScript engine (it could be compiled machine code, bytecode for a VM, etc.). I don't think there's any strong requirement on that.
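A small illustration of that: because JSON.parse really walks the text and rebuilds the tree, it can hand every key/value pair to an optional reviver function as it goes, which is part of the standard API:

var text = '{"name": "Ada", "born": "1815-12-10"}';

// The reviver is called for every member of the reconstructed tree.
var person = JSON.parse(text, function (key, value) {
    if (key === 'born') {
        return new Date(value);  // convert the ISO string into a Date object
    }
    return value;                // leave everything else as parsed
});

console.log(person.born.getFullYear()); // 1815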
Differences in the Implementations?
JSON (the notation) itself is codified in RFC 4627.
Regarding the implementation of the JSON object and its methods, all modern browsers implementing it should behave the same, as they should follow the same specification for ECMAScript 5's JSON object. However, there's always the chance of defects. For instance, V8 originally contained this nasty bug.
Also, note that the implementations listed in the comments above are there to let you add JSON.parse() support to browsers that do not support it natively (also known as "these damn old browsers you sometimes need to support"). But that doesn't mean it's necessarily how the browsers implemented it.
For instance, for Google's V8 implementation used in Chrome, see json.js which invokes native code from json_parser.h.
It would be a very funny thing to do, if you think about it.
To understand why, see if this analogy helps: you're traveling with your boss to a country where you speak the language but she doesn't. Since you're fluent, you will serve two roles: as her assistant (doing tasks for her) as well as her translator (telling her what things mean).
So you have these two jobs, which are complementary. Your boss could tell you to do something--in any language you both understand (say, English)--as well as ask you to tell her what something says, like a sign or a document. She could even do both: hand you a set of instructions written in this other language and say, "This was given to me by someone I trust. Please do everything it says here."
In this analogy, reading signs or documents to your boss is like JSON.parse. Your boss handing you instructions and telling you to do everything they say is like eval.
If JavaScript engines used eval internally for JSON.parse, that would be analogous to your boss asking you what a document says, and you choosing to act out everything written in the document in order to explain it to her. Instead of just reading it.

Flash Twitter API with JSON

I have read a lot about parsing JSON with ActionScript. Originally the advice was to use this library, http://code.google.com/p/as3corelib/, but it seems Flash Player 11 has native support for it now.
My problem is that I cannot find examples or help that takes you from beginning to end of the process. Everything I have read seems to start in the middle. I have no real experience with JSON so this is a problem. I don't even know how to point ActionScript to the JSON file it needs to read.
I have a project with a tight deadline that requires me to read twitter through JSON. I need to get the three most recent tweets, along with the user who posted it, their twitter name and the time those tweets were posted.
The back end to this is already set up, I believe, by the development team here; therefore my JSON or XML just needs to be pointed to, and then I need to display the values in the interface text boxes I have already designed and created.
Any help will be greatly appreciated... I do know that there are a lot of threads on here; I just do not understand them, as they all assume some understanding to begin with.
You need to:
Load the data, whatever it is.
Parse the data from a particular format.
For this you would normally:
Use the URLLoader class to load any data. (Just go to the language reference and look at the example of how to use this class.)
Use whatever parser handles the particular format you need. http://help.adobe.com/en_US/FlashPlatform/beta/reference/actionscript/3/JSON.html is the reference for the JSON API, and it also shows usage examples. I'm not aware of this API being in the production version of the player yet, and there may still be quite a few FP 10.x players out there, so I'd have a fallback JSON parser; for that I would recommend this library, http://www.blooddy.by/en/crypto/, over as3corelib because it is faster. The built-in API is no different from what you would find in a browser, so if you look up JSON JavaScript entries, usage should in general be similar in Flash.
After you parse the JSON format, you will end up with a number of objects of the following types: Object, Array, Boolean, Number, String. It also has literals to mean null and undefined. Basically, you will be working with data structures native to Flash; you only need to take extra care because they will be dynamically constructed, meaning you may not make assumptions about the existence of parts of the data - you must always check their availability.
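As noted above, the built-in API mirrors the browser's JSON object, so the defensive pattern looks much the same whether you write it in JavaScript or AS3. A sketch in JavaScript (the statuses/user/text field names are only an example shape; check what your back end actually returns):

var data = JSON.parse(responseText);   // responseText: the string you loaded

// Check each level before using it -- nothing about the shape is guaranteed.
var tweets = (data && data.statuses) ? data.statuses : [];
for (var i = 0; i < tweets.length && i < 3; i++) {
    var tweet = tweets[i];
    var user  = tweet.user || {};
    console.log((user.screen_name || '?') + ' @ ' + (tweet.created_at || '?')
                + ': ' + (tweet.text || ''));
}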
wvxvw's answer is good, but I think it skips over a much-needed explanation of what JSON itself is. JSON is plain text, JavaScript Object Notation; when you read the text on screen it looks something like this:
http://www.json.org/example.html
There you can see JSON and XML side by side (both plain-text formats); essentially JSON is a bunch of name/value pairs.
When you use JSON.parse("your JSON string goes here") it will do the conversion to AS3 "dynamic objects", which are just plain objects (whose properties can be assigned without previously being defined, hence dynamic). To make a long story short, take the example you see in the link above, copy and paste the JSON as a string variable in AS3, and use
var str:String = '{"glossary": {"title": "example glossary","GlossDiv": {"title": "S","GlossList": {"GlossEntry": {"ID": "SGML","SortAs": "SGML","GlossTerm": "Standard Generalized Markup Language","Acronym": "SGML","Abbrev": "ISO 8879:1986","GlossDef": {"para": "A meta-markup language, used to create markup languages such as DocBook.","GlossSeeAlso": ["GML", "XML"]},"GlossSee": "markup"}}}}}';
var test:Object = JSON.parse(str);
the parse method on the string, store the result in a variable, and use the debugger to see what the resulting object is. As far as I know, there's really nothing else to JSON; it's simply a format for storing data. (You can't use E4X on it since it's not XML-based, and because of that it's slightly more concise than XML - no closing tags - but in my opinion slightly less readable... though it is valid JavaScript.) For a nice breakdown of the performance gains/losses between AMF, JSON and XML, check out this page: http://www.jamesward.com/census2/ Though many times you don't have a choice about the delivery format or protocol if you're not building the service, it's good to understand what their performance costs are.

lightweight javascript to javascript parser

How would I go about writing a lightweight JavaScript-to-JavaScript parser? Something simple that can convert some snippets of code.
I would like to basically make the internal scope objects in functions public.
So something like this
var outer = 42;
window.addEventListener('load', function() {
    var inner = 42;
    function magic() {
        var in_magic = inner + outer;
        console.log(in_magic);
    }
    magic();
}, false);
Would compile to
__Scope__.set('outer', 42);
__Scope__.set('console', console);
window.addEventListener('load', constructScopeWrapper(__Scope__, function(__Scope__) {
    __Scope__.set('inner', 42);
    __Scope__.set('magic', constructScopeWrapper(__Scope__, function _magic(__Scope__) {
        __Scope__.set('in_magic', __Scope__.get('inner') + __Scope__.get('outer'));
        __Scope__.get('console').log(__Scope__.get('in_magic'));
    }));
    __Scope__.get('magic')();
}), false);
Demonstration Example
Motivation behind this is to serialize the state of functions and closures and keep them synchronized across different machines (client, server, multiple servers). For this I would need a representation of [[Scope]].
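For reference, here is a minimal sketch of what the __Scope__ object and constructScopeWrapper used in the compiled output above might look like; it is purely illustrative (a real version would also need to serialize itself, forward function arguments, handle this, and so on):

function Scope(parent) {
    this.parent = parent || null;
    this.vars = {};
}
Scope.prototype.set = function (name, value) {
    this.vars[name] = value;
};
Scope.prototype.get = function (name) {
    if (name in this.vars) return this.vars[name];
    if (this.parent) return this.parent.get(name);
    throw new ReferenceError(name + ' is not defined');
};

// Each wrapped function runs against a fresh child scope chained to the
// scope that was current where the function literal appeared.
function constructScopeWrapper(definingScope, fn) {
    return function () {
        return fn(new Scope(definingScope));
    };
}

var __Scope__ = new Scope();  // the global scope of the rewritten program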
Questions:
Can I do this kind of compiler without writing a full JavaScript -> (slightly different) JavaScript compiler?
How would I go about writing such a compiler?
Can I re-use existing js -> js compilers?
I don't think your task is easy or short given that you want to access and restore all the program state. One of the issues is that you might have to capture the program state at any moment during a computation, right? That means the example as shown isn't quite right; that captures state sort of before execution of that code (except that you've precomputed the sum that initializes magic, and that won't happen before the code runs for the original JavaScript). I assume you might want to capture the state at any instant during execution.
The way you've stated your problem, what you want is a JavaScript parser in JavaScript.
I assume you are imagining that your existing JavaScript code J includes such a JavaScript parser and whatever else is necessary to generate your resulting code G, and that when J starts up it feeds copies of itself to G, manufacturing the serialization code S and somehow loading that up.
(I think G is pretty big and hoary if it can handle all of JavaScript.)
So your JavaScript image contains J, big G, and S, and does an expensive operation (feeding J to G) when it starts up.
What I think might serve you better is a tool G that processes your original JavaScript code J offline, and generates program state/closure serialization code S (to save and restore that state) that can be added to/replace J for execution. J+S are sent to the client, who never sees G or its execution. This decouples the generation of S from the runtime execution of J, saving on client execution time and space.
In this case, you want a tool that will make generating such code S as easy as possible. A pure JavaScript parser is a start but isn't likely to be enough; you'll need symbol-table support to know which function definition is connected to a function call F(...), and which variable definition in which scope corresponds to assignments or accesses of a variable V. You may need to actually modify your original code J to insert points of access where the program state can be captured. You may need flow analysis to find out where some values went. Insisting that all of this be in JavaScript narrows your range of solutions.
For these tasks, you will likely find a program transformation tool useful. Such tools contain parsers for the language of interest, build ASTs representing the program, enable the construction of identifier-to-definition maps ("symbol tables"), can carry out modifications to the ASTs (representing insertion of access points, or synthesis of ASTs representing your demonstration example), and can then regenerate valid JavaScript code containing the modified J and the additions S.
Of all the program transformation systems that I know about (which includes all the ones at the Wikipedia site), none are implemented in JavaScript.
Our DMS Software Reengineering Toolkit is such a program transformation system, offering all the features I just described. (Yes, it's big and hoary; it has to be to handle the complexities of real computer languages.) It has a JavaScript front end that contains a complete JavaScript parser to ASTs, and the machinery to regenerate JavaScript code from modified or synthesized ASTs. (Also big and hoary; good thing that hoary + hoary is still just hoary.) Should it be useful, DMS also provides support for building control- and dataflow analysis.
If you want something with a simple interface, you could try node-burrito: https://github.com/substack/node-burrito
It generates an AST using the uglify-js parser and then recursively walks the nodes. All you have to do is provide a single callback which tests each node. You can alter the ones you need to change, and it outputs the resulting code.
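If I remember the README correctly, usage is roughly along these lines (treat the exact API as an assumption and check the project page):

var burrito = require('burrito');

// Wrap every call expression in the source with a tracing function.
var src = burrito('f() && g(h())', function (node) {
    if (node.name === 'call') node.wrap('trace(%s)');
});

console.log(src);
// roughly: trace(f()) && trace(g(trace(h())))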
I'd try to look for an existing parser to modify. Perhaps you could adapt JSLint/JSHint?
There is a problem with the rewriting above: you're not hoisting the initialization of magic to the top of the scope.
There are a number of projects out there that parse JavaScript.
Crock's Pratt parser, which works well on JavaScript that fits within "The Good Parts" and less well on other JS.
The es-lab parser based on OMeta, which handles the full grammar including a lot of corner cases that Crock's parser misses. It may not perform as well as Crock's.
The Narcissus parser and evaluator. I don't have much experience with this.
There are also a number of high-quality lexers for JavaScript that let you manipulate JS at the token level. This can be tougher than it sounds, though, since JavaScript is not lexically regular, and predicting semicolon insertion is difficult without a full parse.
My es5-lexer is a carefully constructed and efficient lexer for EcmaScript 5 that provides the ability to tokenize JavaScript. It is heuristic where JavaScript's grammar is not lexically regular, but the heuristic is very good, and it provides a means to transform a token stream so that an interpreter is guaranteed to interpret it the way the lexer interpreted the tokens. So if you don't trust your input, you can still be sure that the interpretation underlying the security transformations is sound, even if not correct according to the spec for some bizarre inputs.
Your problem seems to be in the same family of problems as those solved by JS obfuscators and JS compressors -- they, like you, need to be able to parse and rewrite JS into an equivalent script.
There was a good discussion on obfuscators here, and a possible solution to your problem could be to leverage the parser and generator parts of one of the FOSS versions.
One callout: your example code does not take into account the scopes of the variables you want to set/get, and that will eventually become a problem you will have to solve.
Addition
Given the scope problem for closure-defined functions, you are probably unlikely to be able to solve this as a static parsing problem, because the scope variables outside the closure will have to be imported/exported to resolve, save, and re-instantiate scope. Hence you may need to dig into the evaluation engine itself -- perhaps take the V8 engine and hack the interpreter -- assuming that you do not need this to be generic across all script engines and can tie it down to a single implementation that you control.

Why not eval() JSON?

As far as I know, it is considered bad practice to eval() JSON objects in JavaScript for security reasons. I can understand this concern if the JSON comes from another server.
But if the JSON is provided by my own server and is created using PHP's json_encode (let us assume it is not buggy), is it legitimate to simply use eval() to read the JSON in JS, or are there any security problems I currently can't think of?
I really don't want to deal with dynamically loading a JSON parser and would be glad to simply use eval().
PS: I will obviously use the native JSON object if it is available, but want to fall back to eval() for IE/Opera.
In your scenario, the question becomes: where is PHP getting the JavaScript to execute from? Is that channel secure, and free from potential user manipulation? What if you don't control that channel directly?
There are a number of ways that your security may be compromised.
Man in the middle attacks could theoretically alter the contents of data being delivered to the client.
Your server traffic could be intercepted elsewhere and different content could be provided (not quite the same as a MIM attack)
Your server could be compromised and the data source could be tampered with.
and these are just the simple examples. XSS is nasty.
"an ounce of prevention is worth a pound of cure"
Besides the obvious security issues:
Native JSON is faster
You don't need to "load" a JSON parser; it's just another function call to the JavaScript engine
Tip:
In ASP.NET, using JSON is considered bad because parsing of DateTime differs between the server and the client, so we use a special function to deserialize the date in JavaScript. I'm not sure if PHP has the same issue, but it's worth mentioning.
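For context: the ASP.NET serializers emit dates as strings of the form "/Date(milliseconds)/" rather than a standard JSON type, so the special function mentioned above usually boils down to something like this sketch:

// Convert an ASP.NET-style "/Date(1198908717056)/" value into a JS Date.
function parseMsJsonDate(value) {
    var match = /\/Date\((-?\d+)\)\//.exec(value);
    return match ? new Date(parseInt(match[1], 10)) : null;
}

parseMsJsonDate("/Date(1198908717056)/"); // a Date in late December 2007 (UTC)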
Check out this: http://blog.mozilla.com/webdev/2009/02/12/native-json-in-firefox-31/
So at least for Firefox you can use the built-in JSON parser.
Seriously? Some of the guys here are paranoid. If you're delivering the JSON and you know it's safe, it's OK to fall back(*) to eval() instead of a JS lib for IE. After all, IE users have much more to worry about.
And the man-in-the-middle argument is bullsh*t.
(*) the words fallback and safe are in bold because some people here didn't see them.
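A minimal sketch of that fallback (again, only defensible when the JSON genuinely comes from your own server over a channel you trust):

function parseTrustedJson(text) {
    if (window.JSON && typeof window.JSON.parse === 'function') {
        return JSON.parse(text);       // native parser in modern browsers
    }
    return eval('(' + text + ')');     // legacy IE/Opera fallback, trusted input only
}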

Why is JSON important?

I've only recently heard about JSON (Javascript Object Notation).
Can anybody explain why it is considered (by some websites/blogs/etc) to be important?
We already have XML, why is JSON better (apart from being 'native to Javascript')?
Edit: Hmm, the main answer theme seems to be 'it is smaller'. However, the fact that it allows data fetching across domains, seems important to me. Or is this in practice not (yet) much used?
XML has several drawbacks:
It's heavy!
It provides a hierarchical representation of content which is not exactly the same as (but pretty much similar to) the JavaScript object model.
JavaScript is available everywhere. Without any external parsers, you can process JSON directly with the JS interpreter.
Clearly it's not meant to replace XML completely. For JS-based web apps, its advantages can be useful.
JSON is generally much smaller than its XML equivalent. Smaller transfer means faster transfer, which results in a better user experience.
JSON is much more concise. XML:
<person>
    <name>John Doe</name>
    <tags>
        <tag>friend</tag>
        <tag>male</tag>
    </tags>
</person>
JSON:
{"name": "John Doe", "tags": ["friend", "male"]}
There's fewer overlapping features, too. For example, in XML there's tension between choosing to use elements (as above), versus attributes (<person name="John Doe">).
JSON came into popular use primarily because it offers a way to circumvent the same-origin policy used in web browsers and thereby allow mashups.
Let's say you're writing a web service on domain A. You can't load XML data from domain B and parse it, because the only way to do that would be XMLHttpRequest, and XMLHttpRequest was originally limited by the same-origin policy to talking only to URLs at the same domain as the containing page.
It turns out that for a variety of reasons, you are allowed to request <script> tags across origins. Clever people realized this was a good way to work around the limitation with XMLHttpRequest. Instead of the server returning XML, it can return a series of JavaScript object and array literals.
(bonus question left as an exercise to the reader: why is <script src="..."> allowed across domains without server opt-in but XHR isn't?)
Of course, returning a <script> which consists of nothing more than object literals is not useful, because without assigning the values to some variable, you can't do anything with them. Thus, most services use a variant of JSON, called JSONP (http://bob.pythonmac.org/archives/2005/12/05/remote-json-jsonp/).
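Mechanically, JSONP is just a script tag whose response calls a function you named; a minimal sketch (the URL and the callback query parameter are hypothetical):

// 1. Define the global function the response will invoke.
window.handleData = function (data) {
    console.log(data);  // the cross-domain payload arrives as a plain JS value
};

// 2. Inject the script tag; the server responds with: handleData({...});
var script = document.createElement('script');
script.src = 'https://api.example.com/items?callback=handleData';
document.getElementsByTagName('head')[0].appendChild(script);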
With the rise in popularity of mashups, people realized that JSON was a convenient data interchange format in general, especially when JavaScript is one end of the channel. For example, JSON is used extensively in Chromium, even in cases where C++ is on both sides. It's just a nice lightweight way to represent simple data, that good parsers exist for in many languages.
Amusingly, using <script> tags to do mashups is incredibly insecure because it is essentially XSS'ing yourself on purpose. So native JSON (http://ejohn.org/blog/native-json-support-is-required/) had to be introduced, which obviates the original benefits of the format. But by that time, it was already super popular :)
If you are working in JavaScript, it is much easier to use JSON. This is because JSON can be directly evaluated into a JavaScript object, which is much easier to work with than the DOM.
Borrowing and slightly altering the XML and JSON from above
XML:
<person>
    <name>John Doe</name>
    <tag>friend</tag>
    <tag>male</tag>
</person>
JSON:
{ "person": {"name": "John Doe", "tag": ["friend", "male"]} }
If you wanted to get the second tag object with XML, you'd need to use the powerful but verbose DOM APIs:
var tag2=xmlObj.getElementsByTagName("person")[0].getElementsByTagName("tag")[1];
Whereas with a Javascript object that came in via JSON, you could simply use:
var tag2=jsonObj.person.tag[1];
Of course, jQuery makes the DOM example much simpler:
var tag2=$("person tag",xmlObj).get(1);
However, JSON just "fits" in a JavaScript world. If you work with it for a while, you will find that you have much less mental overhead than when dealing with XML-based data.
All the above examples ignore the possibility that a node is missing or duplicated, or that it has just one child or none. However, to illustrate the native-ness of JSON, to handle this with the jsonObj you'd just have to:
var tag2=(jsonObj.person && jsonObj.person.tag && jsonObj.person.tag.sort && jsonObj.person.tag.length==2 ? jsonObj.person.tag[1] : null);
(Some people might not like that long a ternary, but it works.) XML would be (in my opinion) nastier; I don't think you'd want to go the ternary approach there, because you'd keep calling the DOM methods, which may have to do the work over again depending on the implementation:
var tag2=null;
var persons=xmlObj.getElementsByTagName("person");
if(persons.length==1) {
    var tags=persons[0].getElementsByTagName("tag");
    if(tags.length==2) { tag2=tags[1]; }
}
jQuery (untested):
var tag2=$("person:only-child tag:nth-child(1)",xmlObj).get(0);
These web pages may help:
JSON - The Fat Free alternative to xml
Why JSON is Important to You!
It depends on what you are going to do. There are a lot of answers here that prefer JSON over XML. If you take a deeper look there isn't a big difference.
If you have a tree of objects, you get only a tree of plain JavaScript objects back. That's where the tension with OOP-style access turns back on you. Assume you have objects of types A, B, and C constructed in a tree. You can easily have them serialized to JSON. If you read them back in, you only get a tree of plain JavaScript objects. To reconstruct your A, B, C you have to stuff the values manually into manually created objects, or do some hacks. Sounds like parsing XML and creating objects? Well, yes :)
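In other words, after parsing you end up rehydrating the plain objects back into your own types by hand, roughly like this (A and B are the example types from above, and the field names are made up):

function A(data) {
    this.name = data.name;
    // rebuild children as real B instances instead of plain objects
    this.children = (data.children || []).map(function (c) { return new B(c); });
}
function B(data) {
    this.value = data.value;
}

var tree = JSON.parse(jsonString);  // jsonString: the serialized tree
var root = new A(tree);             // manual reconstruction of the typed tree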
These days only the newest browsers come with native support for JSON. To support more browsers you have two options: a) you load a JSON parser written in JavaScript that does the parsing for you (so how fat does that sound with regard to fat-freeness?); b) the other option I often see is eval: you can just call eval() on a JSON string to get the objects back. But that introduces a whole new set of security problems. JSON is specified so it can't contain functions, but if you are not checking the objects for functions, someone can easily send you code that gets executed.
So it might depend on what you like more: JSON or XML. The biggest difference is probably the way of accessing things, be it script tags or XMLHttpRequest. I would decide what to use based on that. In my opinion, if there were proper support for XPath in browsers, I would often decide to use XML. But the fashion points towards JSON and loading additional JSON parsers in JavaScript.
If you can't decide and you know you need something really powerful, you might have to take a look at YAML. Reading about YAML is very interesting to get more insight into the topic. But it really depends on what you are trying to do.
JSON is a way to serialize data in Javascript objects. The syntax is taken from the language, so it should be familiar to the developer dealing with Javascript, and -- being the stringification of an object -- it's a more-natural serialization method for interaction within the browser than a full-fledged XML derivative (with all the arbitrary design decisions that implies).
It's light and intuitive.
JSON's a text-based object serialization format that's more lightweight than XML and that directly integrates with JavaScript's object model. That's most of its advantages right there.
Its disadvantages (compared to XML) are, roughly: fewer available tools (forget about standard validation and/or transformation, to say nothing of syntax highlighting or well-formedness checking in most editors), less likely to be human-readable (there are huge variations in the readability of both JSON and XML, so that's necessarily a fuzzy statement), and tight integration with JavaScript makes for not-so-tight integration with other environments.
It's not that it is better, but that it can tie many things together to allow seamless data transfer without manual parsing!
For example: JavaScript -> C# web service -> JavaScript.
