What is the minimum valid JSON? - javascript

I've carefully read the JSON description at http://json.org/ but I'm not sure I know the answer to this simple question: which strings are the minimum possible valid JSON?

"string" - is the bare string valid JSON?
42 - is the simple number valid JSON?
true - is the boolean value valid JSON?
{} - is the empty object valid JSON?
[] - is the empty array valid JSON?

At the time of writing, JSON was solely described in RFC4627. It describes (at the start of "2") a JSON text as being a serialized object or array.
This means that only {} and [] are valid, complete JSON strings in parsers and stringifiers which adhere to that standard.
However, the introduction of ECMA-404 changes that, and the updated advice can be read here. I've also written a blog post on the issue.
To confuse the matter further, however, the JSON object (which provides JSON.parse() and JSON.stringify()) available in web browsers is standardised in ES5, and that clearly defines the acceptable JSON texts like so:
The JSON interchange format used in this specification is exactly that described by RFC 4627 with two exceptions:
The top level JSONText production of the ECMAScript JSON grammar may consist of any JSONValue rather than being restricted to being a JSONObject or a JSONArray as specified by RFC 4627.
[snipped]
This would mean that all JSON values (including strings, nulls and numbers) are accepted by the JSON object, even though the JSON object technically adheres to RFC 4627.
Note that you could therefore stringify a number in a conformant browser via JSON.stringify(5), which would be rejected by another parser that adheres to RFC 4627 but does not implement the specific exception listed above. Ruby, for example, appears to be one such parser, accepting only objects and arrays at the root. PHP, on the other hand, specifically adds the exception that "it will also encode and decode scalar types and NULL".
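For illustration, here is roughly what this looks like in an environment that implements the ES5 JSON object (a browser console or Node.js); a strict RFC 4627-only parser may reject the same inputs because the top level is neither an object nor an array:
JSON.stringify(5);      // '5'  (a bare number is a legal JSON text here)
JSON.parse('5');        // 5
JSON.parse('"hello"');  // "hello"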

There are at least four documents which can be considered JSON standards on the Internet. The RFCs referenced all describe the mime type application/json. Here is what each has to say about the top-level values, and whether anything other than an object or array is allowed at the top:
RFC-4627: No.
A JSON text is a sequence of tokens. The set of tokens includes six
structural characters, strings, numbers, and three literal names.
A JSON text is a serialized object or array.
JSON-text = object / array
Note that RFC-4627 was marked "informational" as opposed to "proposed standard", and that it is obsoleted by RFC-7159, which in turn is obsoleted by RFC-8259.
RFC-8259: Yes.
A JSON text is a sequence of tokens. The set of tokens includes six
structural characters, strings, numbers, and three literal names.
A JSON text is a serialized value. Note that certain previous
specifications of JSON constrained a JSON text to be an object or an
array. Implementations that generate only objects or arrays where a
JSON text is called for will be interoperable in the sense that all
implementations will accept these as conforming JSON texts.
JSON-text = ws value ws
RFC-8259 is dated December 2017 and is marked "INTERNET STANDARD".
ECMA-262: Yes.
The JSON Syntactic Grammar defines a valid JSON text in terms of tokens defined by the JSON lexical
grammar. The goal symbol of the grammar is JSONText.
Syntax
JSONText :
    JSONValue
JSONValue :
    JSONNullLiteral
    JSONBooleanLiteral
    JSONObject
    JSONArray
    JSONString
    JSONNumber
ECMA-404: Yes.
A JSON text is a sequence of tokens formed from Unicode code points that conforms to the JSON value
grammar. The set of tokens includes six structural tokens, strings, numbers, and three literal name tokens.
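As a quick illustration of the RFC 8259 grammar (JSON-text = ws value ws), the JSON object in a modern JavaScript engine accepts a lone value, optionally surrounded by whitespace:
JSON.parse('  42  ');  // 42
JSON.parse('null');    // null
JSON.parse('"text"');  // "text"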

According to the old definition in RFC 4627 (which was obsoleted in March 2014 by RFC 7159), those were all valid "JSON values", but only the last two would constitute a complete "JSON text":
A JSON text is a serialized object or array.
Depending on the parser used, the lone "JSON values" might be accepted anyway. For example (sticking to the "JSON value" vs "JSON text" terminology):
the JSON.parse() function now standardised in modern browsers accepts any "JSON value"
the PHP function json_decode was introduced in version 5.2.0 only accepting a whole "JSON text", but was amended to accept any "JSON value" in version 5.2.1
Python's json.loads accepts any "JSON value" according to examples on this manual page
the validator at http://jsonlint.com expects a full "JSON text"
the Ruby JSON module will only accept a full "JSON text" (at least according to the comments on this manual page)
The distinction is a bit like the distinction between an "XML document" and an "XML fragment", although technically <foo /> is a well-formed XML document (it would be better written as <?xml version="1.0" ?><foo />, but as pointed out in comments, the <?xml declaration is technically optional).

JSON stands for JavaScript Object Notation. Only {} and [] define a JavaScript object; the other examples are value literals. There are object types in JavaScript for working with those values, but the expression "string" is a source code representation of a literal value and not an object.
Keep in mind that JSON is not JavaScript. It is a notation that represents data with a very simple and limited structure. JSON data is structured using the characters { } [ ] : , and you can only use literal values inside that structure.
It is perfectly valid for a server to respond with either an object description or a literal value. All JSON parsers should be able to handle just a literal value, but only one value: JSON can only represent a single value at a time, so for a server to return more than one value it would have to structure the response as an object or an array.

The ecma specification might be useful for reference:
http://www.ecma-international.org/ecma-262/5.1/
The parse function parses a JSON text (a JSON-formatted String) and produces an ECMAScript value. The
JSON format is a restricted form of ECMAScript literal. JSON objects are realized as ECMAScript objects.
JSON arrays are realized as ECMAScript arrays. JSON strings, numbers, booleans, and null are realized as
ECMAScript Strings, Numbers, Booleans, and null. JSON uses a more limited set of white space characters
than WhiteSpace and allows Unicode code points U+2028 and U+2029 to directly appear in JSONString literals
without using an escape sequence. The process of parsing is similar to 11.1.4 and 11.1.5 as constrained by
the JSON grammar.
JSON.parse("string"); // SyntaxError: Unexpected token s
JSON.parse(43); // 43
JSON.parse("43"); // 43
JSON.parse(true); // true
JSON.parse("true"); // true
JSON.parse(false); // false
JSON.parse("false"); // false
JSON.parse("trueee"); // SyntaxError: Unexpected token e
JSON.parse("{}"); // {}
JSON.parse("[]"); // []

Yes, yes, yes, yes, and yes. All of them are valid JSON value literals.
However, the official RFC 4627 states:
A JSON text is a serialized object or array.
So a whole "file" should consist of an object or array as the outermost structure, which of course can be empty. Yet, many JSON parsers accept primitive values as well for input.

Just follow the railroad diagrams given on the json.org page. [] and {} are the minimum possible valid JSON objects. So the answer is [] and {}.

var x = {};
console.log(JSON.stringify(x)); // will output "{}"
So your answer is "{}", which denotes an empty object. (Note that JSON.stringify(undefined) returns undefined, not "{}".)


TextEncoder / TextDecoder not round tripping

I'm definitely missing something about the TextEncoder and TextDecoder behavior. It seems to me like the following code should round-trip, but it doesn't seem to:
new TextDecoder().decode(new TextEncoder().encode(String.fromCharCode(55296))).charCodeAt(0);
Since I'm just encoding and decoding the string, the char code seems like it should be the same, but this returns 65533 instead of 55296. What am I missing?
Based on some spelunking, the TextEncoder.encode() method appears to take an argument of type USVString, where USV stands for Unicode Scalar Value. According to this page, a USV cannot be a high-surrogate or low-surrogate code point.
Also, according to MDN:
A USVString is a sequence of Unicode scalar values. This definition
differs from that of DOMString or the JavaScript String type in that
it always represents a valid sequence suitable for text processing,
while the latter can contain surrogate code points.
So, my guess is your String argument to encode() is getting converted to a USVString (either implicitly or within encode()). Based on this page, it looks like converting a String to a USVString first treats it as a DOMString and then follows this procedure, which includes replacing all unpaired surrogates with U+FFFD, the "Replacement Character", which is the code point you see: 65533.
The reason String.fromCharCode(55296).charCodeAt(0) works I believe is because it doesn't need to do this String -> USVString conversion.
As to why TextEncoder.encode() was designed this way, I don't understand the unicode details well enough to attempt to explain, but I suspect it's to simplify implementation since the only output encoding it supports seems to be UTF-8, in an Uint8Array. I'm guessing requiring a USVString argument without surrogates (instead of a native UTF-16 String possibly with surrogates) simplifies the encoding to UTF-8, or maybe makes some encoding/decoding use cases simpler?
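A small sketch of that replacement, assuming an environment where TextEncoder is available (browser or Node.js): the lone surrogate is swapped for U+FFFD during the String-to-USVString conversion, so the encoder emits U+FFFD's UTF-8 bytes, while a well-formed pair passes through intact.
new TextEncoder().encode('\uD800');        // Uint8Array [239, 191, 189] (UTF-8 for U+FFFD)
new TextEncoder().encode('\uD800\uDF03');  // Uint8Array [240, 144, 140, 131] (UTF-8 for U+10303)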
For those (like me) who aren't sure what "unicode surrogates" are:
The problem
The character code 55296 is not a valid character by itself. So this part of the code is already a problem:
String.fromCharCode(55296)
Since there is no valid character at that code on its own (it is an unpaired surrogate), the encoder cannot represent it and substitutes the replacement character "�" instead, which happens to have the code 65533.
Codes like 55296 are only valid as the first element of a pair of codes. Pairs of codes are used to represent the characters that didn't fit in Unicode's Basic Multilingual Plane. (There are a lot of characters outside the Basic Multilingual Plane, so they need two 16-bit numbers to encode them.)
For example, here is a valid use of the code 55296:
console.log(String.fromCharCode(55296, 57091));
It returns the character "𐌃", from the ancient Etruscan alphabet.
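As a sanity check, here is a sketch of the standard UTF-16 surrogate-pair formula that maps those two codes to a single code point (the variable names are just for illustration):
const hi = 55296;  // 0xD800, high (leading) surrogate
const lo = 57091;  // 0xDF03, low (trailing) surrogate
const codePoint = 0x10000 + (hi - 0xD800) * 0x400 + (lo - 0xDC00);
console.log(codePoint.toString(16));          // "10303"
console.log(String.fromCodePoint(codePoint)); // "𐌃"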
The solution
This code will round-trip correctly:
const code = new TextEncoder().encode(String.fromCharCode(55296, 57091));
console.log(new TextDecoder().decode(code).charCodeAt(0)); // Returns 55296
But beware: .charCodeAt only returns the first part of the pair. A safer option might be to use String.codePointAt to convert the character into a single 32-bit code:
const code = new TextEncoder().encode(String.fromCharCode(55296, 57091));
console.log(new TextDecoder().decode(code).codePointAt(0)); // Returns 66307

How do i parse the following JSON object to retrieve relevant fields in Python [duplicate]

My Python program receives JSON data, and I need to get bits of information out of it. How can I parse the data and use the result? I think I need to use json.loads for this task, but I can't understand how to do it.
For example, suppose that I have jsonStr = '{"one" : "1", "two" : "2", "three" : "3"}'. Given this JSON, and an input of "two", how can I get the corresponding data, "2"?
Beware that .load is for files; .loads is for strings. See also: Reading JSON from a file.
Occasionally, a JSON document is intended to represent tabular data. If you have something like this and are trying to use it with Pandas, see Python - How to convert JSON File to Dataframe.
Some data superficially looks like JSON, but is not JSON.
For example, sometimes the data comes from applying repr to native Python data structures. The result may use quotes differently, use title-cased True and False rather than JSON-mandated true and false, etc. For such data, see Convert a String representation of a Dictionary to a dictionary or How to convert string representation of list to a list.
Another common variant format puts separate valid JSON-formatted data on each line of the input. (Proper JSON cannot be parsed line by line, because it uses balanced brackets that can be many lines apart.) This format is called JSONL. See Loading JSONL file as JSON objects.
Sometimes JSON data from a web source is padded with some extra text. In some contexts, this works around security restrictions in browsers. This is called JSONP and is described at What is JSONP, and why was it created?. In other contexts, the extra text implements a security measure, as described at Why does Google prepend while(1); to their JSON responses?. Either way, handling this in Python is straightforward: simply identify and remove the extra text, and proceed as before.
Very simple:
import json
data = json.loads('{"one" : "1", "two" : "2", "three" : "3"}')
print(data['two']) # or `print data['two']` in Python 2
Sometimes your JSON is not a string. For example, if you are fetching JSON from a URL like this (urllib2 is Python 2; use urllib.request in Python 3):
import urllib2

j = urllib2.urlopen('http://site.com/data.json')
you will need to use json.load, not json.loads:
j_obj = json.load(j)
(it is easy to forget: the 's' is for 'string')
For URL or file, use json.load(). For string with .json content, use json.loads().
#! /usr/bin/python
import json
# from pprint import pprint

json_file = 'my_cube.json'
cube = '1'

with open(json_file) as json_data:
    data = json.load(json_data)
    # pprint(data)
    print "Dimension: ", data['cubes'][cube]['dim']
    print "Measures: ", data['cubes'][cube]['meas']
The following is a simple example that may help you:
json_string = """
{
    "pk": 1,
    "fa": "cc.ee",
    "fb": {
        "fc": "",
        "fd_id": "12345"
    }
}"""

import json

data = json.loads(json_string)
if data["fa"] == "cc.ee":
    data["fb"]["new_key"] = "cc.ee was present!"

print json.dumps(data)
The output for the above code will be:
{"pk": 1, "fb": {"new_key": "cc.ee was present!", "fd_id": "12345",
"fc": ""}, "fa": "cc.ee"}
Note that you can set the indent argument of dumps to pretty-print the output (for example, print json.dumps(data, indent=4)):
{
"pk": 1,
"fb": {
"new_key": "cc.ee was present!",
"fd_id": "12345",
"fc": ""
},
"fa": "cc.ee"
}
Parsing the data
Using the standard library json module
For string data, use json.loads:
import json
text = '{"one" : "1", "two" : "2", "three" : "3"}'
parsed = json.loads(text)
For data that comes from a file, or other file-like object, use json.load:
import io, json
# create an in-memory file-like object for demonstration purposes.
text = '{"one" : "1", "two" : "2", "three" : "3"}'
stream = io.StringIO(text)
parsed = json.load(stream) # load, not loads
It's easy to remember the distinction: the trailing s of loads stands for "string". (This is, admittedly, probably not in keeping with standard modern naming practice.)
Note that json.load does not accept a file path:
>>> json.load('example.txt')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib/python3.8/json/__init__.py", line 293, in load
return loads(fp.read(),
AttributeError: 'str' object has no attribute 'read'
Both of these functions provide the same set of additional options for customizing the parsing process. Since 3.6, the options are keyword-only.
For string data, it is also possible to use the JSONDecoder class provided by the library, like so:
import json
text = '{"one" : "1", "two" : "2", "three" : "3"}'
decoder = json.JSONDecoder()
parsed = decoder.decode(text)
The same keyword parameters are available, but now they are passed to the constructor of the JSONDecoder, not the .decode method. The main advantage of the class is that it also provides a .raw_decode method, which will ignore extra data after the end of the JSON:
import json
text_with_junk = '{"one" : "1", "two" : "2", "three" : "3"} ignore this'
decoder = json.JSONDecoder()
# `amount` will count how many characters were parsed.
parsed, amount = decoder.raw_decode(text_with_junk)
Using requests or other implicit support
When data is retrieved from the Internet using the popular third-party requests library, it is not necessary to extract .text (or create any kind of file-like object) from the Response object and parse it separately. Instead, the Response object directly provides a .json method which will do this parsing:
import requests
response = requests.get('https://www.example.com')
parsed = response.json()
This method accepts the same keyword parameters as the standard library json functionality.
Using the results
Parsing by any of the above methods will result, by default, in a perfectly ordinary Python data structure, composed of the perfectly ordinary built-in types dict, list, str, int, float, bool (JSON true and false become Python constants True and False) and NoneType (JSON null becomes the Python constant None).
Working with this result, therefore, works the same way as if the same data had been obtained using any other technique.
Thus, to continue the example from the question:
>>> parsed
{'one': '1', 'two': '2', 'three': '3'}
>>> parsed['two']
'2'
I emphasize this because many people seem to expect that there is something special about the result; there is not. It's just a nested data structure, though dealing with nesting is sometimes difficult to understand.
Consider, for example, a parsed result like result = {'a': [{'b': 'c'}, {'d': 'e'}]}. To get 'e' requires following the appropriate steps one at a time: looking up the a key in the dict gives a list [{'b': 'c'}, {'d': 'e'}]; the second element of that list (index 1) is {'d': 'e'}; and looking up the 'd' key in there gives the 'e' value. Thus, the corresponding code is result['a'][1]['d']: each indexing step is applied in order.
See also How can I extract a single value from a nested data structure (such as from parsing JSON)?.
Sometimes people want to apply more complex selection criteria, iterate over nested lists, filter or transform the data, etc. These are more complex topics that will be dealt with elsewhere.
Common sources of confusion
JSON lookalikes
Before attempting to parse JSON data, it is important to ensure that the data actually is JSON. Check the JSON format specification to verify what is expected. Key points:
The document represents one value (normally a JSON "object", which corresponds to a Python dict, but every other type represented by JSON is permissible). In particular, it does not have a separate entry on each line - that's JSONL.
The data is human-readable after using a standard text encoding (normally UTF-8). Almost all of the text is contained within double quotes, and uses escape sequences where appropriate.
Dealing with embedded data
Consider an example file that contains:
{"one": "{\"two\": \"three\", \"backslash\": \"\\\\\"}"}
The backslashes here are for JSON's escape mechanism.
When parsed with one of the above approaches, we get a result like:
>>> example = input()
{"one": "{\"two\": \"three\", \"backslash\": \"\\\\\"}"}
>>> parsed = json.loads(example)
>>> parsed
{'one': '{"two": "three", "backslash": "\\\\"}'}
Notice that parsed['one'] is a str, not a dict. As it happens, though, that string itself represents "embedded" JSON data.
To replace the embedded data with its parsed result, simply access the data, use the same parsing technique, and proceed from there (e.g. by updating the original result in place):
>>> parsed['one'] = json.loads(parsed['one'])
>>> parsed
{'one': {'two': 'three', 'backslash': '\\'}}
Note that the '\\' part here is the representation of a string containing one actual backslash, not two. This is following the usual Python rules for string escapes, which brings us to...
JSON escaping vs. Python string literal escaping
Sometimes people get confused when trying to test code that involves parsing JSON, and supply input as an incorrect string literal in the Python source code. This especially happens when trying to test code that needs to work with embedded JSON.
The issue is that the JSON format and the string literal format each have separate policies for escaping data. Python will process escapes in the string literal in order to create the string, which then still needs to contain escape sequences used by the JSON format.
In the above example, I used input at the interpreter prompt to show the example data, in order to avoid confusion with escaping. Here is one analogous example using a string literal in the source:
>>> json.loads('{"one": "{\\"two\\": \\"three\\", \\"backslash\\": \\"\\\\\\\\\\"}"}')
{'one': '{"two": "three", "backslash": "\\\\"}'}
To use a double-quoted string literal instead, double-quotes in the string literal also need to be escaped. Thus:
>>> json.loads('{\"one\": \"{\\\"two\\\": \\\"three\\\", \\\"backslash\\\": \\\"\\\\\\\\\\\"}\"}')
{'one': '{"two": "three", "backslash": "\\\\"}'}
Each sequence of \\\" in the input becomes \" in the actual JSON data, which becomes " (embedded within a string) when parsed by the JSON parser. Similarly, \\\\\\\\\\\" (five pairs of backslashes, then an escaped quote) becomes \\\\\" (five backslashes and a quote; equivalently, two pairs of backslashes, then an escaped quote) in the actual JSON data, which becomes \\" (two backslashes and a quote) when parsed by the JSON parser, which becomes \\\\" (two escaped backslashes and a quote) in the string representation of the parsed result (since now, the quote does not need escaping, as Python can use single quotes for the string; but the backslashes still do).
Simple customization
Aside from the strict option, the keyword options available for json.load and json.loads should be callbacks. The parser will call them, passing in portions of the data, and use whatever is returned to create the overall result.
The "parse" hooks are fairly self-explanatory. For example, we can specify to convert floating-point values to decimal.Decimal instances instead of using the native Python float:
>>> import decimal
>>> json.loads('123.4', parse_float=decimal.Decimal)
Decimal('123.4')
or use floats for every value, even if they could be converted to integer instead:
>>> json.loads('123', parse_int=float)
123.0
or refuse to convert JSON's representations of special floating-point values:
>>> def reject_special_floats(value):
...     raise ValueError
...
>>> json.loads('Infinity')
inf
>>> json.loads('Infinity', parse_constant=reject_special_floats)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib/python3.8/json/__init__.py", line 370, in loads
return cls(**kw).decode(s)
File "/usr/lib/python3.8/json/decoder.py", line 337, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/usr/lib/python3.8/json/decoder.py", line 353, in raw_decode
obj, end = self.scan_once(s, idx)
File "<stdin>", line 2, in reject_special_floats
ValueError
Customization example using object_hook and object_pairs_hook
object_hook and object_pairs_hook can be used to control what the parser does when given a JSON object, rather than creating a Python dict.
A supplied object_pairs_hook will be called with one argument, which is a list of the key-value pairs that would otherwise be used for the dict. It should return the desired dict or other result:
>>> def process_object_pairs(items):
...     return {k: f'processed {v}' for k, v in items}
...
>>> json.loads('{"one": 1, "two": 2}', object_pairs_hook=process_object_pairs)
{'one': 'processed 1', 'two': 'processed 2'}
A supplied object_hook will instead be called with the dict that would otherwise be created, and the result will substitute:
>>> def make_items_list(obj):
...     return list(obj.items())
...
>>> json.loads('{"one": 1, "two": 2}', object_hook=make_items_list)
[('one', 1), ('two', 2)]
If both are supplied, the object_hook will be ignored and only the object_pairs_hook will be used.
Text encoding issues and bytes/unicode confusion
JSON is fundamentally a text format. Input data should be converted from raw bytes to text first, using an appropriate encoding, before the file is parsed.
In 3.x, loading from a bytes object is supported, and will implicitly use UTF-8 encoding:
>>> json.loads('"text"')
'text'
>>> json.loads(b'"text"')
'text'
>>> json.loads('"\xff"') # Unicode code point 255
'ÿ'
>>> json.loads(b'"\xff"') # Not valid UTF-8 encoded data!
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib/python3.8/json/__init__.py", line 343, in loads
s = s.decode(detect_encoding(s), 'surrogatepass')
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 1: invalid start byte
UTF-8 is generally considered the default for JSON. While the original specification, ECMA-404 does not mandate an encoding (it only describes "JSON text", rather than JSON files or documents), RFC 8259 demands:
JSON text exchanged between systems that are not part of a closed ecosystem MUST be encoded using UTF-8 [RFC3629].
In such a "closed ecosystem" (i.e. for local documents that are encoded differently and will not be shared publicly), explicitly apply the appropriate encoding first:
>>> json.loads(b'"\xff"'.decode('iso-8859-1'))
'ÿ'
Similarly, JSON files should be opened in text mode, not binary mode. If the file uses a different encoding, simply specify that when opening it:
with open('example.json', encoding='iso-8859-1') as f:
    print(json.load(f))
In 2.x, strings and byte-sequences were not properly distinguished, which resulted in a lot of problems and confusion particularly when working with JSON.
Actively maintained 2.x codebases (please note that 2.x itself has not been maintained since Jan 1, 2020) should consistently use unicode values to represent text and str values to represent raw data (str is an alias for bytes in 2.x), and accept that the repr of unicode values will have a u prefix (after all, the code should be concerned with what the value actually is, not what it looks like at the REPL).
Historical note: simplejson
simplejson is simply the standard library json module, but maintained and developed externally. It was originally created before JSON support was added to the Python standard library. In 2.6, the simplejson project was incorporated into the standard library as json. Current development maintains compatibility back to 2.5, although there is also an unmaintained, legacy branch that should support as far back as 2.2.
The standard library generally uses quite old versions of the package; for example, my 3.8.10 installation reports
>>> json.__version__
'2.0.9'
whereas the most recent release (as of this writing) is 3.18.1. (The tagged releases in the GitHub repository only go as far back as 3.8.2; the 2.0.9 release dates to 2009.)
I have as yet been unable to find comprehensive documentation of which simplejson versions correspond to which Python releases.

Trying to parse JSON string, unexpected number

I am trying to parse this JSON:
var json = '{"material":"Gummislang 3\/4\" 30 m (utanp\u00e5liggande sk\u00e5p)"}'
I run JSON.parse(json) but I get the error SyntaxError: Unexpected number when doing so. I have tried this in Google Chrome. I don't know what the problem is, since I can take the JSON string and put it in any JSON validator and it claims that the JSON is valid. Shouldn't the browser be able to parse it?
You are inserting a JSON object representation into a JavaScript string without properly escaping the representation.
To avoid having to do this, remove the quotes you are adding around the representation, and skip the JSON.parse(json) – the default output from PHP's json_encode() is valid JavaScript when used in this context.
For security, you should specify the JSON_HEX_TAG option if possible. This will prevent cross-site scripting in cases where the JSON might end up inside a document parsed as XML. (And for XML documents, the JSON should be inside a CDATA section as well.)
You're validating the string literal, which is a valid JSON string containing invalid JSON. You need to validate the value of the string, which is not valid JSON.
If you paste the string value into a JSON validator, you'll see that the error comes from this part:
"material": "Gummislang 3/4"30m
The " needs to be escaped.

Using javascript objects with UTF-16 property names

I'm calling a service that returns UTF-16 json data.
My question is if the JSON object has UTF-16 strings as property names is there a simple way to reference these properties?
For example, here is how the response data looks like after calling JSON.stringify on it:
"{"C\u0000o\u0000n\u0000t\u0000e\u0000n\u0000t\u0000s\u0000":{ ...
In my code I'd like to do something like data['Contents']. Is there a simple way around this that avoids hardcoding the strings with Unicode escape sequences?
Update: changed to indicate strings are UTF-16.
Here's an example (Visual C++) of the call to generate the JSON output:
wchar_t* str = _T("Contents");
yajl_gen_string(g, (unsigned char*)str, wcslen(str) * sizeof(TCHAR));

In What Standard is it Made Official That JSON Object Property Names Must Be Double-Quoted?

Checked here:
http://www.json.org/
and here:
http://www.ietf.org/rfc/rfc4627.txt?number=4627
All I'm seeing is that names must be strings, not that they can't use single quotes. Don't get me wrong, I'm on board with quoted names for JSON. It protects devs from using property names that aren't legitimate JS variable names, and powerful uses of JSON frequently put traditional values in property names for things like map-reduction of 2D arrays modeling tables.
I also think it would make sense for all names to consistently use one or the other quote-type in order to avoid assumptions one might make while trying to parse JSON in some language that doesn't have convenient JSON parsing libraries/native-methods coming out of its pores, but I don't see anything in these specs that insists it must be double or single.
All I see in the second link is that they must be strings. Where is it established that they must be double-quoted, as a lot of the JSON validators seem to think? Is there another source? If so, who owns JSON town? I'm feeling like a raggedy-man who lost his way.
From page 4 of the RFC that you link to:
string = quotation-mark *char quotation-mark
...
quotation-mark = %x22 ; "
Property names must be strings, and strings must be delimited with quotation marks (not apostrophes).
See also the diagram of a string on json.org. Note that it starts and ends with " and not branches that would allow ' as an alternative.
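To see the requirement in practice, a conforming JavaScript parser accepts quotation marks and rejects apostrophes around property names:
JSON.parse('{"a": 1}');  // { a: 1 }
JSON.parse("{'a': 1}");  // SyntaxError (apostrophes are not part of the string grammar)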
Taken from the http://www.json.org site:
Strings are defined as a character sequence enclosed in double quotes.
In the RFC:
string = quotation-mark *char quotation-mark
where
quotation-mark = %x22 ; "
