I'm attempting to serialize a string that contains escaped strings into JSON. I would have imagined that JSON.stringify() would correctly re-escape those strings and allow me to JSON.parse it. In a simple case, for example:
JSON.parse(JSON.stringify("\\"))
The output from node is "\". The output from the browser is "\" - it seems the browser (chrome in my case) is not correctly converting the double backslash \\ into \\\\.
Why is that?
When you write code, you have to write "\\" (because the backslash itself is the escape character), which is a string that contains only one backslash ("\\".length is 1).
But when displayed in the console or browser, it will be displayed as "\".
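A quick way to see the difference between the source literal and the string value it produces is to check lengths at each step. This is only a minimal sketch; the variable names are made up for illustration:

```javascript
// The source literal "\\" produces a string holding ONE backslash.
const oneBackslash = "\\";
console.log(oneBackslash.length); // 1

// JSON.stringify escapes it for the JSON text: quote, backslash, backslash, quote.
const jsonText = JSON.stringify(oneBackslash);
console.log(jsonText.length); // 4

// JSON.parse reverses that, so the round trip gives back the one-character string.
const roundTripped = JSON.parse(jsonText);
console.log(roundTripped === oneBackslash); // true
```

So the round trip is lossless; what differs between environments is only how the result is displayed.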
Related
I have an input JSON like this (which really contains the literal values "\u2013" (the encoded form of a unicode character)):
{"source":"Subject: NEED: 11/5 BNA-MSL \u2013 1200L Departure - 1 Pax"}
I read it with JSON.parse and it reads the \u2013 as –, which is fine for display in my app.
However, I need to export the same JSON again, to send it down to some other app. I want to keep the same format and have the \u2013 back in the JSON. I am doing JSON.stringify, but it keeps the – in the output.
Any idea what I could do to keep the \u syntax?
Using a replacer function in a JSON.stringify call didn't work - strings returned from the replacer with an escaped backslash produce a double backslash in output, and a single backslashed character is unescaped in output if possible.
Simply re-escaping the stringify result has potential:
const obj = {"source":"Subject: NEED: 11/5 BNA-MSL \u2013 1200L Departure - 1 Pax"}
console.log(" stringify: ", JSON.stringify( obj));
console.log("& replaceAll: ", JSON.stringify(obj).replaceAll('\u2013', '\\u2013'));
using more complex string modifications as necessary.
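As one possible generalization of the replaceAll idea, the sketch below escapes every non-ASCII UTF-16 code unit in the stringified text. The helper name stringifyAscii is hypothetical (it is not a built-in), and the sample object is shortened from the question:

```javascript
// Hypothetical helper (not a built-in): stringify, then rewrite every
// non-ASCII UTF-16 code unit as a \uXXXX escape.
function stringifyAscii(value) {
  return JSON.stringify(value).replace(/[\u0080-\uffff]/g, (ch) =>
    "\\u" + ch.charCodeAt(0).toString(16).padStart(4, "0")
  );
}

const sample = { source: "BNA-MSL \u2013 1200L Departure" };
const asciiText = stringifyAscii(sample);
console.log(asciiText); // {"source":"BNA-MSL \u2013 1200L Departure"}

// The escaped text is still valid JSON and parses back to the same value.
console.log(JSON.parse(asciiText).source === sample.source); // true
```

Because the regular expression works on code units, characters outside the Basic Multilingual Plane come out as a surrogate pair of two \u escapes, which JSON parsers accept.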
However, this looks very much like an X solution to an X-Y problem. It might be better to fix the downstream parsing to handle JSON text as JSON text and not try to use it in raw form, particularly given that JSON text is encoded in UTF-8 and can handle non-ASCII characters without special treatment.
I'm running a NodeJS app that gets certain posts from an API.
When a post contains special characters, JSON.parse fails.
Special characters can be just any other language, emojis etc.
Parsing works fine when posts don't have special characters.
I need to preserve all of the text, I can't just ignore those characters since I need to handle every possible language.
I'm getting the following error:
"Unexpected token �"
Example of a text I'm supposed to be able to handle:
"summary": "★リプライは殆ど見てません★ Tokyo-based E-J translator. ここは流れてくるニュースの自分用記録でRT&メモと他人の言葉の引用、ブログのフィード。ここで意見を述べることはしません。「交流」もしません。関心領域は匦"�アイルランドと英国(他は専門外)※Togetterコメ欄と陰謀論が嫌いです。"
How can I properly parse such a text?
Thanks
You have misdiagnosed your problem, it has nothing to do with that character.
Your code contains an unescaped " immediately before the special character you think is causing the problem. The early " is prematurely terminating the string.
If you insert a backslash to escape the ", your string can be parsed as JSON just fine:
x = '{"summary": "★リプライは殆ど見てません★ Tokyo-based E-J translator. ここは流れてくるニュースの自分用記録でRT&メモと他人の言葉の引用、ブログのフィード。ここで意見を述べることはしません。「交流」もしません。関心領域は匦\\"�アイルランドと英国(他は専門外)※Togetterコメ欄と陰謀論が嫌いです。"}';
console.log(JSON.parse(x));
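If the JSON text is being assembled by hand somewhere upstream (an assumption, since the question does not show how it is produced), letting JSON.stringify do the escaping avoids this class of error. The sample text below is a shortened, made-up variant of the one in the question:

```javascript
// Shortened, made-up sample text containing a bare double quote.
const summary = 'Tokyo-based E-J translator. 関心領域は"アイルランドと英国';

// Hand-built JSON text: the inner quote ends the string early, so parsing fails.
const handBuilt = '{"summary": "' + summary + '"}';
let failed = false;
try {
  JSON.parse(handBuilt);
} catch (e) {
  failed = e instanceof SyntaxError;
}
console.log(failed); // true

// JSON.stringify escapes the quote; the non-ASCII text needs no special handling.
const safeText = JSON.stringify({ summary });
console.log(JSON.parse(safeText).summary === summary); // true
```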
You need to pass a string, not an object.
Example
JSON.parse('{"summary" : "a"}');
In your case it should be like this
JSON.parse(
'{"summary" : "★リプライは殆ど見てません★ Tokyo-based E-J translator. ここは流れてくるニュースの自分用記録でRT&メモと他人の言葉の引用、ブログのフィード。ここで意見を述べることはしません。「交流」もしません。関心領域は匦�アイルランドと英国(他は専門外)※Togetterコメ欄と陰謀論が嫌いです。"}')
I have a JSON string which contains an escaped Unicode character. The JSON includes this snippet:
I co-ordinate our Chat Literacy network \u2013 an online group for practitioners of Information Literacy
The \u2013 is a long dash.
I'm using
var theObject = eval ("(" + jsonString + ")");
to convert the JSON string to a JavaScript object. I need to use a version of SpiderMonkey that doesn't have a direct JSON to Object method in it.
After conversion, the character in question becomes the Unicode control character \0013 which is an invalid UTF-8 character.
Is there another way I can convert the JSON to an object which will preserve the correct long-dash character? Maybe some other JSON to Object method I can load?
This happens with some other characters also, like curly quotes.
Thanks,
Doug
eval() is evil. Stay away from it.
Try using JSON 3: http://bestiejs.github.io/json3/
I've escaped control characters and am feeding my validated JSON into JSON.parse and jQuery.parseJSON. Both are giving the same result.
Getting error message "Unexpected token $":
$(function(){
try{
$.parseJSON('"\\\\\"$\\\\\"#,##0"');
} catch (exception) {
alert(exception.message);
}
});
<script src="https://ajax.googleapis.com/ajax/libs/jquery/1.8.3/jquery.min.js"></script>
Thanks for checking out this issue.
What's happening here is that there are two levels of backslash removal being applied to the string. The first is done by the browser's JavaScript engine when it parses the single-quoted string. In JavaScript, single-quoted strings and double-quoted strings are exactly equivalent (other than the fact that single-quotes must be backslash-escaped in single-quoted strings and double-quotes must be backslash-escaped in double-quoted strings); both types of strings take backslash escape codes such as \\ for backslash, \' for single-quote (redundant but accepted in double-quoted strings), and \" for double-quote (redundant but accepted in single-quoted strings).
In your JavaScript single-quoted string literal you have several instances of this kind of thing, which are meant to be valid JSON double-quoted strings:
"\\\\\"$\\\\\"#,##0"
After the browser has parsed it, the string contains exactly the following characters (including the outer double-quotes, which are unremoved because they are contained in a single-quoted string):
"\\"$\\"#,##0"
You can see that each consecutive pair of backslashes became a single literal backslash, and the two cases of an odd backslash followed by a double-quote each became a literal double-quote.
That is the text that is being passed as an argument to $.parseJSON, which is when the second level of backslash removal occurs. During JSON parsing of the above text, the leading double-quote signifies the start of a JSON string literal, then the pair of backslashes is interpreted as a single literal backslash, and then the immediately following double-quote terminates the JSON string literal. The stuff that follows (dollar, backslash, backslash, etc.) is invalid JSON syntax.
The problem is that you've embedded valid JSON in a JavaScript single-quoted string literal, which, although it happens to be valid JavaScript syntax by fluke (it wouldn't have been if the JSON contained single-quotes, or if you'd tried using double-quotes to delimit the JavaScript string literal), no longer contains valid JSON after being parsed by the browser's JavaScript engine.
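The two levels can be observed separately. In this sketch, String.raw is used to write down the characters exactly as the JSON parser receives them; the variable names are illustrative only:

```javascript
// Stage 1: the JavaScript engine parses the single-quoted literal.
const afterJsParse = '"\\\\\"$\\\\\"#,##0"';

// String.raw performs no escape processing, so this is the text character for character.
console.log(afterJsParse === String.raw`"\\"$\\"#,##0"`); // true

// Stage 2: the JSON parser reads "\\" as a complete string holding one
// backslash, then hits the dollar sign and rejects the rest.
let message = null;
try {
  JSON.parse(afterJsParse);
} catch (e) {
  message = e.message;
}
console.log(message !== null); // true
```

The exact wording of the error message varies between engines, so the sketch only checks that an error was thrown.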
To solve the problem, you have to either manually escape the JSON content to be properly embedded in a JavaScript string literal, or load it independently of the JavaScript source, e.g. from a flat file.
Here's a demonstration of how to solve the problem using your latest example code:
$(function() {
try {
alert($.parseJSON('{"key":"\\\\\\\\\\"$\\\\\\\\\\"#,##0"}').key); // works
alert($.parseJSON('{"key":"\\\\\"$\\\\\"#,##0"}').key); // doesn't work
} catch (exception) {
alert(exception.message);
}
});
http://jsfiddle.net/814uw638/2/
Since JavaScript has a simple escaping scheme (e.g. see http://blogs.learnnowonline.com/2012/07/19/escape-sequences-in-string-literals-using-javascript/), it's actually pretty easy to solve this problem in the general case. You just have to decide in advance how you're going to quote the string in JavaScript (single-quotes are a good idea, because strings in JSON are always double-quoted), and then when you prepare the JavaScript source, just add a backslash before every single-quote and every backslash in the embedded JSON. That should guarantee it will be perfectly valid, regardless of the exact JSON content (provided, of course, that it is valid JSON to begin with).
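The rule described above (a backslash before every backslash and every single quote) might be sketched as follows. The helper name toSingleQuotedLiteral is hypothetical, and new Function is used only to simulate the engine parsing the generated source:

```javascript
// Hypothetical helper: escape JSON text so it can sit inside a
// single-quoted JavaScript string literal.
function toSingleQuotedLiteral(jsonText) {
  // Put a backslash before every backslash and every single quote.
  return "'" + jsonText.replace(/[\\']/g, "\\$&") + "'";
}

// The value the JSON is meant to carry, written without escape processing.
const format = String.raw`\\"$\\"#,##0`;
const formatJson = JSON.stringify(format);
const literal = toSingleQuotedLiteral(formatJson);
console.log(literal); // every backslash in the JSON text is now doubled

// Evaluating the literal as JavaScript source gives back the original JSON text.
const recovered = new Function("return " + literal + ";")();
console.log(recovered === formatJson); // true
console.log(JSON.parse(recovered) === format); // true
```

Here formatJson is the same JSON string discussed earlier, and the generated literal carries ten backslashes before each inner double quote, like the working call in the demonstration.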
In your original problem, why do you need to call $.parseJSON in the first place? You could have easily gotten the object you wanted by just doing
var o = { blah }
by manually removing the single quotes you have around the curly braces rather than doing
$.parseJSON('{blah}')
Is there any reason for evaluating the string first (i.e. var s = '{blah}' and then doing $.parseJSON(s)), which is what your original code was doing? There shouldn't be a case where this is necessary. Since you mentioned somewhere that the string was produced by JSON.stringify, there shouldn't be a scenario where you need to explicitly store it in a variable (i.e. copy and paste it and put quotes around it).
The main problem here is that the string produced by JSON.stringify, which is properly escaped, has been 'evaluated' once when you manually put quotes around it. So the key is to make sure the string doesn't get 'evaluated'.
Even if you wanted to pass the stringified variable to a database or anything else, there is no need to explicitly use quotes. One could do
var s = JSON.stringify(obj);
db.save("myobj",s)
var newObj = JSON.parse(db.load("myobj"))
The string is stored verbatim without getting evaluated, so that when you retrieve it, you would have the exact same string.
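To make the snippet above runnable, the sketch below substitutes a hypothetical in-memory stand-in for db (a Map with save and load wrappers; a real database client would have its own API). The stored value is the awkward format string from the question:

```javascript
// Hypothetical in-memory stand-in for the db object in the snippet above.
const store = new Map();
const db = {
  save: (key, text) => store.set(key, text),
  load: (key) => store.get(key),
};

// A value full of backslashes and quotes, written without escape processing.
const original = { format: String.raw`\\"$\\"#,##0` };

const stored = JSON.stringify(original);
db.save("myobj", stored);

// The text never passes through a string literal, so nothing is unescaped twice.
const restored = JSON.parse(db.load("myobj"));
console.log(restored.format === original.format); // true
```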
I have this:
JSON.parse('{"130.00000001":{"p_cod":"130.00000001","value":"130.00000001 HDD Upgrade to 2x 250GB HDD 2.5\" SATA2 7200rpm"}}');
JSONLint says it's perfectly valid json. But on execution I have a JSON.parse error.
But, if I change my code to:
JSON.parse('{"130.00000001":{"p_cod":"130.00000001","value":"130.00000001 HDD Upgrade to 2x 250GB HDD 2.5\\" SATA2 7200rpm"}}');
(note the double backslash)
It works, but now JSONLint says invalid json.
Can someone help me understand this behavior?
It's a difference between the wire format, and what you have to write in your code to get the wire format. When you declare this in code you need the double-\ in your literal so the string gets a single backslash (otherwise it will interpret \" as an escape sequence for just declaring a " and put that in your string). If you print out the value of the literal you will see a single backslash.
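The distinction can be checked directly. In this sketch the value is shortened from the one in the question, and String.raw stands in for the wire format because it does no escape processing:

```javascript
// Wire format: exactly the characters JSONLint sees.
const wire = String.raw`{"value":"HDD 2.5\" SATA2"}`;

// A source literal producing the same characters needs the backslash doubled.
const fromLiteral = '{"value":"HDD 2.5\\" SATA2"}';
console.log(fromLiteral === wire); // true
console.log(JSON.parse(fromLiteral).value); // HDD 2.5" SATA2

// With a single backslash the literal loses it, leaving a bare quote in the text.
const broken = '{"value":"HDD 2.5\" SATA2"}';
console.log(broken.includes("\\")); // false
```

Pasting the doubled form into JSONLint fails for the same reason in reverse: the validator sees two backslashes followed by an unescaped quote.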