How to avoid parsing "\" in the JSON.parse() method - javascript

I'm trying to parse JSON into a JS object, but I have a problem with one property whose value always contains a "\" character and four characters after it. For example, the string looks something like this:
"key": "Z13g\u003d"
Once I parse it, I get:
"key": "Z13g="
Is there any easy way to solve this problem?

If you have a string like "\u003d" in JavaScript, it's indistinguishable from its parsed string "=". Even the String.replace function won't find the \ character in the string.
However, if you are truly trying to represent a string that includes the backslash character, you need to escape it with another backslash.
Whereas "\u003d" represents the string value "=", "\\u003d" represents the string value "\u003d".
However, things get more complicated when you invoke JSON.parse; since it's parsing the string value again, it'll transform "\\u003d" to "=".
To get around this, you need to double-escape the backslash, so you'll have a string value of "\\\\u003d". The parser will transform that into "\u003d" instead of "=".
console.log(JSON.parse("\"\u003d\"")); // "\u003d" -> "="
console.log(JSON.parse("\"\\u003d\"")); // "\\u003d" -> "="
console.log(JSON.parse("\"\\\\u003d\"")); // "\\\\u003d" -> "\u003d"
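If the JSON text arrives from outside the program (so you cannot edit a literal by hand), the double-escaping described above can be applied to the raw text before parsing. The following is only a sketch: parseKeepingUnicodeEscapes is a hypothetical helper name, and it assumes you want every \uXXXX escape in the document kept as literal text.

```javascript
// Hypothetical helper: parse JSON but keep \uXXXX escapes as literal text.
function parseKeepingUnicodeEscapes(jsonText) {
  const protectedText = jsonText.replace(
    /(\\+)(u[0-9a-fA-F]{4})/g,
    (match, slashes, rest) =>
      // An odd-length run means the last backslash starts a \u escape,
      // so one more backslash turns it into an escaped backslash.
      slashes.length % 2 === 1 ? slashes + '\\' + rest : match
  );
  return JSON.parse(protectedText);
}

// String.raw keeps the backslash, as if the text had been read from a response.
const raw = String.raw`{"key": "Z13g\u003d"}`;

console.log(JSON.parse(raw).key);                 // Z13g=
console.log(parseKeepingUnicodeEscapes(raw).key); // Z13g\u003d
```

Counting the backslashes matters because a \u that is already preceded by an escaped backslash is not an escape sequence and should be left alone.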

Related

What was the function to decode special characters like \n to their representatives?

Basically user provides input for my script using shell arguments, an example of user input is like this:
Kek kek\nkek\tkek\x43
Upon receiving the input, javascript shows that my parameter is basically defined like this:
var parameter="Kek kek\\nkek\\tkek\\x43";
The above string basically has its backslashes escaped, instead of having the escape sequences converted into the needed characters.
So what I'm trying to do is to convert my parameter variable into a desired_outcome variable, which would look like this:
var desired_outcome="Kek kek\nkek\tkek\x43";
Subsequently, if printed out the result should output:
Kek kek
kek kekC
So what was the right function to convert one into another?
I was finally able to solve the problem, based on the escape sequences described at:
https://en.wikipedia.org/wiki/Escape_character
const decode_string = input => "nrtbfv0".split('')
.map((x,i)=>[new RegExp(`\\\\${x}`,'g'),"\n\r\t\b\f\v\0".split('')[i]])
.concat([[/\\[xX]([0-9a-fA-F]{2})/g,(_,hh)=>String.fromCharCode(parseInt(hh,16))]])
.reduce((out,m)=>out.replace(m[0],m[1]),input);
Explanation:
I built a 2D array of regexes and their corresponding replacements. Each dynamically built regex matches a literal backslash character \ followed by one of the letters, and its replacement is the corresponding special character. Next I added a regex for two-digit hexadecimal escapes, whose replacement is a function that converts the digits with String.fromCharCode(). Finally, reduce applies every pair to the input in turn.
Usage:
var parameter="Kek kek\\nkek\\tkek\\x43";
var desired_outcome = decode_string(parameter);
Notes:
Double backslashes don't need to be decoded because they are already escaped.
Quotes, both double and single, don't have to be escaped either, because they are being escaped on a shell level.
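For comparison, here is a sketch of a single-pass variant that uses one regex and a lookup table. It is not the code above: decodeEscapes and SIMPLE_ESCAPES are hypothetical names, and unlike the note above it assumes you also want \\ decoded to a single backslash, which a single pass can do without confusing \\n with a newline.

```javascript
// Lookup table for the single-character escapes.
const SIMPLE_ESCAPES = {
  n: '\n', r: '\r', t: '\t', b: '\b', f: '\f', v: '\v', 0: '\0', '\\': '\\',
};

// Hypothetical single-pass decoder.
function decodeEscapes(input) {
  return input.replace(
    /\\([xX][0-9a-fA-F]{2}|[nrtbfv0\\])/g,
    (match, body) =>
      body.length === 3
        ? String.fromCharCode(parseInt(body.slice(1), 16)) // \xHH
        : SIMPLE_ESCAPES[body]                             // \n, \t, \\, ...
  );
}

console.log(decodeEscapes("Kek kek\\nkek\\tkek\\x43") === "Kek kek\nkek\tkekC"); // true
console.log(decodeEscapes("a\\\\nb") === "a\\nb"); // true: \\ becomes one backslash, n is untouched
```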

How to split a string in javascript by special char \

I believe that this is simple and I'm missing something. I want to split a physical Windows path with JavaScript. So I tried the String#split function, but my result was unexpected.
For this string
"C:\CLC\VIDA\Web\_REPOSITORIO\Colectivos\ReembolsosWeb\TMP_011906169_01_01.pdf"
I'm getting this result
var test = "C:\CLC\VIDA\Web\_REPOSITORIO\Colectivos\ReembolsosWeb\TMP_011906169_01_01.pdf";
test.split("\"); //throws error
test.split("\\"); //result in -> ["C:CLCVIDAWeb_REPOSITORIOColectivosReembolsosWebTMP_011906169_01_01.pdf"]
test.split(/\\/); // -> the regex is the same as above
One last thing, in my test, I found that to get the result that I want I could do it like this
var test2 = "C:\\CLC\\VIDA\\Web\\_REPOSITORIO\\Colectivos\\ReembolsosWeb\\TMP_011906169_01_01.pdf"
test2.split("\\"); // -> ["C:", "CLC", "VIDA", "Web", "_REPOSITORIO", "Colectivos", "ReembolsosWeb", "TMP_011906169_01_01.pdf"]
So my question is, how can I split the string from test var to get the array from the last case?
Strings in javascript support escape sequences via the backslash (\). For example if you need a tab in your string you can add a \t anywhere in your string and it will be replaced with a tab, a \n will be replaced with a new line.
The backslashes in test are either converted to their respective characters or dropped because they are invalid escape sequences.
To get around this you can escape each backslash with another one to get a single normal backslash. The downside is that this cannot be done from within JavaScript after the fact: once the literal has been parsed, the backslashes are already gone. Generally I paste my string into Notepad/N++/Code/Sublime and replace all \ with \\
Since you are hard coding the string you need to escape all backslashes. After that you can use test.split("\\") which, itself contains an escaped backslash.
So, as far as Javascript is concerned, your code looks like this.
var test = "C:CLCVIDAWeb_REPOSITORIOColectivosReembolsosWebTMP_011906169_01_01.pdf";
To make javascript see the string correctly you need to make it look like this...
var test = "C:\\CLC\\VIDA\\Web\\_REPOSITORIO\\Colectivos\\ReembolsosWeb\\TMP_011906169_01_01.pdf";
Firstly, note that when you have a single backslash in a string, it is used for escaping the next character. It is just ignored if there is no special character next to it to escape.
Now, just have a look at your string :
var test = "C:\CLC\VIDA\Web\_REPOSITORIO\Colectivos\ReembolsosWeb\TMP_011906169_01_01.pdf"
Don't you think all of your single backslashes will be ignored here?
So the solution is simple, and it is what you have already tried successfully: escape each backslash with another backslash.
var test2 = "C:\\CLC\\VIDA\\Web\\_REPOSITORIO\\Colectivos\\ReembolsosWeb\\TMP_011906169_01_01.pdf"
test2.split("\\"); // -> ["C:", "CLC", "VIDA", "Web", "_REPOSITORIO", "Colectivos", "ReembolsosWeb", "TMP_011906169_01_01.pdf"]
But, are you worried about any dynamic data which has such backslash? (For example, coming from a text input or a file input.) Don't think about escaping the backslash inside it. Because you don't need to do that! It's already a well formatted string for you, which you can use as it is. You need to escape only when you are hard coding the string yourself.
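If you would rather not double every backslash by hand in a hard-coded path, one option not mentioned above is a template literal tagged with String.raw, which leaves backslashes untouched. A small sketch, assuming an environment that supports template literals:

```javascript
// String.raw returns the template text exactly as typed, backslashes included.
const test = String.raw`C:\CLC\VIDA\Web\_REPOSITORIO\Colectivos\ReembolsosWeb\TMP_011906169_01_01.pdf`;

// The separator still has to be written as an escaped backslash.
const parts = test.split("\\");

console.log(parts.length); // 8
console.log(parts[0]);     // C:
console.log(parts[7]);     // TMP_011906169_01_01.pdf
```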

Javascript \x escaping

I've seen a few other programs that have something like this:
var string = '\x32\x20\x60\x78\x6e\x7a\x9c\x89';
And I had to try to fiddle with the numbers and letters, to find the text I wanted to display.
I'm wondering if there is a function to find the \x escape of a string, like string.toUpperCase() in JS. I'm using processingJS, but it will be okay for me to use other programming languages to find the ASCII for \x.
If you have a string that you want escaped, you can use String.prototype.charCodeAt()
If you have the code with escapes, you can just evaluate them to get the original string. If it's a string with literal escapes, you can use String.fromCharCode()
If you have '\x32\x20\x60\x78\x6e\x7a\x9c\x89' and want "2 `xnz" then
'\x32\x20\x60\x78\x6e\x7a\x9c\x89' == "2 `xnz"
If you have '\\x32\\x20\\x60\\x78\\x6e\\x7a\\x9c\\x89' which is a literal string with the value \x32\x20\x60\x78\x6e\x7a\x9c\x89 then you can parse it by passing the decimal value of each pair of hex digits to String.fromCharCode()
'\\x32\\x20\\x60\\x78\\x6e\\x7a\\x9c\\x89'.replace(/\\x([0-9a-f]{2})/ig, function(_, pair) {
return String.fromCharCode(parseInt(pair, 16));
})
Alternatively, eval is an option if you can be sure of the safety of the input and performance isn't important.
eval('"\\x32\\x20\\x60\\x78\\x6e\\x7a\\x9c\\x89"')
Note the " nested in the ' surrounding the input string.
If you know it's a program, and it's from a trusted source, you can eval the string directly, which won't give you the ASCII, but will execute the program itself.
eval('\\x32\\x20\\x60\\x78\\x6e\\x7a\\x9c\\x89')
Note that the input you provided is not a program and the eval call fails.
If you have "2 `xnz" and want '\x32\x20\x60\x78\x6e\x7a\x9c\x89' then
"2 `xnz".split('').map(function(e) {
return '\\x' + ('0' + e.charCodeAt(0).toString(16)).slice(-2); // pad to two hex digits so codes below 0x10 stay valid
}).join('')
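The two directions can be paired into a round trip. This is a sketch under assumptions: toHexEscapes and fromHexEscapes are hypothetical names, and code units above 0xff are written as \uXXXX because \x can only hold two hex digits.

```javascript
// Hypothetical encoder: every UTF-16 code unit becomes \xHH, or \uHHHH above 0xff.
function toHexEscapes(text) {
  let out = '';
  for (let i = 0; i < text.length; i++) {
    const code = text.charCodeAt(i);
    out += code <= 0xff
      ? '\\x' + code.toString(16).padStart(2, '0')
      : '\\u' + code.toString(16).padStart(4, '0');
  }
  return out;
}

// Hypothetical decoder for the output of toHexEscapes.
function fromHexEscapes(escaped) {
  return escaped.replace(
    /\\x([0-9a-fA-F]{2})|\\u([0-9a-fA-F]{4})/g,
    (match, xx, uuuu) => String.fromCharCode(parseInt(xx || uuuu, 16))
  );
}

console.log(toHexEscapes("2 `xnz"));                  // \x32\x20\x60\x78\x6e\x7a
console.log(fromHexEscapes(toHexEscapes("a\tb")) === "a\tb"); // true
```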

JSON.parse failing on valid JSON. Have escaped control characters.

I've escaped control characters and am feeding my validated JSON into JSON.parse and jQuery.parseJSON. Both are giving the same result.
Getting error message "Unexpected token $":
$(function(){
try{
$.parseJSON('"\\\\\"$\\\\\"#,##0"');
} catch (exception) {
alert(exception.message);
}
});
<script src="https://ajax.googleapis.com/ajax/libs/jquery/1.8.3/jquery.min.js"></script>
Thanks for checking out this issue.
What's happening here is that there are two levels of backslash removal being applied to the string. The first is done by the browser's JavaScript engine when it parses the single-quoted string. In JavaScript, single-quoted strings and double-quoted strings are exactly equivalent (other than the fact that single-quotes must be backslash-escaped in single-quoted strings and double-quotes must be backslash-escaped in double-quoted strings); both types of strings take backslash escape codes such as \\ for backslash, \' for single-quote (redundant but accepted in double-quoted strings), and \" for double-quote (redundant but accepted in single-quoted strings).
In your JavaScript single-quoted string literal you have several instances of this kind of thing, which are meant to be valid JSON double-quoted strings:
"\\\\\"$\\\\\"#,##0"
After the browser has parsed it, the string contains exactly the following characters (including the outer double-quotes, which are unremoved because they are contained in a single-quoted string):
"\\"$\\"#,##0"
You can see that each consecutive pair of backslashes became a single literal backslash, and the two cases of an odd backslash followed by a double-quote each became a literal double-quote.
That is the text that is being passed as an argument to $.parseJSON, which is when the second level of backslash removal occurs. During JSON parsing of the above text, the leading double-quote signifies the start of a JSON string literal, then the pair of backslashes is interpreted as a single literal backslash, and then the immediately following double-quote terminates the JSON string literal. The stuff that follows (dollar, backslash, backslash, etc.) is invalid JSON syntax.
The problem is that you've embedded valid JSON in a JavaScript single-quoted string literal, which, although it happens to be valid JavaScript syntax by fluke (it wouldn't have been if the JSON contained single-quotes, or if you'd tried using double-quotes to delimit the JavaScript string literal), no longer contains valid JSON after being parsed by the browser's JavaScript engine.
To solve the problem, you have to either manually escape the JSON content to be properly embedded in a JavaScript string literal, or load it independently of the JavaScript source, e.g. from a flat file.
Here's a demonstration of how to solve the problem using your latest example code:
$(function() {
try {
alert($.parseJSON('{"key":"\\\\\\\\\\"$\\\\\\\\\\"#,##0"}').key); // works
alert($.parseJSON('{"key":"\\\\\"$\\\\\"#,##0"}').key); // doesn't work
} catch (exception) {
alert(exception.message);
}
});
http://jsfiddle.net/814uw638/2/
Since JavaScript has a simple escaping scheme (e.g. see http://blogs.learnnowonline.com/2012/07/19/escape-sequences-in-string-literals-using-javascript/), it's actually pretty easy to solve this problem in the general case. You just have to decide in advance how you're going to quote the string in JavaScript (single-quotes are a good idea, because strings in JSON are always double-quoted), and then when you prepare the JavaScript source, just add a backslash before every single-quote and every backslash in the embedded JSON. That should guarantee it will be perfectly valid, regardless of the exact JSON content (provided, of course, that it is valid JSON to begin with).
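The rule of adding a backslash before every single-quote and every backslash can be sketched as a small helper. toSingleQuotedLiteral is a hypothetical name, and the sketch assumes the JSON text contains no raw line breaks (for example, it was not pretty-printed), since those would also need escaping inside a single-quoted literal.

```javascript
// Hypothetical helper: wrap JSON text in a single-quoted JavaScript string literal.
function toSingleQuotedLiteral(jsonText) {
  // Put a backslash before every backslash and every single quote.
  return "'" + jsonText.replace(/[\\']/g, "\\$&") + "'";
}

const value = String.raw`\\"$\\"#,##0`; // the characters \\"$\\"#,##0
const jsonText = JSON.stringify({ key: value });
const literal = toSingleQuotedLiteral(jsonText);
console.log(literal); // source text that can be pasted into a script

// Evaluating the literal, as the JavaScript engine would, gives back the JSON text.
const roundTripped = new Function("return " + literal + ";")();
console.log(roundTripped === jsonText);              // true
console.log(JSON.parse(roundTripped).key === value); // true
```

The new Function call is only there to imitate the engine reading the literal from source; in practice the literal would be written into the generated script.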
In your original problem, why do you need to call JSON.parse in the first place? You could easily have gotten the object you wanted by just doing
var o = { blah }
by manually removing the single quotes you have around the curly braces rather than doing
$.parseJSON('{blah}')
Is there any reason for evaluating the string first (i.e. var s = '{blah}' and then doing $.parseJSON(s)), which is what your original code was doing? There shouldn't be a case where this is necessary. Since you mentioned somewhere that the string was produced by JSON.stringify, there shouldn't be a scenario where you need to explicitly store it in a variable (i.e. copy and paste it and put quotes around it).
The main problem here is that the string produced by JSON.stringify, which is properly escaped, has been 'evaluated' once when you manually put quotes around it. So the key is to make sure the string doesn't get 'evaluated'.
Even if you wanted to pass the stringified variable to database or anything, there is no need to explicitly use quotes. One could do
var s = JSON.stringify(obj);
db.save("myobj",s)
var newObj = JSON.parse(db.load("myobj"))
The string is stored verbatim without getting evaluated, so that when you retrieve it, you would have the exact same string.
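A minimal sketch of that round trip, using a Map as a stand-in for the db object above (the store and its keys are assumptions for illustration only):

```javascript
// A Map stands in for whatever storage is actually used.
const store = new Map();

const original = { format: String.raw`\\"$\\"#,##0` };

// The JSON text goes into storage as a runtime string, never as source code.
store.set("myobj", JSON.stringify(original));

const restored = JSON.parse(store.get("myobj"));
console.log(restored.format === original.format); // true
```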

Parsing malformed JSON in JavaScript

Thanks for looking!
BACKGROUND
I am writing some front-end code that consumes a JSON service which is returning malformed JSON. Specifically, the keys are not surrounded with quotes:
{foo: "bar"}
I have NO CONTROL over the service, so I am correcting this like so:
var scrubbedJson = dirtyJson.replace(/(['"])?([a-zA-Z0-9_]+)(['"])?:/g, '"$2": ');
This gives me well formed JSON:
{"foo": "bar"}
Problem
However, when I call JSON.parse(scrubbedJson), I still get an error. I suspect it may be because the entire JSON string is surrounded in double quotes but I am not sure.
UPDATE
This has been solved--the above code works fine. I had a rogue single quote in the body of the JSON that was returned. I got that out of there and everything now parses. Thanks.
Any help would be appreciated.
You can avoid using a regexp altogether and still output a JavaScript object from a malformed JSON string (keys without quotes, single quotes, etc), using this simple trick:
var jsonify = (function(div){
return function(json){
div.setAttribute('onclick', 'this.__json__ = ' + json);
div.click();
return div.__json__;
}
})(document.createElement('div'));
// Let's say you had a string like '{ one: 1 }' (malformed, a key without quotes)
// jsonify('{ one: 1 }') will output a good ol' JS object ;)
Here's a demo: http://codepen.io/csuwldcat/pen/dfzsu (open your console)
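Outside a browser there is no div to click, but the same idea (let the JavaScript engine evaluate the text as an object literal) can be sketched with the Function constructor. This is a swapped-in technique, not the code above; looseParse is a hypothetical name, and like the onclick trick it executes whatever the text contains, so it assumes trusted input.

```javascript
// Hypothetical helper: evaluate relaxed JSON text as a JavaScript expression.
function looseParse(text) {
  // The parentheses make the braces an object literal rather than a block.
  return new Function('"use strict"; return (' + text + ');')();
}

const result = looseParse("{ one: 1, 'two': [2, 3] }");
console.log(result.one);        // 1
console.log(result.two.length); // 2
```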
Something like this may help to repair the JSON:
$str='{foo:"bar"}';
echo preg_replace('/({)([a-zA-Z0-9]+)(:)/','$1"$2"${3}',$str);
Output:
{"foo":"bar"}
EDIT:
var str='{foo:"bar"}';
str.replace(/({)([a-zA-Z0-9]+)(:)/,'$1"$2"$3')
There is a project that takes care of all kinds of invalid cases in JSON https://github.com/freethenation/durable-json-lint
I was trying to solve the same problem using a regEx in Javascript. I have an app written for Node.js to parse incoming JSON, but wanted a "relaxed" version of the parser (see following comments), since it is inconvenient to put quotes around every key (name). Here is my solution:
var objKeysRegex = /({|,)(?:\s*)(?:')?([A-Za-z_$\.][A-Za-z0-9_ \-\.$]*)(?:')?(?:\s*):/g;// look for object names
var newQuotedKeysString = originalString.replace(objKeysRegex, "$1\"$2\":");// all object names should be double quoted
var newObject = JSON.parse(newQuotedKeysString);
Here's a breakdown of the regEx:
({|,) looks for the beginning of the object, a { for flat objects or , for embedded objects.
(?:\s*) finds but does not remember white space
(?:')? finds but does not remember a single quote (to be replaced by a double quote later). There will be either zero or one of these.
([A-Za-z_$\.][A-Za-z0-9_ \-\.$]*) is the name (or key). It starts with any letter, underscore, $, or dot, followed by zero or more alphanumeric characters, underscores, spaces, dashes, dots, or $.
the last character : is what delimits the name of the object from the value.
Now we can use replace() with some dressing to get our newly quoted keys:
originalString.replace(objKeysRegex, "$1\"$2\":")
where the $1 is either { or , depending on whether the object was embedded in another object. \" adds a double quote. $2 is the name. \" another double quote. and finally : finishes it off.
Test it out with
{keyOne: "value1", $keyTwo: "value 2", key-3:{key4:18.34}}
output:
{"keyOne": "value1","$keyTwo": "value 2","key-3":{"key4":18.34}}
Some comments:
I have not tested this method for speed, but from what I gather by reading some of these entries, using a regex is faster than eval().
For my application, I'm limiting the characters that names are allowed to have with ([A-Za-z_$\.][A-Za-z0-9_ \-\.$]*) for my 'relaxed' version JSON parser. If you wanted to allow more characters in names (you can do that and still be valid), you could instead use ([^'":]+) to mean anything other than double or single quotes or a colon. You can have all sorts of stuff in here with this expression, so be careful.
One shortcoming is that this method actually changes the original incoming data (but I think that's what you wanted?). You could program around that to mitigate this issue - depends on your needs and resources available.
Hope this helps.
-John L.
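Here is a runnable sketch of the regex above applied to the test input, plus one case where it goes wrong. quoteKeys is a hypothetical wrapper name; the limitation shown is that the regex cannot tell a real key from text inside a string value that happens to look like ", name:".

```javascript
const objKeysRegex = /({|,)(?:\s*)(?:')?([A-Za-z_$\.][A-Za-z0-9_ \-\.$]*)(?:')?(?:\s*):/g;

// Hypothetical wrapper around the replace call shown above.
function quoteKeys(text) {
  return text.replace(objKeysRegex, "$1\"$2\":");
}

const relaxed = '{keyOne: "value1", $keyTwo: "value 2", key-3:{key4:18.34}}';
const parsed = JSON.parse(quoteKeys(relaxed));
console.log(parsed["key-3"].key4); // 18.34

// Limitation: the ", b:" inside the string value is treated as a key.
const tricky = '{note: "a, b: c"}';
let failed = false;
try {
  JSON.parse(quoteKeys(tricky));
} catch (e) {
  failed = true;
}
console.log(failed); // true
```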
How about?
function fixJson(json) {
var tempString, tempJson, output;
tempString = JSON.stringify(json);
tempJson = JSON.parse(tempString);
output = JSON.stringify(tempJson);
return output;
}
