Error parsing JSON with escaped quotes inside single quotes - javascript

I have a variable var jsonData = '{"Key":"query","Value":"dept=\"Human Resources*\"","ValueType":"Edm.String"}';
I'm trying to parse the variable with JSON.parse(jsonData), however, I'm getting an error "Unexpected token H in JSON at position 30." I can't change how the variable is returned, so here's what I think I understand about the problem:
The JSON.parse(jsonData) errors out because it's not recognizing the escaped double quotes as escaped since it is fully enclosed in single quotes
jsonData.replace(/\\"/g, "\\\\"") or other combinations that I've tried aren't finding the \" because javascript treats \" as just "
QUESTION: How can I parse this properly, either by replacing the escaped quotes with something JSON.parse() can handle or by using something else to parse it correctly? I'd like to stick with JSON.parse() on account of its simplicity, but I'm open to other options.
EDIT: Unfortunately I can't change the variable at this stage; it is just a small example of a larger JSON response. This is a temporary solution until the app is granted access to the API, but I need something in the interim until that happens (the IT dept can be slow). What I'm doing now is getting a large JSON response back by hitting the API address directly, with the browser using the cookies from the user's OAuth session for authentication. I then copy and paste the JSON response into my application so I can work with the data. The response is riddled with escaped quotes, manually editing the text would be laborious, and I'm trying to avoid copying it into a text processor before copying it into the variable.

You should escape the backslash character in your code by prefixing it with another backslash. So the code becomes:
var jsonData = '{"Key":"query","Value":"dept=\\"Human Resources*\\"","ValueType":"Edm.String"}';
The first backslash is so that JS puts the second backslash in the string, which must be in the string so that the json parser knows that it should ignore the quote character.
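A minimal sketch of the difference, using the string from the question. The variable names are only illustrative. The last variant uses String.raw, a standard tagged template that keeps backslashes verbatim; this may suit pasted text, assuming the pasted JSON contains no backticks and no ${ sequences.

```javascript
// Single backslash: the JavaScript parser consumes it, so the string holds a bare quote.
const lost = '{"Key":"query","Value":"dept=\"Human Resources*\"","ValueType":"Edm.String"}';

// Doubled backslash: one real backslash ends up in the string, which is what JSON needs.
const kept = '{"Key":"query","Value":"dept=\\"Human Resources*\\"","ValueType":"Edm.String"}';

// String.raw: the text between the backticks is taken as typed, backslashes included.
const raw = String.raw`{"Key":"query","Value":"dept=\"Human Resources*\"","ValueType":"Edm.String"}`;

console.log(lost.includes("\\"));     // false
console.log(kept === raw);            // true
console.log(JSON.parse(kept).Value);  // dept="Human Resources*"

// The single-backslash version is not valid JSON any more.
let lostFailed = false;
try {
  JSON.parse(lost);
} catch (e) {
  lostFailed = e instanceof SyntaxError;
}
console.log(lostFailed);              // true
```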

The unfortunate thing about this situation is that in the JavaScript code there is no difference between
var jsonData = '{"Key":"query","Value":"dept=\"Human Resources*\"","ValueType":"Edm.String"}'
and
var jsonData = '{"Key":"query","Value":"dept="Human Resources*"","ValueType":"Edm.String"}'.
You could hardcode information you have about the JSON into the way you program it. For example, you could replace occurrences of the regex ([\[\{,:]\s+)\" by $1\" but this would fail to work if the string Human Resources* could also end in a :, { or ,. This would also potentially cause security issues.
In my opinion, the best way to solve your problem would be to put the json response in a json file somewhere so that it can be read into a string by the javascript code that needs to use it.
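A rough sketch of the file approach, assuming a Node.js environment (in a browser, the same idea would go through a request for the file instead). The file name response.json and the temporary directory are made up for the demo. The point is that text read from a file never passes through JavaScript string-literal parsing, so the backslashes survive.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Hypothetical location; in practice this would be wherever the API response was saved.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), "json-demo-"));
const file = path.join(dir, "response.json");

// Simulate saving the response exactly as the API returned it.
// String.raw is used here only so the demo can write the backslashes verbatim.
fs.writeFileSync(
  file,
  String.raw`{"Key":"query","Value":"dept=\"Human Resources*\"","ValueType":"Edm.String"}`,
  "utf8"
);

// Reading the file yields the raw characters, so JSON.parse sees the \" escapes.
const text = fs.readFileSync(file, "utf8");
const data = JSON.parse(text);

console.log(data.Value); // dept="Human Resources*"
```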

I think you can also dispense with the initial String to represent the JSON object:
Use a standard JSON object.
Make whatever changes you need on that object.
Call JSON.stringify(YOUR_OBJECT) for a String representation.
Then, JSON.parse(…) when you need an object again.
That should satisfy your initial request, keep your current (escaped) String values, and give you room to make a lot of changes.
To escape your current String value:
obj["Value"] = 'dept=\"Human Resources*\"'
Alternatively, you can nest attributes:
obj["Value"]["dept"] = "Human Resources*"
Which may be helpful for other reasons.
I've found that I've rarely worked with JSON in an enterprise or production environment where the above sequence wasn't used (I've never used a purely string representation in a production environment) simply due to the ease of modifying attributes, generating dynamic data/modifying the JSON object, and actually using the JSON programmatically.
Using string representations for what are really attribute key-value pairings often causes headaches later on (for example, when you want to read the Human Resources* value programmatically and use it).
I hope you find that approach helpful!
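The sequence above can be sketched as follows. The object mirrors the one in the question; the variable names are only illustrative. JSON.stringify produces the \" escapes itself, so they never have to be typed by hand.

```javascript
// Start from a plain object instead of a hand-written JSON string.
const obj = { Key: "query", Value: 'dept="Human Resources*"', ValueType: "Edm.String" };

// Make whatever changes are needed on the object.
obj.ValueType = "Edm.String";

// Serialize: the inner double quotes are escaped automatically.
const text = JSON.stringify(obj);
console.log(text);
// {"Key":"query","Value":"dept=\"Human Resources*\"","ValueType":"Edm.String"}

// Parse when an object is needed again.
const back = JSON.parse(text);
console.log(back.Value); // dept="Human Resources*"
```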

Related

Javascript RegExp being interpreted different from a string vs from a data-attribute

Long story short, I'm trying to "fix" my system so I'm using the same regular expressions on the backend as we are the front (validating both sides for obvious security reasons). I've got my regex server side working just fine, but getting it down to the client is a pain. My quickest thought was to simply store it in a data attribute on a tag, grab it, and then validate against it.
Well, me, think again! JS is throwing me for a loop because apparently RegExp interprets the string differently depending on how it's pulled in. Can anyone shine some light on what is happening here, or on how I might go about resolving this issue?
HTML
<span data-regex="(^\\d{5}$)|(^\\d{5}-\\d{4}$)"></span>
Javascript
new RegExp($0.dataset.regex)
//returns /(^\\d{5}$)|(^\\d{5}-\\d{4}$)/
new RegExp($($0).data('regex'))
//returns /(^\\d{5}$)|(^\\d{5}-\\d{4}$)/
new RegExp("(^\\d{5}$)|(^\\d{5}-\\d{4}$)");
//returns /(^\d{5}$)|(^\d{5}-\d{4}$)/
Note in the first two how if I pull the value from the data attribute dynamically, the constructor for RegExp for some reason doesn't interpret the double slash correctly. If, however, I copy and paste the value as a string and call RegExp on the value, it correctly interprets the double slash and returns it in the right pattern.
I've also attempted simply not escaping the \d character by double slashing on the server side, but as you might (or might not) have guessed, the opposite happens. When pulled from attributes/dataset, the \ is completely removed leading the Regex to think I'm looking for the "d" character rather than digits. I'm at a loss for understanding what JS is thinking here. Please send help, Internet
Your data attribute has redundant backslashes. There's no need to escape backslashes in HTML attributes, so you'll actually get a double-backslash where you don't want one. When writing regular expressions as strings in JavaScript you have to escape backslashes, of course.
So you don't actually have the same string on both sides, simply because escaping works differently.
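A small sketch of that difference without a DOM. String.raw stands in for reading the attribute, since both hand back the text verbatim; this assumes the server emits the attribute value unchanged. The variable names are illustrative.

```javascript
// What data-regex="(^\\d{5}$)|(^\\d{5}-\\d{4}$)" yields: two backslashes per \d.
const doubled = String.raw`(^\\d{5}$)|(^\\d{5}-\\d{4}$)`;

// What data-regex="(^\d{5}$)|(^\d{5}-\d{4}$)" would yield: one backslash per \d.
const single = String.raw`(^\d{5}$)|(^\d{5}-\d{4}$)`;

// A JavaScript string literal needs the doubling, and ends up with one backslash.
const literal = "(^\\d{5}$)|(^\\d{5}-\\d{4}$)";

console.log(literal === single);                     // true
console.log(new RegExp(single).test("12345"));       // true
console.log(new RegExp(single).test("12345-6789"));  // true

// The doubled version looks for a literal backslash followed by the letter d.
console.log(new RegExp(doubled).test("12345"));      // false
```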

HTML in JSON that "should" not be there

I have description field in a form.
As suggested here, HTML escaping should not be done in input, so if you put <h1>Description</h1> it is saved like this to database.
The problem is that I have defined a REST API, and the output "could" be HTML.
Should I escape the field when constructing the JSON, or should I output HTML in JSON and let the client escape it?
I feel I should escape the HTML server side, but then this operation would cost processing time. On the other hand, leaving the escaping to the client saves that server time, but people using the API who do not carefully escape the HTML could end up exposed to XSS attacks.
A client may, probably will, be a Javascript client which should process such potential HTML values using the DOM API:
document.getElementById('output').textContent = json.result;
Using this DOM API is perfectly safe and does not require escaping json.result, since it's never interpolated as HTML, but treated as a text node by a higher-level API. If you send escaped HTML and the client is doing it properly like here, then the escaped HTML will be shown on the client; i.e. you're turning your data into garbage.
So, no, never escape values for unrelated contexts. Escape/encode for JSON when putting values into JSON, don't worry about what may or may not happen later.
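A minimal sketch of encoding for one context at a time. The escapeHtml helper is hypothetical and only matters for a client that builds HTML by string concatenation; a client that assigns to textContent needs nothing of the kind.

```javascript
// Value as stored in the database, unescaped.
const stored = "<h1>Description</h1>";

// Server side: encode for JSON only. JSON.stringify handles the JSON escaping.
const body = JSON.stringify({ result: stored });
console.log(body); // {"result":"<h1>Description</h1>"}

// Client side: decoding gives back exactly the stored value.
const received = JSON.parse(body).result;
console.log(received === stored); // true

// Hypothetical helper for clients that interpolate the value into HTML markup.
function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

const markup = "<p>" + escapeHtml(received) + "</p>";
console.log(markup); // <p>&lt;h1&gt;Description&lt;/h1&gt;</p>
```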

Why is my JSON returning escape code for a single quote (apostrophe)

I'm receiving JSON data from server that contains text which should have an apostrophe but instead I see the escape code for an apostrophe. Is this an issue with the way the JSON is formatted?
This is how I have it on server-side:
[{"testJ":6387,"title":"This is JSON's return",}]
This is what I'm getting back:
[{"testJ":6387,"title":"This is JSON's return",}]
If I have not provided enough detail, please let me know and I will try to add more information.
Your JSON is almost valid, but there is one problem: you have added a comma that shouldn't be there (the last comma).
You can check this using a JSON validator site like
http://www.freeformatter.com/json-validator.html
http://jsonformatter.curiousconcept.com/
http://jsonlint.com/
Also, keep in mind that the apostrophe can be used to delimit text, so the problem is probably in whatever you are using to parse the JSON. Try escaping the apostrophe, so that on the server side it looks like this:
[{"testJ":6387,"title":"This is JSON\u0027s return"}]
For more information you can refer to the RFC https://www.ietf.org/rfc/rfc4627.txt and in section 2.5 you will find more information.
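A quick check of both points with JSON.parse; the variable names are illustrative. As far as JSON.parse is concerned, the trailing comma is the only real error: a plain apostrophe inside a double-quoted JSON string is accepted, and \u0027 decodes to the same character.

```javascript
// The payload from the question, trailing comma included.
const withTrailingComma = String.raw`[{"testJ":6387,"title":"This is JSON's return",}]`;

let trailingCommaRejected = false;
try {
  JSON.parse(withTrailingComma);
} catch (e) {
  trailingCommaRejected = e instanceof SyntaxError;
}
console.log(trailingCommaRejected); // true

// Without the comma, both spellings of the apostrophe parse to the same text.
const plain = String.raw`[{"testJ":6387,"title":"This is JSON's return"}]`;
const escaped = String.raw`[{"testJ":6387,"title":"This is JSON\u0027s return"}]`;

console.log(JSON.parse(plain)[0].title);   // This is JSON's return
console.log(JSON.parse(escaped)[0].title); // This is JSON's return
```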

using regexp on raw binary data

I'm embedding JavaScript in my C++ app (via V8) and I get some raw binary data which I want to pass to JavaScript. Now, in the JavaScript, I plan to do some regular expressions on the data.
When using just the standard JavaScript String object for my data, everything is quite straightforward. However, as far as I understand it, it uses a UTF-16 representation and expects the data to be valid Unicode. But I have arbitrary data (it might contain '\0' and other raw data, although it is just text for the most part).
How should I handle this? I searched around a bit, and maybe ArrayBuffer or something like it is the object I need to store my raw data. However, I didn't find out how to use the usual regular expression methods on that object. (Basically I need RegExp.test and RegExp.exec.)
I just checked out the Node.js code and it seems as if they support binary data and just put it into a string via v8::String::NewFromOneByte. See here and here. So that would answer my question (i.e., I can just use String), wouldn't it? Any downsides?
(I still don't see why my question is bad. Please explain the downvote.)
From all my current tests, it seems like it works just as expected with normal String.
You can even specify that in JavaScript directly, e.g.
var s = "\x00\x01\x02\x03"
and regular expressions on that string work like expected.
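A short sketch of what that looks like in practice. The sample bytes are made up, and it assumes each byte maps to the code unit of the same value (0 to 255), which is how a one-byte string would appear on the JavaScript side.

```javascript
// Four raw bytes, some ordinary text, then byte 0xFF.
const s = "\x00\x01\x02\x03text\xff";

console.log(s.length);                     // 9
console.log(s.charCodeAt(8));              // 255

// RegExp.test and RegExp.exec work on the raw values, NUL included.
console.log(/\x01\x02/.test(s));           // true
console.log(/\x00/.exec(s).index);         // 0
console.log(/text\xff$/.exec(s).index);    // 4

// A character class over a byte range.
console.log(/[\x00-\x08]+/.exec(s)[0].length); // 4
```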
On the C++ side, if you want to get your binary data into a JS String object:
// Assumes <cassert> and <v8.h> are included.
v8::Local<v8::String> jsBinary(const uint8_t* data, uint32_t len) {
    assert(int(len) >= 0);  // the length must fit in a signed int
    return v8::String::NewFromOneByte(v8::Isolate::GetCurrent(), data, v8::String::kNormalString, len);
}

Unexpected token error while json parsing for string containing special characters

When I try to parse the following string in Titanium Studio for a mobile app project, I get the error:
Unexpected token at profileSkills":"Analysis
des='[{"jobId":0,"jobPositionName":"NA","companyId":0,"companyDisplayName":"NA","profileSkills":"Analysis\r\nAnalysis\r\nQuality Assurance\r\nProject Management\r\nProgrammer Analyst\r\n"}]';
desjson=JSON.parse(des);
Can anyone tell me whether I can parse strings containing escape characters using JSON?
If not, could you tell me how to do it?
You need to encode the special characters with double-backslashes, because the JSON parser will expect them to be escaped.
var des='[{"jobId":0,"jobPositionName":"NA","companyId":0,"companyDisplayName":"NA","profileSkills":"Analysis\\r\\nAnalysis\\r\\nQuality Assurance\\r\\nProject Management\\r\\nProgrammer Analyst\\r\\n"}]';
If you are actually declaring the JSON string as a JavaScript string literal, then you have to account for the fact that when the JavaScript parser sees those escaped characters, it'll build a string with the real carriage return and line feed characters. The JSON parser coming along after that won't like them.
If, on the other hand, your JSON is really coming from a server, then the JSON "on the wire" should not have doubled backslashes.
I should also note that there's rarely any reason to put a JSON string as a literal in JavaScript code. It might as well be a JavaScript object literal, in most cases. (I acknowledge that there might be some reason for it, of course.)
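A shortened sketch of the two cases, using only the profileSkills field from the question; the variable names are illustrative. With single backslashes the JavaScript parser inserts real carriage return and line feed characters, which JSON does not allow inside a string. With doubled backslashes the JSON parser receives the \r\n escapes and decodes them itself.

```javascript
// Single backslashes: the string contains real CR and LF characters.
const broken = '[{"profileSkills":"Analysis\r\nQuality Assurance\r\n"}]';

let brokenRejected = false;
try {
  JSON.parse(broken);
} catch (e) {
  brokenRejected = e instanceof SyntaxError;
}
console.log(brokenRejected); // true

// Doubled backslashes: the string contains the two-character escapes \r and \n.
const fixed = '[{"profileSkills":"Analysis\\r\\nQuality Assurance\\r\\n"}]';
const skills = JSON.parse(fixed)[0].profileSkills;

// After parsing, the value holds the real line breaks.
console.log(skills === "Analysis\r\nQuality Assurance\r\n"); // true
console.log(skills.split("\r\n").length);                    // 3
```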
You have two \r\ sequences in the string that should be \r\n. Change those, and it validates as correct JSON.
