I am writing server-side JavaScript code in ASPX to pull the 7 columns out of CSV files, which may or may not use quotes to enclose the data, depending on the file.
Currently, I am working on the code to remove the commas using JavaScript's string.slice() for the first column, but it is returning an empty string.
Here is a sample line from the file. This is public info.
12/6/2017,8:30 AM,2013FA000060,In RE: the Support of: D.D.K. and A.P.C.,MH,Motion hearing,"Grill, Leo",State of Wisconsin,"Vesely, Tori A; Lawton, Mark David";
In working with my Regular expression based code for the first column, I have the correct expected string data for strTemp prior to the offending code:
12/6/2017,
Here is the code to remove the comma at the end:
cleanData = strTemp.slice(0, -1); // remove last 1 character
I have verified that strTemp is correct right before this statement is executed to make sure the string var assignments are not the problem (as seen above).
The expected result should be strTemp data without the comma at the end:
12/6/2017
I receive no errors. Just an empty string.
OK, so it looks like the data coming from the regular expression execute is not a string. Once I cast it to a string, it worked:
cleanData = strTemp.toString().slice(0, -1);
I really do NOT like working with weakly typed languages!!
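The empty result is consistent with how a regular expression match behaves in standard JavaScript: exec() returns an array-like match object rather than a string, and slice(0, -1) on a one-element array yields an empty array, which prints as an empty string. A minimal sketch, assuming standard ECMAScript semantics (the server-side engine behind ASPX may differ) and a hypothetical pattern for the first column:

```javascript
// Hypothetical sample line, shortened from the one in the question.
const line = '12/6/2017,8:30 AM,2013FA000060';

// Hypothetical pattern: everything up to and including the first comma.
// exec() returns a match array, not a string.
const match = /^[^,]*,/.exec(line);

// Array.prototype.slice(0, -1) on a one-element array gives an empty array,
// and an empty array converts to an empty string.
const viaArray = String(match.slice(0, -1));

// Indexing the match first gives a real string, so String.prototype.slice applies.
const cleanData = match[0].slice(0, -1);

console.log(JSON.stringify(viaArray), cleanData);
```

Indexing with match[0] avoids the cast entirely; toString() happens to work here only because the match array has a single element.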
The description field is a text area field, somehow a user ended up with some strange little symbol in it. (see image)
When I grab this from the server, I assemble my data from the objects I grab, which includes the description on this object, and turn it into JSON string, and send it to my javascript.
From javascript, I JSON.parse it. But that weird little symbol causes the parse to fail. But, when you look at it, there is no character there or anything, yet it throws an undefined character in JSON.parse.
My response from the server has the description like this:
"blahblahtesttext\r\nslkdjf",
There is nothing but the expected \r\n......
But it has an unexpected token where that symbol is.
{"value":"blah blah test text//Symbol should be here, but there is nothing and it forces it to the next line
\r\nslkdjf","fieldType":"TEXTAREA","field":"Description"}
Where that symbol forces the string to the next line, which causes the issue.
Because I can't see what the actual character is... I do not know how to handle this.
Is there something that can strip out invalid characters in a JSON string so the parse works? I don't want to just try/catch this as it would toss out everything, I just want that weird invalid symbol to be stripped out.
Or is there a way to see what the actual character is that JSON.parse does not like?
<-- here is that symbol for copy pasting into a string if you want to try parsing it.
EDIT:
I found that it was doing this in Notepad++
Where you can see that where the line separator was, it is placing actual carriage return and line feed there, breaking the string. It already has \r\n\r\n for the two returns that were placed in the actual text area after that line separator character.
But still unsure of how to deal with this, as that carriage return and line feed do not appear in the string as '\n\r', there is no character representation of them, but instead it actually puts a return there and breaks the string.
NEW EDIT:
Finally found something to get this working. I couldn't do a replace on that line separator character. When I pulled it from my database, it came through as a hidden carriage return. When you manually pressed 'Enter' in the text area, the string I got from the database would actually put a '\r\n' there. But the line separator did not.
So, I added these three lines before parsing to ensure I was escaping any invalid new lines/carriage returns.
result = result.replace(/\r\n/g, '\\r\\n');
result = result.replace(/\r/g, '\\r');
result = result.replace(/\n/g, '\\n');
The '\r\n' that were actually in the string would correctly be escaped already, which tripped me up because I didn't have to worry about escaping those until someone tried introducing this line separator....
As Xufox says, that appears to be U+2028. JSON.parse shouldn't fail on it since U+2028 doesn't require escaping in JSON; Chrome's doesn't, but that's probably because it's implementing this stage 4 proposal Xufox pointed out:
const o = {prop: "testing\u2028one two three"};
console.log(JSON.parse(JSON.stringify(o)));
If you need to work around a JSON.parse implementation that doesn't handle it, you could do this:
str = str.replace(/\u2028/g, "\\u2028");
...before running JSON.parse on str.
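For the earlier question of how to see what the invisible character actually is, one option is to list the code points of everything outside printable ASCII before parsing. This is only a sketch; findSuspectChars is a hypothetical helper name, and the sample string is made up for illustration:

```javascript
// Hypothetical helper: report every character outside printable ASCII,
// with its position and its code unit written as U+XXXX.
function findSuspectChars(str) {
  const hits = [];
  for (let i = 0; i < str.length; i++) {
    const code = str.charCodeAt(i);
    if (code < 0x20 || code > 0x7e) {
      const hex = code.toString(16).toUpperCase().padStart(4, '0');
      hits.push({ index: i, codePoint: 'U+' + hex });
    }
  }
  return hits;
}

// Made-up sample containing a line separator inside a JSON string value.
const sample = '{"value":"blah\u2028text"}';
const suspects = findSuspectChars(sample);
console.log(suspects);
```

Once the offending code point is known, it can be escaped or stripped with a targeted replace instead of a blanket one.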
I have a long xml raw message that is being stored in a string format. A sample is as below.
<tag1>val</tag><tag2>val</tag2><tagSomeNameXYZ/>
I'm looking to search this string and find out if it contains an empty html tag such as <tagSomeNameXYZ/>. The thing is, the value of SomeName can change depending on context. I've tried using Str.match(/tagSomeNameXYZ/g) and Str.match(/<tag.*.XYZ\/>/g) to find out if it contains exactly that string, but am unable to get it to return anything. I'm having trouble writing a regex that matches something like <tag*XYZ/>, where * is going to be SomeName (which I'm not interested in).
TL;DR: How do I filter out <tagSomeNameXYZ/> from the string? The format is: <constant variableName constant/>
Example patterns that it should match:
<tagGetIndexXYZ/>
<tagGetAllIndexXYZ/>
<tagGetFooterXYZ/>
The issue you have with Str.match(/<tag.*.XYZ\/>/g) is the .* takes everything it sees and does not stop at the XYZ as you wish. So you need to find a way to stop (e.g. the [^/]* means keep taking until you find a /) and then work back from there (the slice).
Does this help?
testString = "<tagGetIndexXYZ/>"
res = testString.match(/<tag([^/]*)\/\>/)[1].slice(0,-3)
console.log(res)
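If the goal is to detect, list, or remove every such tag in a longer message, a lazy match that is not allowed to cross a tag boundary may be enough. This is a rough sketch; the sample message and variable names are assumptions built from the examples in the question:

```javascript
// Hypothetical sample message using the tag names from the question.
const message = '<tag1>val</tag1><tag2>val</tag2><tagGetIndexXYZ/><tagGetFooterXYZ/>';

// "<tag", then as few characters as possible that cannot leave the tag,
// then the constant "XYZ/>". The capture group holds the variable part.
const emptyTagPattern = /<tag([^<>\/]*?)XYZ\/>/g;

// Collect the variable names of all matching empty tags.
const names = Array.from(message.matchAll(emptyTagPattern), (m) => m[1]);

// True when at least one <tag...XYZ/> is present.
const containsEmptyTag = names.length > 0;

// Remove the matching tags from the message.
const filtered = message.replace(emptyTagPattern, '');

console.log(containsEmptyTag, names, filtered);
```

Excluding <, > and / from the middle part keeps the match from running across neighbouring tags.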
When I call the URL from the browser, I get the following JSON, which I expect to contain no data:
"{\"data\":[], \"SkipToken\":\"\", \"top\":\"\"}"
However, when I call it from JavaScript, I get a JSON parsing error:
dspservice.callService(URL, "GET", "", function (data) {
    var dataList = JSON.parse(data);
});
This code was working before. I have no idea why it suddenly stopped working and started throwing an error.
You say the server is returning the JSON (omitting the enclosing quotes):
{\"data\":[], \"SkipToken\":\"\", \"top\":\"\"}
This is invalid JSON. The quote marks in JSON surrounding strings and property names should not be preceded by a backslash. The backslash in JSON is strictly for inserting double quote marks inside a string. (It can also be used to escape other characters inside strings, but that is not relevant here.)
Correct JSON would be:
{"data":[], "SkipToken":"", "top":""}
If your server returned this, it would parse correctly.
The confusion here, and the reports by other posters that it seems like your string should work, lies in the fact that in a simple-minded test, where I type this string into the console:
var x = "{\"data\":[], \"SkipToken\":\"\", \"top\":\"\"}";
the JavaScript string literal escaping mechanism, which is entirely distinct from the use of escapes in JSON, results in a string with the value
{"data":[], "SkipToken":"", "top":""}
which of course JSON.parse can handle just fine. But Javascript string escaping applies to string literals in source code, not to things coming down from the server.
To fix the server's incorrectly-escaped JSON, you have two possibilities. One is to tell the server guys they don't need to (and must not) put backslashes before quote marks (except for quote marks inside strings). Then everything will work.
The other approach is to undo the escaping yourself before handing it off to JSON.parse. A first cut at this would be a simple regexp such as
data.replace(/\\"/g, '"')
as in
var dataList = JSON.parse(data.replace(/\\"/g, '"'));
It might need additional tweaking depending on how the server guys are escaping quotes inside strings; are they sending \"\\"\", or possibly \"\\\"\"?
I cannot explain why this code that was working suddenly stopped working. My best guess is a change on the server side that started escaping the double quotes.
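The difference between literal escaping and what arrives over the wire can be reproduced locally. The sketch below assumes the server really sends backslashes before the quotes; String.raw stands in for the raw response text, and the variable names are illustrative:

```javascript
// String.raw keeps the backslashes, imitating text that arrives from the
// server with a backslash before every quote (an assumption about the response).
const fromServer = String.raw`{\"data\":[], \"SkipToken\":\"\", \"top\":\"\"}`;

// Parsing the raw text directly fails, because \" is not valid outside a JSON string.
let parsedDirectly = true;
try {
  JSON.parse(fromServer);
} catch (err) {
  parsedDirectly = false;
}

// Undoing the escaping first, as suggested above, produces valid JSON.
const dataList = JSON.parse(fromServer.replace(/\\"/g, '"'));

console.log(parsedDirectly, dataList);
```

This first-cut replace would also unescape quotes that legitimately appear inside string values, so it is only safe while the payload contains none.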
Since there is nothing wrong with the JSON string you gave us, the only other explanation is that the data being passed to your function is something other than what you listed.
To test this hypothesis, run the following code:
dspservice.callService(URL, "GET", "", handler); // pass the function itself; handler(data) would call it immediately
function handler(data) {
    var goodData = "{\"data\":[], \"SkipToken\":\"\", \"top\":\"\"}";
    alert(goodData); // display the correct JSON string
    var goodDataList = JSON.parse(goodData); // parse good string (should work)
    alert(data); // display string in question
    var dataList = JSON.parse(data); // try to parse it (should fail)
}
If the goodData JSON string can be parsed with no issues, and data appears to be incorrectly-formatted, then you have the answer to your question.
Place a breakpoint on the first line of the handler function, where goodData is defined. Then step through the code. From what you told me in your comments, it is still crashing during a JSON parse, but I'm willing to wager that it is failing on the second parse and not the first.
Did you mean that your JSON is like this?
"{\"data\":[], \"SkipToken\":\"\", \"top\":\"\"}"
Then data in your callback would be like this:
'"{\"data\":[], \"SkipToken\":\"\", \"top\":\"\"}"'
Because data is the fetched text content string.
You don't have to add extra quotes in your JSON:
{"data":[], "SkipToken":"", "top":""}
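If the server cannot be changed and the response really is a JSON string that itself contains JSON, parsing twice is one possible workaround. This is a sketch under that assumption; parsePossiblyDoubleEncoded is a hypothetical helper name:

```javascript
// Hypothetical helper: parse once, and if the result is still a string,
// assume the payload was JSON-encoded twice and parse again.
function parsePossiblyDoubleEncoded(text) {
  let value = JSON.parse(text);
  if (typeof value === 'string') {
    value = JSON.parse(value);
  }
  return value;
}

// Simulate a double-encoded response: an object stringified twice.
const doubleEncoded = JSON.stringify(JSON.stringify({ data: [], SkipToken: '', top: '' }));
const singleEncoded = JSON.stringify({ data: [], SkipToken: '', top: '' });

const fromDouble = parsePossiblyDoubleEncoded(doubleEncoded);
const fromSingle = parsePossiblyDoubleEncoded(singleEncoded);

console.log(fromDouble, fromSingle);
```

The helper leaves correctly encoded responses untouched, so it keeps working if the server is fixed later.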
Thanks for looking!
BACKGROUND
I am writing some front-end code that consumes a JSON service which is returning malformed JSON. Specifically, the keys are not surrounded with quotes:
{foo: "bar"}
I have NO CONTROL over the service, so I am correcting this like so:
var scrubbedJson = dirtyJson.replace(/(['"])?([a-zA-Z0-9_]+)(['"])?:/g, '"$2": ');
This gives me well formed JSON:
{"foo": "bar"}
Problem
However, when I call JSON.parse(scrubbedJson), I still get an error. I suspect it may be because the entire JSON string is surrounded in double quotes but I am not sure.
UPDATE
This has been solved--the above code works fine. I had a rogue single quote in the body of the JSON that was returned. I got that out of there and everything now parses. Thanks.
Any help would be appreciated.
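For reference, the scrubbing step can be wrapped and checked in isolation. This is a sketch that reuses the regex from the question; quoteKeys is a hypothetical name, and the second input shows one limitation worth knowing about: a colon inside a string value is treated as a key separator too.

```javascript
// Hypothetical wrapper around the regex from the question.
function quoteKeys(dirtyJson) {
  return dirtyJson.replace(/(['"])?([a-zA-Z0-9_]+)(['"])?:/g, '"$2": ');
}

// Unquoted keys are repaired and the result parses.
const parsed = JSON.parse(quoteKeys('{foo: "bar", count: 2}'));

// A value that contains word characters followed by a colon gets rewritten
// as well, which breaks the string and makes the parse fail.
let colonInValueParses = true;
try {
  JSON.parse(quoteKeys('{foo: "a:b"}'));
} catch (err) {
  colonInValueParses = false;
}

console.log(parsed, colonInValueParses);
```

If the service can return values such as times or URLs, a tolerant parser may be a safer choice than a regex.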
You can avoid using a regexp altogether and still output a JavaScript object from a malformed JSON string (keys without quotes, single quotes, etc), using this simple trick:
var jsonify = (function(div){
    return function(json){
        div.setAttribute('onclick', 'this.__json__ = ' + json);
        div.click();
        return div.__json__;
    }
})(document.createElement('div'));
// Let's say you had a string like '{ one: 1 }' (malformed, a key without quotes)
// jsonify('{ one: 1 }') will output a good ol' JS object ;)
Here's a demo: http://codepen.io/csuwldcat/pen/dfzsu (open your console)
Something like this may help to repair the JSON:
$str='{foo:"bar"}';
echo preg_replace('/({)([a-zA-Z0-9]+)(:)/','$1"$2"${3}',$str);
Output:
{"foo":"bar"}
EDIT:
var str='{foo:"bar"}';
str.replace(/({)([a-zA-Z0-9]+)(:)/,'$1"$2"$3')
There is a project that takes care of all kinds of invalid cases in JSON https://github.com/freethenation/durable-json-lint
I was trying to solve the same problem using a regEx in Javascript. I have an app written for Node.js to parse incoming JSON, but wanted a "relaxed" version of the parser (see following comments), since it is inconvenient to put quotes around every key (name). Here is my solution:
var objKeysRegex = /({|,)(?:\s*)(?:')?([A-Za-z_$\.][A-Za-z0-9_ \-\.$]*)(?:')?(?:\s*):/g;// look for object names
var newQuotedKeysString = originalString.replace(objKeysRegex, "$1\"$2\":");// all object names should be double quoted
var newObject = JSON.parse(newQuotedKeysString);
Here's a breakdown of the regEx:
({|,) looks for the beginning of the object, a { for flat objects or , for embedded objects.
(?:\s*) finds but does not remember white space
(?:')? finds but does not remember a single quote (to be replaced by a double quote later). There will be either zero or one of these.
([A-Za-z_$\.][A-Za-z0-9_ \-\.$]*) is the name (or key). Starts with any letter, underscore, $, or dot, followed by zero or more alpha-numeric characters or underscores or dashes or dots or $.
the last character : is what delimits the name of the object from the value.
Now we can use replace() with some dressing to get our newly quoted keys:
originalString.replace(objKeysRegex, "$1\"$2\":")
where the $1 is either { or , depending on whether the object was embedded in another object. \" adds a double quote. $2 is the name. \" another double quote. and finally : finishes it off.
Test it out with
{keyOne: "value1", $keyTwo: "value 2", key-3:{key4:18.34}}
output:
{"keyOne": "value1","$keyTwo": "value 2","key-3":{"key4":18.34}}
Some comments:
I have not tested this method for speed, but from what I gather by reading some of these entries, using a regex is faster than eval().
For my application, I'm limiting the characters that names are allowed to have with ([A-Za-z_$\.][A-Za-z0-9_ \-\.$]*) for my 'relaxed' version JSON parser. If you wanted to allow more characters in names (you can do that and still be valid), you could instead use ([^'":]+) to mean anything other than double or single quotes or a colon. You can have all sorts of stuff in here with this expression, so be careful.
One shortcoming is that this method actually changes the original incoming data (but I think that's what you wanted?). You could program around that to mitigate this issue - depends on your needs and resources available.
Hope this helps.
-John L.
How about?
function fixJson(json) {
var tempString, tempJson, output;
tempString = JSON.stringify(json);
tempJson = JSON.parse(tempString);
output = JSON.stringify(tempJson);
return output;
}
We're having a lot of trouble tracking down the source of \u2028 (Line Separator) in user submitted data which causes the 'unterminated string literal' error in Firefox.
As a result, we're looking at filtering it out before submitting it to the server (and then the database).
After extensive googling and reading of other people's problems, it's clear I have to filter these characters out before submitting to the database.
Before writing the filter, I attempted to search for the character just to ensure it can find it using:
var index = content.search("/\u2028/");
alert("Index: [" + index + "]");
I get -1 as the result every time, even when I know the character is in the content variable (I've confirmed via a Java jUnit test on the server side).
Assuming that content.replace() would work the same way as search(), is there something I'm doing wrong or anything I'm missing in order to find and strip these line separators?
Your regex syntax is incorrect. You only use the two forward slashes when using a regex literal. It should be just:
var index = content.search("\u2028");
or:
var index = content.search(/\u2028/); // regex literal
But this should really be done on the server, if anywhere. JavaScript sanitization can be trivially bypassed. It's only useful for user convenience, and I don't think accidentally entering line separator is that common.
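The difference between the quoted pattern and the regex literal, and a client-side strip that also covers U+2029 (Paragraph Separator), can be sketched as follows. The sample content is made up, and stripping U+2029 as well is an assumption about what the filter should do:

```javascript
// Made-up content with a line separator between two pieces of text.
const content = 'first line\u2028second line';

// A string argument is turned into a regex whose pattern is "/", U+2028, "/",
// so the slashes are matched literally and nothing is found.
const wrongIndex = content.search("/\u2028/");

// A regex literal finds the separator.
const index = content.search(/\u2028/);

// Strip both line separator and paragraph separator characters.
const stripped = content.replace(/[\u2028\u2029]/g, '');

console.log(wrongIndex, index, stripped);
```

The same character class can be used on the server, where the filtering cannot be bypassed.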