Why are endline characters illegal in HTML string sent over ajax? - javascript

Within HTML, it is okay to have endline characters. But when I try to send HTML strings that have endline characters over AJAX to have them operated with JavaScript/jQuery, it returns an error that says that endline characters are illegal. For example, if I have a Ruby string:
"<div>Hello</div>"
and jsonify it with Ruby by to_json, and send it over ajax, parse it within JavaScript by JSON.parse, and insert that in jQuery like:
$('body').append('<div>Hello</div>');
then it does not return an error, but if I do a similar thing with a string like
"<div>Hello\n</div>"
it returns an error. Why are they legal in HTML and illegal in AJAX? Are there any other differences between a legal HTML string loaded as a page and legal HTML string sent over ajax?

string literals can contain line breaks, they just need to be escaped with a backslash like so:
var string = "hello\
world!";
However, this does not create a line break in the string, as it must be an explicit \n escape sequence. This would technically become helloworld. Doing
var string = "hello"
+ "world"
would be much cleaner

Specify the type of the ajax call as 'html'. Jquery will try to infer the type when parsing the response.
If the response is json, newlines should be escaped.
I'd recommend using a library to serialize json. You're unlikely to handle all the edge cases if you roll your own.

Strings in JavaScript MUST appear on a single line, with the exception of escaping that line:
var str = "abc \
def";
However note that the newline is escaped and will not appear in the string itself.
The best option is \n, but note that if it is already going through something that parses \n then you will need to double-escape it as \\n.

Seeing how you're already escaping the JSON properly by using to_json in Ruby, I do believe the bug is in jQuery; when there are newlines in the string it has trouble determining whether you meant to create a single element or a document fragment. This would work just fine:
var str = "<div>Hello\n</div>";
var wrapper = document.createElement('div');
wrapper.innerHTML = str;
$('body').append(wrapper);
Demo

Related

How to convert JSON escaped string into plain HTML-compatible string

I am using an API to compile code, and when there is an error, the response containing the error message uses JSON escape characters, but when outputting it back into the HTML front-end, it just produces garbage characters. How can I either convert the escaped string to a plain text string using Javascript, or output it in HTML correctly?
This is what the string looks like properly outputted (in Powershell):
https://i.imgur.com/tv0BZFl.jpg
This is the escaped string:
\u001b[01m\u001b[K:\u001b[m\u001b[K In function '\u001b[01m\u001b[Kin...
This is what the string looks like if I directly output it in HTML:
[01m[K:[m[K In function '[01m[Kint main()[m[K':
[01m[K:9:1:[m[K [01;31m[Kerror: [m[Kexpected '[01m[K;[m[K' before '[01m[K}[m[K' token
}
[01;32m[K ^[m[K
Looks like you can use the strip-ansi package. Here's an example using your escaped string:
const stripAnsi = require('strip-ansi');
stripAnsi("\u001b[01m\u001b[K:\u001b[m\u001b[K In function '\u001b[01m\u001b[Kin...")
// result => ": In function 'in..."
If you aren't using node.js, or cannot use that package for whatever reason, this Stack Overflow answer has a regular expression you may be able to use instead.
Just found this tool too:
https://www.npmjs.com/package/ansi-to-html
which converts ANSI to html.

Javascript How to escape \u in string literal

Strange thing...
I have a string literal that is passed to my source code as a constant token (I cannot prehandle or escape it beforehand).
Example
var username = "MYDOMAIN\tom";
username = username.replace('MYDOMAIN','');
The string somewhere contains a backslash followed by a character.
It's too late to escape the backslash at this point, so I have to escape these special characters individually like
username = username.replace(/\t/ig, 't');
However, that does not work in the following scenario:
var username = "MYDOMAIN\ulrike";
\u seems to introduce a unicode character sequence. \uLRIK cannot be interpreted as a unicode sign so the Javascript engine stops interpreting at this point and my replace(/\u/ig,'u') comes too late.
Has anybody a suggestion or workaround on how to escape such a non-unicode character sequence contained in a given string literal? It seems a similar issue with \b like in "MYDOMAIN\bernd".
I have a string literal that is passed to my source code
Assuming you don't have any < or >, move this to inside an HTML control (instead of inside your script block) or element and use Javacript to read the value. Something like
<div id="myServerData">
MYDOMAIN\tom
</div>
and you retrieve it so
alert(document.getElementById("myServerData").innerText);
IMPORTANT : injecting unescaped content, where the user can control the content (say this is data entered in some other page) is a security risk. This goes for whether you are injecting it in script or HTML
Writing var username = "MYDOMAIN\ulrike"; will throw a syntax error. I think you have this string coming from somewhere.
I would suggest creating some html element and setting it's innerHTML to the received value, and then picking it up.
Have something like:
<div id="demo"></div>
Then do document.getElementById("demo").innerHTML = username;
Then read the value from there as document.getElementById("demo").innerHTML;
This should work I guess.
Important: Please make sure this does not expose the webpage to script injections. If it does, this method is bad, don't use it.

JSON.parse failing on valid Json. Have escaped control characters.If

I've escaped control characters and am feeding my validated JSON into JSON.parse and jQuery.parseJSON. Both are giving the same result.
Getting error message "Unexpected token $":
$(function(){
try{
$.parseJSON('"\\\\\"$\\\\\"#,##0"');
} catch (exception) {
alert(exception.message);
}
});
<script src="https://ajax.googleapis.com/ajax/libs/jquery/1.8.3/jquery.min.js"></script>
Thanks for checking out this issue.
What's happening here is that there are two levels of backslash removal being applied to the string. The first is done by the browser's JavaScript engine when it parses the single-quoted string. In JavaScript, single-quoted strings and double-quoted strings are exactly equivalent (other than the fact that single-quotes must be backslash-escaped in single-quoted strings and double-quotes must be backslash-escaped in double-quoted strings); both types of strings take backslash escape codes such as \\ for backslash, \' for single-quote (redundant but accepted in double-quoted strings), and \" for double-quote (redundant but accepted in single-quoted strings).
In your JavaScript single-quoted string literal you have several instances of this kind of thing, which are meant to be valid JSON double-quoted strings:
"\\\\\"$\\\\\"#,##0"
After the browser has parsed it, the string contains exactly the following characters (including the outer double-quotes, which are unremoved because they are contained in a single-quoted string):
"\\"$\\"#,##0"
You can see that each consecutive pair of backslashes became a single literal backslash, and the two cases of an odd backslash followed by a double-quote each became a literal double-quote.
That is the text that is being passed as an argument to $.parseJSON, which is when the second level of backslash removal occurs. During JSON parsing of the above text, the leading double-quote signifies the start of a JSON string literal, then the pair of backslashes is interpreted as a single literal backslash, and then the immediately following double-quote terminates the JSON string literal. The stuff that follows (dollar, backslash, backslash, etc.) is invalid JSON syntax.
The problem is that you've embedded valid JSON in a JavaScript single-quoted string literal, which, although it happens to be valid JavaScript syntax by fluke (it wouldn't have been if the JSON contained single-quotes, or if you'd tried using double-quotes to delimit the JavaScript string literal), no longer contains valid JSON after being parsed by the browser's JavaScript engine.
To solve the problem, you have to either manually escape the JSON content to be properly embedded in a JavaScript string literal, or load it independently of the JavaScript source, e.g. from a flat file.
Here's a demonstration of how to solve the problem using your latest example code:
$(function() {
try {
alert($.parseJSON('{"key":"\\\\\\\\\\"$\\\\\\\\\\"#,##0"}').key); // works
alert($.parseJSON('{"key":"\\\\\"$\\\\\"#,##0"}').key); // doesn't work
} catch (exception) {
alert(exception.message);
}
});
http://jsfiddle.net/814uw638/2/
Since JavaScript has a simple escaping scheme (e.g. see http://blogs.learnnowonline.com/2012/07/19/escape-sequences-in-string-literals-using-javascript/), it's actually pretty easy to solve this problem in the general case. You just have to decide in advance how you're going to quote the string in JavaScript (single-quotes are a good idea, because strings in JSON are always double-quoted), and then when you prepare the JavaScript source, just add a backslash before every single-quote and every backslash in the embedded JSON. That should guarantee it will be perfectly valid, regardless of the exact JSON content (provided, of course, that it is valid JSON to begin with).
In your original problem, why do you need to do JSONparse in the first place? You could have easily gotten the object you wanted by just doing
var o = { blah }
by manually removing the single quotes you have around the curly braces rather than doing
$.JSONparse('{blah}')
Is there any reason for evaluating the string first (ie var s = '{blah}' and then doing $.JSONparse(s)) which is what your original code was doing? There shouldn't be a case where this is necessary. Since you mentioned somewhere that the string was produced by JSON.stringify, there shouldn't be a scenario where you need to explicitly store it into a variable (ie copy and paste it and put quotes around it).
The main problem here is the string produced by JSON.stringify, which is properly escaped, has been 'evaluated' once when you manually put braces around it. So the key is to make sure the string doesn't get 'evaluated'
Even if you wanted to pass the stringified variable to database or anything, there is no need to explicitly use quotes. One could do
var s = JSON.stringify(obj);
db.save("myobj",s)
var newObj = JSON.parse(db.load("myobj"))
The string is stored verbatim without getting evaluated, so that when you retrieve it, you would have the exact same string.

Escaping JavaScript special characters from ASP.NET

I have following C# code in my ASP.NET application:
string script = #"alert('Message head:\n\n" + CompoundErrStr + " message tail.');";
System.Web.UI.ScriptManager.RegisterClientScriptBlock(this, this.GetType(), "Test", script, true);
CompoundErrStr is an error message generated by SQL Server (exception text bubbled up from the stored procedure). If it contains any table column names they are enclosed in single quotes and JavaScript breaks during execution because single quotes are considered a string terminator.
As a fix for single quotes I changed my code to this:
CompoundErrStr = CompoundErrStr.Replace("'", #"\'");
string script = #"alert('Message head:\n\n" + CompoundErrStr + " message tail.');";
System.Web.UI.ScriptManager.RegisterClientScriptBlock(this, this.GetType(), "Test", script, true);
and it now works fine.
However, are there any other special characters that need to be escaped like this? Is there a .Net function that can be used for this purpose? Something similar to HttpServerUtility.HtmlEncode but for JavaScript.
EDIT I use .Net 3.5
Note: for this task you can't (and you shouldn't) use HTML encoders (like HttpServerUtility.HtmlEncode()) because rules for HTML and for JavaScript strings are pretty different. One example: string "Check your Windows folder c:\windows" will be encoded as "Check your Windows folder c:'windows" and it's obviously wrong. Moreover it follows HTML encoding rules then it won't perform any escaping for \, " and '. Simply it's for something else.
If you're targeting ASP.NET Core or .NET 5 then you should use System.Text.Encodings.Web.JavaScriptEncoder class.
If you're targeting .NET 4.x you can use HttpUtility.JavaScriptStringEncode() method.
If you're targeting .NET 3.x and 2.x:
What do you have to encode? Some characters must be escaped (\, " and ') because they have special meaning for JavaScript parser while others may interfere with HTML parsing so should escaped too (if JS is inside an HTML page). You have two options for escaping: JavaScript escape character </kbd> or \uxxxx Unicode code points (note that \uxxxx may be used for them all but it won't work for characters that interferes with HTML parser).
You may do it manually (with search and replace) like this:
string JavaScriptEscape(string text)
{
return text
.Replace("\\", #"\u005c") // Because it's JS string escape character
.Replace("\"", #"\u0022") // Because it may be string delimiter
.Replace("'", #"\u0027") // Because it may be string delimiter
.Replace("&", #"\u0026") // Because it may interfere with HTML parsing
.Replace("<", #"\u003c") // Because it may interfere with HTML parsing
.Replace(">", #"\u003e"); // Because it may interfere with HTML parsing
}
Of course </kbd> should not be escaped if you're using it as escape character! This blind replacement is useful for unknown text (like input from users or text messages that may be translated). Note that if string is enclosed with double quotes then single quotes don't need to be escaped and vice-versa). Be careful to keep verbatim strings on C# code or Unicode replacement will be performed in C# and your client will receive unescaped strings. A note about interfere with HTML parsing: nowadays you seldom need to create a <script> node and to inject it in DOM but it was a pretty common technique and web is full of code like + "</s" + "cript>" to workaround this.
Note: I said blind escaping because if your string contains an escape sequence (like \uxxxx or \t) then it should not be escaped again. For this you have to do some tricks around this code.
If your text comes from user input and it may be multiline then you should also be ready for that or you'll have broken JavaScript code like this:
alert("This is a multiline
comment");
Simply add .Replace("\n", "\\n").Replace("\r", "") to previous JavaScriptEscape() function.
For completeness: there is also another method, if you encode your string Uri.EscapeDataString() then you can decode it in JavaScript with decodeURIComponent() but this is more a dirty trick than a solution.
While the original question mentions .NET 3.5, it should be known to users of 4.0+ that you can use HttpUtility.JavaScriptStringEncode("string")
A second bool parameter specifies whether to include quotation marks (true) or not (false) in the result.
All too easy:
#Html.Raw(myString)

jQuery.parseJSON throws “Invalid JSON” error due to escaped single quote in JSON

I’m making requests to my server using jQuery.post() and my server is returning JSON objects (like { "var": "value", ... }). However, if any of the values contains a single quote (properly escaped like \'), jQuery fails to parse an otherwise valid JSON string. Here’s an example of what I mean (done in Chrome’s console):
data = "{ \"status\": \"success\", \"newHtml\": \"Hello \\\'x\" }";
eval("x = " + data); // { newHtml: "Hello 'x", status: "success" }
$.parseJSON(data); // Invalid JSON: { "status": "success", "newHtml": "Hello \'x" }
Is this normal? Is there no way to properly pass a single quote via JSON?
According to the state machine diagram on the JSON website, only escaped double-quote characters are allowed, not single-quotes. Single quote characters do not need to be escaped:
Update - More information for those that are interested:
Douglas Crockford does not specifically say why the JSON specification does not allow escaped single quotes within strings. However, during his discussion of JSON in Appendix E of JavaScript: The Good Parts, he writes:
JSON's design goals were to be minimal, portable, textual, and a subset of JavaScript. The less we need to agree on in order to interoperate, the more easily we can interoperate.
So perhaps he decided to only allow strings to be defined using double-quotes since this is one less rule that all JSON implementations must agree on. As a result, it is impossible for a single quote character within a string to accidentally terminate the string, because by definition a string can only be terminated by a double-quote character. Hence there is no need to allow escaping of a single quote character in the formal specification.
Digging a little bit deeper, Crockford's org.json implementation of JSON for Java is more permissible and does allow single quote characters:
The texts produced by the toString methods strictly conform to the JSON syntax rules. The constructors are more forgiving in the texts they will accept:
...
Strings may be quoted with ' (single quote).
This is confirmed by the JSONTokener source code. The nextString method accepts escaped single quote characters and treats them just like double-quote characters:
public String nextString(char quote) throws JSONException {
char c;
StringBuffer sb = new StringBuffer();
for (;;) {
c = next();
switch (c) {
...
case '\\':
c = this.next();
switch (c) {
...
case '"':
case '\'':
case '\\':
case '/':
sb.append(c);
break;
...
At the top of the method is an informative comment:
The formal JSON format does not allow strings in single quotes, but an implementation is allowed to accept them.
So some implementations will accept single quotes - but you should not rely on this. Many popular implementations are quite restrictive in this regard and will reject JSON that contains single quoted strings and/or escaped single quotes.
Finally to tie this back to the original question, jQuery.parseJSON first attempts to use the browser's native JSON parser or a loaded library such as json2.js where applicable (which on a side note is the library the jQuery logic is based on if JSON is not defined). Thus jQuery can only be as permissive as that underlying implementation:
parseJSON: function( data ) {
...
// Attempt to parse using the native JSON parser first
if ( window.JSON && window.JSON.parse ) {
return window.JSON.parse( data );
}
...
jQuery.error( "Invalid JSON: " + data );
},
As far as I know these implementations only adhere to the official JSON specification and do not accept single quotes, hence neither does jQuery.
If you need a single quote inside of a string, since \' is undefined by the spec, use \u0027 see http://www.utf8-chartable.de/ for all of them
edit: please excuse my misuse of the word backticks in the comments. I meant backslash. My point here is that in the event you have nested strings inside other strings, I think it can be more useful and readable to use unicode instead of lots of backslashes to escape a single quote. If you are not nested however it truly is easier to just put a plain old quote in there.
I understand where the problem lies and when I look at the specs its clear that unescaped single quotes should be parsed correctly.
I am using jquery`s jQuery.parseJSON function to parse the JSON string but still getting the parse error when there is a single quote in the data that is prepared with json_encode.
Could it be a mistake in my implementation that looks like this (PHP - server side):
$data = array();
$elem = array();
$elem['name'] = 'Erik';
$elem['position'] = 'PHP Programmer';
$data[] = json_encode($elem);
$elem = array();
$elem['name'] = 'Carl';
$elem['position'] = 'C Programmer';
$data[] = json_encode($elem);
$jsonString = "[" . implode(", ", $data) . "]";
The final step is that I store the JSON encoded string into an JS variable:
<script type="text/javascript">
employees = jQuery.parseJSON('<?=$marker; ?>');
</script>
If I use "" instead of '' it still throws an error.
SOLUTION:
The only thing that worked for me was to use bitmask JSON_HEX_APOS to convert the single quotes like this:
json_encode($tmp, JSON_HEX_APOS);
Is there another way of tackle this issue? Is my code wrong or poorly written?
Thanks
When You are sending a single quote in a query
empid = " T'via"
empid =escape(empid)
When You get the value including a single quote
var xxx = request.QueryString("empid")
xxx= unscape(xxx)
If you want to search/ insert the value which includes a single quote in a query
xxx=Replace(empid,"'","''")
Striking a similar issue using CakePHP to output a JavaScript script-block using PHP's native json_encode. $contractorCompanies contains values that have single quotation marks and as explained above and expected json_encode($contractorCompanies) doesn't escape them because its valid JSON.
<?php $this->Html->scriptBlock("var contractorCompanies = jQuery.parseJSON( '".(json_encode($contractorCompanies)."' );"); ?>
By adding addslashes() around the JSON encoded string you then escape the quotation marks allowing Cake / PHP to echo the correct javascript to the browser. JS errors disappear.
<?php $this->Html->scriptBlock("var contractorCompanies = jQuery.parseJSON( '".addslashes(json_encode($contractorCompanies))."' );"); ?>
I was trying to save a JSON object from a XHR request into a HTML5 data-* attribute. I tried many of above solutions with no success.
What I finally end up doing was replacing the single quote ' with it code ' using a regex after the stringify() method call the following way:
var productToString = JSON.stringify(productObject);
var quoteReplaced = productToString.replace(/'/g, "'");
var anchor = '<a data-product=\'' + quoteReplaced + '\' href=\'#\'>' + productObject.name + '</a>';
// Here you can use the "anchor" variable to update your DOM element.
Interesting. How are you generating your JSON on the server end? Are you using a library function (such as json_encode in PHP), or are you building the JSON string by hand?
The only thing that grabs my attention is the escape apostrophe (\'). Seeing as you're using double quotes, as you indeed should, there is no need to escape single quotes. I can't check if that is indeed the cause for your jQuery error, as I haven't updated to version 1.4.1 myself yet.

Categories

Resources