Escaping JavaScript special characters from ASP.NET - javascript

I have following C# code in my ASP.NET application:
string script = @"alert('Message head:\n\n" + CompoundErrStr + " message tail.');";
System.Web.UI.ScriptManager.RegisterClientScriptBlock(this, this.GetType(), "Test", script, true);
CompoundErrStr is an error message generated by SQL Server (exception text bubbled up from the stored procedure). If it contains any table column names they are enclosed in single quotes and JavaScript breaks during execution because single quotes are considered a string terminator.
As a fix for single quotes I changed my code to this:
CompoundErrStr = CompoundErrStr.Replace("'", @"\'");
string script = @"alert('Message head:\n\n" + CompoundErrStr + " message tail.');";
System.Web.UI.ScriptManager.RegisterClientScriptBlock(this, this.GetType(), "Test", script, true);
and it now works fine.
However, are there any other special characters that need to be escaped like this? Is there a .Net function that can be used for this purpose? Something similar to HttpServerUtility.HtmlEncode but for JavaScript.
EDIT I use .Net 3.5

Note: for this task you can't (and shouldn't) use HTML encoders (like HttpServerUtility.HtmlEncode()) because the rules for HTML and for JavaScript strings are quite different. One example: in the string "Check your Windows folder c:\windows" HtmlEncode leaves the backslash untouched, so JavaScript treats \w as an escape sequence and displays "Check your Windows folder c:windows", which is obviously wrong. Because it follows HTML encoding rules, it performs no JavaScript escaping for \, " and '. It is simply meant for something else.
If you're targeting ASP.NET Core or .NET 5 then you should use System.Text.Encodings.Web.JavaScriptEncoder class.
If you're targeting .NET 4.x you can use HttpUtility.JavaScriptStringEncode() method.
If you're targeting .NET 3.x and 2.x:
What do you have to encode? Some characters must be escaped (\, " and ') because they have special meaning for the JavaScript parser, while others may interfere with HTML parsing and so should be escaped too (if the JS is inside an HTML page). You have two options for escaping: the JavaScript escape character \ or \uxxxx Unicode code points (note that \uxxxx may be used for all of them, whereas a plain \ escape won't help with characters that interfere with the HTML parser).
You may do it manually (with search and replace) like this:
string JavaScriptEscape(string text)
{
    return text
        .Replace("\\", @"\u005c") // Because it's the JS string escape character
        .Replace("\"", @"\u0022") // Because it may be the string delimiter
        .Replace("'", @"\u0027")  // Because it may be the string delimiter
        .Replace("&", @"\u0026")  // Because it may interfere with HTML parsing
        .Replace("<", @"\u003c")  // Because it may interfere with HTML parsing
        .Replace(">", @"\u003e"); // Because it may interfere with HTML parsing
}
Of course \ should not be escaped if you're using it as an escape character! This blind replacement is useful for unknown text (like input from users or text messages that may be translated). Note that if the string is enclosed in double quotes then single quotes don't need to be escaped, and vice versa. Be careful to keep verbatim strings in the C# code, or the Unicode replacement will be performed in C# and your client will receive unescaped strings. A note about interfering with HTML parsing: nowadays you seldom need to create a <script> node and inject it into the DOM, but it was a pretty common technique and the web is full of code like + "</s" + "cript>" to work around this.
Note: I said blind escaping because if your string contains an escape sequence (like \uxxxx or \t) then it should not be escaped again. For this you have to do some tricks around this code.
If your text comes from user input and it may be multiline then you should also be ready for that or you'll have broken JavaScript code like this:
alert("This is a multiline
comment");
Simply add .Replace("\n", "\\n").Replace("\r", "") to previous JavaScriptEscape() function.
For completeness: there is also another method. If you encode your string with Uri.EscapeDataString() then you can decode it in JavaScript with decodeURIComponent(), but this is more a dirty trick than a solution.
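To check that the JavaScriptEscape() replacements from this answer really yield a literal the engine can parse, here is a rough JavaScript port of the same chain. The name javaScriptEscape, the sample text and the round-trip check with new Function are my own assumptions for illustration, not part of the original answer.

```javascript
// Sketch: a JavaScript port of the C# replacement chain, for experimentation only.
function javaScriptEscape(text) {
  return text
    .replace(/\\/g, "\\u005c") // backslash first, so later escapes are not re-escaped
    .replace(/"/g, "\\u0022")  // possible string delimiter
    .replace(/'/g, "\\u0027")  // possible string delimiter
    .replace(/&/g, "\\u0026")  // may interfere with HTML parsing
    .replace(/</g, "\\u003c")  // may interfere with HTML parsing
    .replace(/>/g, "\\u003e")  // may interfere with HTML parsing
    .replace(/\r/g, "")        // drop carriage returns
    .replace(/\n/g, "\\n");    // keep line feeds as the \n escape
}

// Round trip: wrap the escaped text in quotes and let the engine parse it.
const original = "It's <b>\"C:\\temp\"</b> & more\r\nsecond line";
const escaped = javaScriptEscape(original);
const parsed = new Function('return "' + escaped + '";')();

console.log(escaped);
console.log(parsed === original.replace(/\r/g, "")); // true
```

The backslash replacement has to run first; otherwise the backslashes introduced by the later replacements would be escaped again.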

While the original question mentions .NET 3.5, it should be known to users of 4.0+ that you can use HttpUtility.JavaScriptStringEncode("string")
A second bool parameter specifies whether to include quotation marks (true) or not (false) in the result.

All too easy:
@Html.Raw(myString)

Related

Javascript How to escape \u in string literal

Strange thing...
I have a string literal that is passed to my source code as a constant token (I cannot prehandle or escape it beforehand).
Example
var username = "MYDOMAIN\tom";
username = username.replace('MYDOMAIN','');
The string somewhere contains a backslash followed by a character.
It's too late to escape the backslash at this point, so I have to escape these special characters individually like
username = username.replace(/\t/ig, 't');
However, that does not work in the following scenario:
var username = "MYDOMAIN\ulrike";
\u seems to introduce a unicode character sequence. \uLRIK cannot be interpreted as a unicode sign so the Javascript engine stops interpreting at this point and my replace(/\u/ig,'u') comes too late.
Has anybody a suggestion or workaround on how to escape such a non-unicode character sequence contained in a given string literal? It seems a similar issue with \b like in "MYDOMAIN\bernd".
I have a string literal that is passed to my source code
Assuming you don't have any < or >, move this into an HTML control or element (instead of your script block) and use JavaScript to read the value. Something like
<div id="myServerData">
MYDOMAIN\tom
</div>
and you retrieve it so
alert(document.getElementById("myServerData").innerText);
IMPORTANT : injecting unescaped content, where the user can control the content (say this is data entered in some other page) is a security risk. This goes for whether you are injecting it in script or HTML
Writing var username = "MYDOMAIN\ulrike"; will throw a syntax error. I think you have this string coming from somewhere.
I would suggest creating some HTML element, setting its innerHTML to the received value, and then picking it up.
Have something like:
<div id="demo"></div>
Then do document.getElementById("demo").innerHTML = username;
Then read the value from there as document.getElementById("demo").innerHTML;
This should work I guess.
Important: Please make sure this does not expose the webpage to script injections. If it does, this method is bad, don't use it.
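The parse-time failure mentioned above is easy to reproduce. This is only a small sketch: the helper name parses is made up, and it relies on new Function to compile source text without running it.

```javascript
// Returns true when `source` parses as JavaScript, false on a SyntaxError.
function parses(source) {
  try {
    new Function(source); // compiles the body without running it
    return true;
  } catch (err) {
    if (err instanceof SyntaxError) return false;
    throw err;
  }
}

console.log(parses('var username = "MYDOMAIN\\tom";'));    // true: \t is a valid escape (tab)
console.log(parses('var username = "MYDOMAIN\\ulrike";')); // false: \u needs four hex digits

// A tab that slipped in can still be repaired after the fact.
const username = "MYDOMAIN\tom".replace(/\t/g, "\\t");
console.log(username); // MYDOMAIN\tom
```

So \t and \b can be patched up afterwards because the literal still parses, while \ulrike never gets past the parser, which is why the value has to be escaped (or moved out of the script) on the server side.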

Why are endline characters illegal in HTML string sent over ajax?

Within HTML, it is okay to have endline characters. But when I try to send HTML strings that have endline characters over AJAX to have them operated with JavaScript/jQuery, it returns an error that says that endline characters are illegal. For example, if I have a Ruby string:
"<div>Hello</div>"
and jsonify it with Ruby by to_json, and send it over ajax, parse it within JavaScript by JSON.parse, and insert that in jQuery like:
$('body').append('<div>Hello</div>');
then it does not return an error, but if I do a similar thing with a string like
"<div>Hello\n</div>"
it returns an error. Why are they legal in HTML and illegal in AJAX? Are there any other differences between a legal HTML string loaded as a page and legal HTML string sent over ajax?
String literals can contain line breaks; they just need to be escaped with a backslash, like so:
var string = "hello\
world!";
However, this does not create a line break in the string, as it must be an explicit \n escape sequence. This would technically become helloworld. Doing
var string = "hello"
+ "world"
would be much cleaner
Specify the type of the ajax call as 'html'. jQuery will try to infer the type when parsing the response.
If the response is json, newlines should be escaped.
I'd recommend using a library to serialize json. You're unlikely to handle all the edge cases if you roll your own.
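A quick sketch of what a serializer does with the newline, using the built-in JSON object (the variable names are just for illustration):

```javascript
const html = "<div>Hello\n</div>";

// A serializer escapes the line feed as the two characters \ and n.
const json = JSON.stringify(html);
console.log(json);                      // "<div>Hello\n</div>"
console.log(JSON.parse(json) === html); // true

// Hand-built JSON with a raw line feed inside the string is rejected.
let handBuiltFails = false;
try {
  JSON.parse('"<div>Hello\n</div>"');
} catch (err) {
  handBuiltFails = err instanceof SyntaxError;
}
console.log(handBuiltFails); // true
```

The same applies on the server: to_json in Ruby produces the escaped form, so a raw newline usually means the JSON was assembled by hand somewhere along the way.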
Strings in JavaScript MUST appear on a single line, with the exception of escaping that line:
var str = "abc \
def";
However note that the newline is escaped and will not appear in the string itself.
The best option is \n, but note that if it is already going through something that parses \n then you will need to double-escape it as \\n.
Seeing how you're already escaping the JSON properly by using to_json in Ruby, I do believe the bug is in jQuery; when there are newlines in the string it has trouble determining whether you meant to create a single element or a document fragment. This would work just fine:
var str = "<div>Hello\n</div>";
var wrapper = document.createElement('div');
wrapper.innerHTML = str;
$('body').append(wrapper);

Is it enough to use HTMLEncode to display uploaded text?

We're allowing users to upload pictures and provide a text description. Users can view this through a pop-up box (actually a div) via JavaScript. The uploaded text is a parameter to a JavaScript function. I'm worried about XSS and am also finding issues with HTMLEncode().
We're using HTMLEncode to guard against XSS. Unfortunately, we're finding that HTMLEncode() only replaces '<' and '>'. We also need to replace single and double quotes that people may include. Is there a single function that will do all these special type characters or must we do that manually via .NET string.Replace()?
Unfortunately, we're finding that HTMLEncode() only replaces '<' and '>'.
Assuming you are talking about HttpServerUtility.HtmlEncode, that does encode the double-quote character. It also encodes as character references the range U+0080 to U+00FF, for some reason.
What it doesn't encode is the single quote. Bit of a shame but you can usually work around it by using only double quotes as attribute value delimiters in your HTML/XML. In that case, HtmlEncode is enough to prevent HTML-injection.
However, javascript is in your tags, and HtmlEncode is decidedly not enough to escape content to go in a JavaScript string literal. JavaScript-encoding is a different thing to HTML-encoding, so if that's the reason you're worried about the single quote then you need to employ a JS string encoder instead.
(A JSON encoder is a good start for that, but you would want to ensure it encodes the U+2028 and U+2029 characters which are, annoyingly, valid in JSON but not in JavaScript. Also you might well need some variety of HTML-escaping on top of that, if you have JavaScript in an HTML context. This can get hairy; it's usually better to avoid these problems by hiding the content you want in plain HTML, for example in a hidden input or custom attribute, where you can use standard HTML-escaping, and then read that data from the DOM in JS.)
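As a sketch of the "JSON encoder plus a bit extra" idea, the function below JSON-encodes a value and then escapes U+2028, U+2029 and <. The name toScriptLiteral is hypothetical. Engines from ES2019 onwards accept raw U+2028/U+2029 in string literals, so those two replacements mainly matter for older browsers.

```javascript
// Sketch: JSON-encode, then escape characters that are risky inside <script>.
function toScriptLiteral(value) {
  return JSON.stringify(value)
    .replace(/\u2028/g, "\\u2028") // line separator
    .replace(/\u2029/g, "\\u2029") // paragraph separator
    .replace(/</g, "\\u003c");     // stops "</script>" from closing the block
}

const description = "line\u2028break </script><b>it's \"quoted\"</b>";
const literal = toScriptLiteral(description);

console.log(literal);
console.log(new Function("return " + literal + ";")() === description); // true
```

This only covers a literal placed directly inside a script block; a literal inside an HTML attribute still needs HTML-escaping on top.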
If the text description is embedded inside a JavaScript string literal, then to prevent XSS, you will need to escape special characters such as quotes, backslashes, and newlines. The HttpUtility.HtmlEncode method is not suitable for this task.
If the JavaScript string literal is in turn embedded inside HTML (for example, in an attribute), then you will need to apply HTML encoding as well, on top of the JavaScript escaping.
You can use Microsoft's Anti-Cross Site Scripting library to perform the necessary escaping and encoding, but I recommend that you try to avoid doing this yourself. For example, if you're using WebForms, consider using an <asp:HiddenField> control: Set its Value property (which will be HTML-encoded automatically) in your server-side code, and access its value property from client-side code.
How about HTML-encoding all of the input with this extended function:
private string HtmlEncode(string text)
{
    char[] chars = HttpUtility.HtmlEncode(text).ToCharArray();
    StringBuilder result = new StringBuilder(text.Length + (int)(text.Length * 0.1));
    foreach (char c in chars)
    {
        int value = Convert.ToInt32(c);
        if (value > 127)
            result.AppendFormat("&#{0};", value);
        else
            result.Append(c);
    }
    return result.ToString();
}
This function will convert all non-English characters, symbols, quotes, etc. to HTML entities.
Try it out and let me know if this helps.
If you're using ASP.NET MVC2 or ASP.NET 4 you can replace <%= with <%: to encode your output. It's safe to use for everything it seems (like HTML Helpers).
There is a good write up of this here: New <%: %> Syntax for HTML Encoding Output in ASP.NET 4 (and ASP.NET MVC 2)

set a text from a java object with new lines to a javascript variable in JSP

I've a Java String with new lines(\n), say for example
String value = "This is a variable\n\nfrom\nJava";
Now I've to set this to a Javascript variable in a JSP file,
<script>var val = '<%= value %>';</script>
But because of the new lines in the above line, I'm getting javascript error "Unterminated String".
Please help me.
Use StringEscapeUtils#escapeEcmaScript() before printing it to JSP.
Newlines will be only one issue. To properly escape the string for display as a JavaScript literal, you have to handle newlines and a wide variety of other characters (not least backslashes and whatever quotes you're using). This isn't hard, but it's non-trivial. Effectively you need to search the string for a range of values (regular expressions are useful here) and substitute the JavaScript escape code (\n, etc.) for it. To avoid charset issues, when doing this sort of thing I escape anything that isn't ASCII into either the JavaScript named escape (\n) or a Unicode escape (\u1234).
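The search-and-substitute approach described above might look roughly like this. It is written in JavaScript so it can be tried in a console; the same regular expression idea carries over to Java. The names NAMED and escapeForJsLiteral are made up for the example.

```javascript
// Named escapes for the characters that have one.
const NAMED = { "\n": "\\n", "\r": "\\r", "\t": "\\t", "\\": "\\\\", "'": "\\'", '"': '\\"' };

function escapeForJsLiteral(text) {
  // Match quotes and backslashes plus anything outside printable ASCII.
  return text.replace(/[\\'"]|[^\x20-\x7e]/g, (ch) =>
    NAMED[ch] || "\\u" + ch.charCodeAt(0).toString(16).padStart(4, "0")
  );
}

const value = "This is a variable\n\nfrom\nJava: café's \\ test";
const escaped = escapeForJsLiteral(value);

console.log(escaped);
console.log(new Function("return '" + escaped + "';")() === value); // true
```

Note that this sketch leaves < untouched, so a value containing </script> would still need extra handling when the literal sits inside a script block.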

Jquery embedded quote in attribute

I have a custom attribute that is being filled from a database. This attribute can contain an embedded single quote like this,
MYATT='Tony\'s Test'
At some point in my code I use jQuery to copy this attribute to a field like this,
$('#MY_DESC').val($(recdata).attr('MYATT'));
MY_DESC is a text field in a dialog box. When I display the dialog box all I see in the field is
Tony\
What I need to see is,
Tony's Test
How can I fix this so I can see the entire string?
Try:
MYATT='Tony&#x27;s Test'
I didn't bother verifying this with the HTML spec, but the wikipedia entry says:
The ability to "escape" characters in this way allows for the characters < and & (when written as &lt; and &amp;, respectively) to be interpreted as character data, rather than markup. For example, a literal < normally indicates the start of a tag, and & normally indicates the start of a character entity reference or numeric character reference; writing it as &amp; or &#x26; or &#38; allows & to be included in the content of elements or the values of attributes. The double-quote character ("), when used to quote an attribute value, must also be escaped as &quot; or &#x22; or &#34; when it appears within the attribute value itself. The single-quote character ('), when used to quote an attribute value, must also be escaped as &#x27; or &#39; (should NOT be escaped as &apos; except in XHTML documents) when it appears within the attribute value itself. However, since document authors often overlook the need to escape these characters, browsers tend to be very forgiving, treating them as markup only when subsequent text appears to confirm that intent.
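If the attribute is generated by code, the escaping described in that quote can be done with a small helper before the value is written into the markup. A minimal sketch (the name escapeAttribute is hypothetical):

```javascript
// Sketch: escape a value for use inside a quoted HTML attribute.
function escapeAttribute(value) {
  return value
    .replace(/&/g, "&amp;")  // ampersand first, so later entities stay intact
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#x27;");
}

console.log("MYATT='" + escapeAttribute("Tony's Test") + "'");
// MYATT='Tony&#x27;s Test'
```

The browser decodes the character reference when it parses the attribute, so attr('MYATT') then returns the plain text Tony's Test.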
If the value will never contain double quotes, wrap your custom attribute in double quotes instead :)
Otherwise, I suggest escaping the value.
Before setting the value of your text field, you might try running a regular expression against the string to remove all backslashes from the string.
If you do this:
alert($(recdata).attr('MYATT'));
You will see the same result of "Tony\" meaning that the value isn't being properly consumed by the browser. The escaped \' value isn't working in this case.
Do you have the means to edit these values as they are being produced? Can you parse them to include escape values before being rendered?
