html entity is not rendered - javascript

If I just put this in the XUL file
<label value="&deg;C"/>
it works fine. However, I need to assign the &deg; value to that label element, and it doesn't show the degree symbol; it shows the literal text instead.
UPD
Sorry guys, I missed a couple of words here: it doesn't work from within JavaScript. If I assign mylabel.value = degree + "&deg;", this shows the literal value.
It shows the degree symbol only if I put the above manually in the XUL file.

What happens when you use a JavaScript escape, like "\u00B0C", instead of "&deg;C"?
Or when using mylabel.innerHTML instead of mylabel.value? (According to MDC, this should be possible.)
EDIT: you can convert those entities to JavaScript escapes using the Unicode Code Converter.

This makes sense to me. When you express the entity in an attribute value within XML markup, the XML parser interpolates the entity reference and then sets the label value to the result. From Javascript, however, there's no XML parser to do that work for you, and in fact life would be pretty nasty if there were! Note that when you set the value attribute (from Javascript) of an <input type='text'> element, you don't have to worry about having to escape XML entities (or even angle brackets, for that matter). However, you do have to worry about XML entities when you're setting the "value" attribute within XML markup.
Another way to think about it is this: XML entity notation is XML syntax, not Javascript syntax. In Javascript, you can produce special characters using 16-bit Unicode escape sequences, which look like \u followed by a four-digit hex constant. As noted in Marcel Korpel's answer, if you know what Unicode value is produced by the XML entity, then you should be able to use that directly from Javascript. In this case, you could use "\u00B0".
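As a minimal sketch of that point, the snippet below builds the label text with a JavaScript escape instead of an XML entity. The helper name formatCelsius is made up for illustration, and the commented-out assignment assumes a label element held in mylabel, as in the question.

```javascript
// Hypothetical helper: builds the label text in plain JavaScript.
// "\u00B0" is the JavaScript escape for U+00B0 DEGREE SIGN.
function formatCelsius(degrees) {
  return degrees + "\u00B0C";
}

const text = formatCelsius(21);
console.log(text); // 21°C

// In the XUL document the string would be assigned as-is (mylabel is assumed):
// mylabel.value = text;
```

No entity decoding happens at any point here; the string already contains the degree character when it reaches the element.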

It will not work that way. Can you change it to look like this instead?
<label>&deg;C</label>

Related

how to set ASCII code to the button when the page loads

I'm designing a chess board in HTML. &#9814 is the code to display the WHITE ROOK.
I'm trying to set the value when the page loads, but it is displayed as a literal string and the rook does not appear on the button.
function load() {
    document.getElementById('A1').value="&#9814";
}
function load() {
    document.getElementById('A1').innerHTML="&#9814;";
}
http://jsfiddle.net/RM5VD/2/
The notations &#9814 and &#9814; have no special meaning in JavaScript; they are just strings of characters, though such strings can be assigned to the innerHTML property, causing HTML parsing.
The simplest way to use a Unicode character in JavaScript is to insert it as such, though this requires a suitable editor and the use of the UTF-8 character encoding. Example:
document.getElementById('A1').value = '♖';
The next simplest is to use the JavaScript escape notation, namely \u followed by exactly four hexadecimal digits. Since WHITE CHESS ROOK is U+2656 (2656 hex = 9814 decimal), you would use this:
document.getElementById('A1').value="\u2656";
This makes sense only if the element modified has the value property as per HTML specs. For example, <input type=button> has it, but button doesn’t. But this affects just the left hand of the assignment, i.e. what you assign the string to.
Beware that font support to chess piece characters like this is rather limited. Moreover, browsers may have their own ideas of the font to be used in buttons. In practice, you should probably use some downloadable font.
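To make the equivalence concrete, here is a small sketch showing that the literal character, the \u escape, and the decimal code all yield the same one-character string. The variable names are illustrative only, and the final commented line assumes an input element with id A1 as in the question.

```javascript
// Different spellings of U+2656 WHITE CHESS ROOK in JavaScript.
const rookLiteral = "♖";                         // typed directly (file saved as UTF-8)
const rookEscape = "\u2656";                     // \u plus four hex digits
const rookFromDecimal = String.fromCharCode(9814);
const rookFromHex = String.fromCodePoint(0x2656);

console.log(0x2656 === 9814);                    // true
console.log(rookLiteral === rookEscape);         // true
console.log(rookEscape === rookFromDecimal);     // true
console.log(rookFromDecimal === rookFromHex);    // true

// In a page (assumed markup: <input type="button" id="A1">):
// document.getElementById('A1').value = rookEscape;
```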
You need to rewrite the function like this:
function load() {
document.getElementById('A1').value=String.fromCharCode(9814);
}
It's not clear exactly what kind of element you're modifying, but you may need to modify the innerHTML instead of the value depending on your situation.
The way you have it, it is passing the text as a literal string, not a representation of a single character.
jsFiddle example

Selecting element by unescaped data attribute

Without going into specifics why I'm doing this... (it should be encoded to begin with, but it's not for reasons outside my control)
Say I have a bit of HTML that looks like this
<tr data-path="files/kissjake's files"">...</tr> so the actual data-path is files/kissjake's files"
How do I go about selecting that <tr> by its data path?
The best I can currently do is when I bring the variables into JS and do any manipulation, I URLEncode it so that I'm always working with the encoded version. jQuery seems smart enough to determine the data-path properly so I'm not worried about that.
The problem is on one step of the code I need to read from a data-path of another location, and then compare them.
Actually selecting this <tr> is what's confusing me.
Here is my coffeescript
oldPriority = $("tr[data-path='#{path}']").attr('data-priority')
If I interpolate the URLEncoded version of the path, it doesn't find the TR. And I can't URLDecode it because then jQuery breaks as there are multiple ' and " conflicting in the path.
I need some way to select any <tr> that matches a particular data attribute, even if it's not encoded in the HTML to begin with.
First, did you mean to have the extra " in there? You will have to escape that, as it's not valid HTML.
<tr data-path="files/kissjake's files"">...</tr>
To select it, you need to escape inside the selector. Here's an example of how that would look:
$("tr[data-path='files/kissjake\\'s files\"']")
Explanation:
\\' is used to escape the ' inside the CSS selector. Since ' is inside other single quotes, it must be escaped at the CSS level. The reason there are two backslashes (\\) is that we must escape the backslash itself in JavaScript so that a single backslash makes it into the selector string.
Simpler example: 'John\\'s' yields the string John\'s.
\" is used to escape the double quote which is contained inside the other double quotes. This one is being escaped on the JS level (not the CSS level), so only one slash is used because we don't need a slash to actually be inside the string contents.
Simpler example: 'Hello \"World\"' yields the string Hello "World".
Update
Since you don't have control over how the HTML is output, and you are doomed to deal with invalid HTML, that means the extra double quote should be ignored. So you can instead do:
$("tr[data-path='files/kissjake\\'s files']")
Just the \\' part to deal with the single quote. The extra double quote should be handled by the browser's lenient HTML parser.
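Rather than escaping by hand for every path, the selector can be built by a small function. The following is only a sketch: attrSelector is a hypothetical helper, it is written in plain JavaScript rather than the CoffeeScript of the question, and it handles only backslashes and double quotes (a value containing line breaks would need more work). In browsers, the built-in CSS.escape is another option.

```javascript
// Hypothetical helper: wraps a raw attribute value in a double-quoted CSS string.
function attrSelector(tag, attr, value) {
  const escaped = value
    .replace(/\\/g, "\\\\") // backslashes first, so later escapes are not doubled
    .replace(/"/g, '\\"');  // then the quote character used as the delimiter
  return tag + "[" + attr + '="' + escaped + '"]';
}

const selector = attrSelector("tr", "data-path", "files/kissjake's files");
console.log(selector); // tr[data-path="files/kissjake's files"]

// With jQuery loaded (assumed), the lookup from the question becomes:
// const oldPriority = $(selector).attr("data-priority");
```

Using double quotes as the CSS string delimiter means the apostrophe in the path needs no escaping at all.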
Building off of @Nathan Wall's answer, this will select all <tr> tags that have a data-path attribute on them.
$("tr[data-path]");

what kind of encoding is this?

I've got some data from DBpedia using Jena, and since Jena's output is XML-based, there are cases where XML characters are treated specially, like the following:
Guns n & amp;#039; Roses
I just want to know what kind of encoding this is.
I want to decode/encode my input in the same way with the help of JavaScript and send it back to a servlet.
(Edited post: if you remove the space between & and amp you will get the actual string. I couldn't find a way to show that on Stack Overflow, so I decided to write it like this.)
Seems to be XML entity encoding, and a numeric character reference (decimal).
A numeric character reference refers to a character by its Universal Character Set/Unicode code point, and uses the format &#nnnn; or &#xhhhh;
You can get some info here: List of XML and HTML character entity references on Wikipedia.
Your character is number 39, being the apostrophe: ', which can also be referenced with a character entity reference: &apos;.
To decode this using Javascript, you could use for example php.js, which has an html_entity_decode() function (note that it depends on get_html_translation_table()).
UPDATE: in reply to your edit: basically that is the same; the only difference is that it was encoded twice (possibly by mistake). &amp; is the ampersand: &.
This is an SGML/HTML/XML numeric character entity reference.
In this case for an apostrophe '.
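For this particular case a full entity table is not needed. The sketch below decodes only numeric character references and &amp; in a single pass, which is enough to show why a doubly encoded apostrophe needs two passes. The function name decodeBasicReferences is made up for this example, and other named entities are deliberately left untouched.

```javascript
// Hypothetical helper: decodes &#nnnn;, &#xhhhh; and &amp; in one pass.
function decodeBasicReferences(text) {
  return text.replace(/&(?:#x([0-9a-f]+)|#([0-9]+)|(amp));/gi, (match, hex, dec, amp) => {
    if (amp) return "&";
    const codePoint = hex ? parseInt(hex, 16) : parseInt(dec, 10);
    // Leave out-of-range references as they are instead of throwing.
    return codePoint <= 0x10ffff ? String.fromCodePoint(codePoint) : match;
  });
}

const once = decodeBasicReferences("Guns n &amp;#039; Roses");
console.log(once);                        // Guns n &#039; Roses
console.log(decodeBasicReferences(once)); // Guns n ' Roses
```

Because the replacement runs in one pass, the text produced by decoding &amp; is not decoded again, so singly encoded input is never over-decoded.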

Is it enough to use HTMLEncode to display uploaded text?

We're allowing users to upload pictures and provide a text description. Users can view this through a pop-up box (actually a div) via JavaScript. The uploaded text is a parameter to a JavaScript function. I'm worried about XSS and am also finding issues with HTMLEncode().
We're using HTMLEncode to guard against XSS. Unfortunately, we're finding that HTMLEncode() only replaces '<' and '>'. We also need to replace single and double quotes that people may include. Is there a single function that will do all these special type characters or must we do that manually via .NET string.Replace()?
Unfortunately, we're finding that HTMLEncode() only replaces '<' and '>'.
Assuming you are talking about HttpServerUtility.HtmlEncode, that does encode the double-quote character. It also encodes as character references the range U+0080 to U+00FF, for some reason.
What it doesn't encode is the single quote. Bit of a shame but you can usually work around it by using only double quotes as attribute value delimiters in your HTML/XML. In that case, HtmlEncode is enough to prevent HTML-injection.
However, javascript is in your tags, and HtmlEncode is decidedly not enough to escape content to go in a JavaScript string literal. JavaScript-encoding is a different thing to HTML-encoding, so if that's the reason you're worried about the single quote then you need to employ a JS string encoder instead.
(A JSON encoder is a good start for that, but you would want to ensure it encodes the U+2028 and U+2029 characters which are, annoyingly, valid in JSON but not in JavaScript. Also you might well need some variety of HTML-escaping on top of that, if you have JavaScript in an HTML context. This can get hairy; it's usually better to avoid these problems by hiding the content you want in plain HTML, for example in a hidden input or custom attribute, where you can use standard HTML-escaping, and then read that data from the DOM in JS.)
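To illustrate the JSON-based approach described above, here is a rough sketch written in JavaScript for brevity (the thread's server side is .NET, so this shows the idea rather than the code you would deploy). The name toScriptSafeLiteral is hypothetical. It starts from JSON.stringify and then escapes U+2028, U+2029, and < so that the result cannot close a surrounding script element.

```javascript
// Hypothetical helper: turns arbitrary text into a JavaScript string literal
// intended for embedding inside a <script> block.
function toScriptSafeLiteral(text) {
  return JSON.stringify(text)
    .replace(/\u2028/g, "\\u2028") // line separator
    .replace(/\u2029/g, "\\u2029") // paragraph separator
    .replace(/</g, "\\u003c");     // prevents "</script>" from appearing
}

const description = 'It\'s </script> "quoted" text';
const literal = toScriptSafeLiteral(description);
console.log(literal);

// The literal still evaluates back to the original text.
console.log(JSON.parse(literal) === description); // true
```

This covers the script-block context only. If the literal ends up inside an HTML attribute such as onclick, HTML escaping is still needed on top, as noted above.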
If the text description is embedded inside a JavaScript string literal, then to prevent XSS, you will need to escape special characters such as quotes, backslashes, and newlines. The HttpUtility.HtmlEncode method is not suitable for this task.
If the JavaScript string literal is in turn embedded inside HTML (for example, in an attribute), then you will need to apply HTML encoding as well, on top of the JavaScript escaping.
You can use Microsoft's Anti-Cross Site Scripting library to perform the necessary escaping and encoding, but I recommend that you try to avoid doing this yourself. For example, if you're using WebForms, consider using an <asp:HiddenField> control: Set its Value property (which will be HTML-encoded automatically) in your server-side code, and access its value property from client-side code.
How about HTML-encoding all of the input with this extended function:
private string HtmlEncode(string text)
{
    // Let the framework encode the HTML special characters first.
    char[] chars = HttpUtility.HtmlEncode(text).ToCharArray();
    StringBuilder result = new StringBuilder(text.Length + (int)(text.Length * 0.1));
    foreach (char c in chars)
    {
        int value = Convert.ToInt32(c);
        // Emit everything outside the ASCII range as a numeric character reference.
        if (value > 127)
            result.AppendFormat("&#{0};", value);
        else
            result.Append(c);
    }
    return result.ToString();
}
This function converts all non-ASCII characters and symbols to numeric character references, on top of whatever HttpUtility.HtmlEncode already escapes.
Try it out and let me know if this helps.
If you're using ASP.NET MVC2 or ASP.NET 4 you can replace <%= with <%: to encode your output. It's safe to use for everything it seems (like HTML Helpers).
There is a good write up of this here: New <%: %> Syntax for HTML Encoding Output in ASP.NET 4 (and ASP.NET MVC 2)

filtering escaped angle brackets in javascript

I have a javascript feature that allows users to place arbitrary text strings on a page. I don't want them to be able to insert html or other code, just plain text.
So I figure that stripping out all angle brackets (< >) would do the trick. (I don't care if they have 'broken' HTML on the page, or that they're not able to put angle brackets in their text.) Then I realized I had to filter escaped angle brackets (&lt; &gt;) and probably others.
What all do I need to filter out, for security? Will removing all angle brackets do the trick?
Will removing all angle brackets do the trick?
Just replace all angle brackets with their escaped form. That way, people can write as much "code" as they like, and it just shows up as plain-text instead.
Make sure that the first thing you do is replace & with &amp;
a) For HTML content, just &lt; should be enough.
b) For attribute values, for example if it is going in <input name="sendtoserver" value="custom text"/>, you need to take care of double quotes (&quot;), but that is all that is necessary. Still, it is good to also do &lt; and &gt;.
It depends on the context. If you want to play it safe, tell your JavaScript to use innerText, which does not need encoding, but you may want to set the CSS to white-space:pre-wrap. This is less error-prone, but also less browser-compatible.
c) On a loosely related note, when escaping JavaScript string terminators using backslashes, the item that might sneak up on you is that if you place content in a script, you need to take care of </script> (not case sensitive). Escaping </ as <\/ (or just / as \/) should be enough.
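The replacement order described in this answer can be sketched as follows. The function name escapeHtml is illustrative, and the set of characters covers HTML content plus quoted attribute values; it is not meant for text placed inside a script block.

```javascript
// Hypothetical helper: escapes text for HTML content and quoted attribute values.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")  // must come first, or later entities get re-escaped
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

console.log(escapeHtml('<b onclick="x">Tom & Jerry</b>'));
// &lt;b onclick=&quot;x&quot;&gt;Tom &amp; Jerry&lt;/b&gt;
```

If the ampersand were replaced last instead, the & in each freshly produced &lt; would itself be escaped, and the user would see the entity text rather than the bracket.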
