Replace Options From AJAX Response in XSS Safe Manner


From within the success method of my AJAX response, my goal is to do the following in an XSS-safe manner:
remove all existing options within a select box.
replace the options from that same select box.
Here is one way to remove and replace the options, but I don't have high confidence that this strategy is entirely XSS safe:
success: function (data) {
$('#mySelBox').children().remove();
$.each(data, function (index, value) {
$('#mySelBox').append('<option value="' + value.id + '">' + value.description + '</option>');
});
}
Specifically:
I'm not sure if value.id is XSS safe in that context.
I'm not sure if value.description is safe in that context.
Referencing the OWASP XSS cheat sheet:
[Ensure] that all variables go through validation and are then escaped or sanitized is known as perfect injection resistance.
To that end here are my questions:
What is the sure way to escape and sanitize value.id in the above context?
What is the sure way to escape and sanitize value.description in the above context?
I also found this XSS prevention article. It made me aware of how complicated XSS prevention can be because there is not one single solution to the problem: the solution depends entirely upon the context.

Instead of concatenated HTML strings, use the DOM API to create the <option> element:
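// Look up the select box once; "mySelBox" is the id used in the question
var select = document.getElementById("mySelBox");
$.each(data, function (index, value) {
  var opt = document.createElement("option");
  opt.setAttribute("value", value.id);      // set as an attribute value, never parsed as HTML
  opt.textContent = value.description;      // inserted as plain text
  select.append(opt);
});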
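To cover both halves of the question (removing the old options and adding the new ones), the DOM approach above can be wrapped in one helper. This is only a sketch: the name replaceOptions and the doc parameter are hypothetical, and it assumes a browser recent enough to support Element.replaceChildren().

```javascript
// Hypothetical helper: swaps every <option> in selectEl for options built from items.
// `doc` defaults to the global document; passing it explicitly makes the function easy to test.
function replaceOptions(selectEl, items, doc = document) {
  const options = items.map(function (item) {
    const opt = doc.createElement('option');
    opt.value = String(item.id);                 // property assignment, never parsed as HTML
    opt.textContent = String(item.description);  // becomes a text node, not markup
    return opt;
  });
  // Removes all existing children and appends the new ones in a single call
  selectEl.replaceChildren(...options);
}
```

Inside the success callback this could be called as replaceOptions(document.getElementById('mySelBox'), data). Because no HTML string is ever built, there is nothing to escape.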

HTML ESCAPING
When dealing with potentially dangerous input, you need to escape HTML before updating the DOM, as in this example malicious string input value:
"<script>alert('hacked');</script>"
If you render it directly within an options element, the malicious script executes:
<select id='mySelBox'>
<option value='firstId'>first value</option>
<option value='secondId'><script>alert('hacked');</script></option>
</select>
By using escaping, you ensure that the malicious value above is handled as text and not executable code. To write the escaping code yourself, see this answer, whose code I'm repeating below:
function escapeHtml(unsafe)
{
  // Replace & first so the entities produced below are not escaped twice
  return unsafe
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#039;");
}
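As a rough illustration of how such an escaping function would be applied to the question's data, the sketch below escapes both the attribute value and the option text before concatenating. The helper name buildOptionsHtml is made up for this example, and the field names id and description are taken from the question.

```javascript
// Escapes the five characters that matter in HTML text and quoted attribute values.
function escapeHtml(unsafe) {
  return String(unsafe)
    .replace(/&/g, '&amp;')   // must run first
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#039;');
}

// Hypothetical helper: builds the <option> markup for an array of {id, description} items.
function buildOptionsHtml(items) {
  return items.map(function (item) {
    return '<option value="' + escapeHtml(item.id) + '">' +
      escapeHtml(item.description) + '</option>';
  }).join('');
}
```

The resulting string could then be assigned with $('#mySelBox').html(buildOptionsHtml(data)). Note that the attribute value must stay inside quotes for this escaping to be sufficient.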
TEMPLATING LIBRARIES
In my opinion a cleaner way to do this is via a small templating library. One example that shows an approach for updating the DOM safely is mustache.js.
const myViewModel = {
myOptions: [
{
myId: 'firstId',
myValue: 'first value',
},
{
myId: 'secondId',
myValue: "<script>alert('hacked');</script>",
}
]
};
const myHtmlTemplate = `
{{#myOptions}}
<option id='{{myId}}'>{{myValue}}</option>
{{/myOptions}}
`;
const safeHtml = Mustache.render(myHtmlTemplate, myViewModel);
const selBox = document.querySelector('#mySelBox');
if (selBox) {
selBox.innerHTML = safeHtml;
}
The select box would then be rendered safely, with the malicious value displayed as literal text instead of being executed.
WEB FRAMEWORKS
Web frameworks such as React and Angular also handle HTML safely, in a similar way. Whatever your preferences, using some kind of respected library or framework is usually recommended, due to the best practice guidance that comes with it.

Related

Create object from text

I have javascript text:
var textObject = '
{
news: [
{
"title":"aaa",
"desc":"bbb"
}, {
"title":"ccc",
"desc":"ddd"
} ]
};
'
but this is text held in my variable. If I put this directly in my HTML code it works fine, but I get this data via AJAX from a PHP script.
So how can I convert/parse this text to an object? If I had JSON I could use JSON.parse(textObject); but this isn't JSON.

eval is frowned upon for a lot of reasons, but it has its benefits if used properly: it is used by a lot of template engines and a few other things, and it will convert a string to an object.
var someString = '{obj: "with content"}';
// The parentheses make eval parse the braces as an object literal rather than a block statement
var result = eval( '(' + someString + ')' );
Here is a working example with your string: http://jsfiddle.net/kkemple/CwzRh/

Using eval can result in serious performance degradation.
Since you can't do JSON, then use the Function constructor instead so that the evaling takes place in the global scope, and the browsers can still optimize the local code.
var result = new Function("return " + textObject.trim())();
You'll need to shim .trim() to support IE8. If the string is as you show with line breaks at the beginning, then the .trim() will be necessary.
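Here is a small sketch of the Function constructor approach applied to a string shaped like the one in the question. The helper name parseObjectLiteral is hypothetical. Like eval, this executes whatever the string contains, so it is only appropriate when the PHP script producing the text is fully trusted.

```javascript
// Hypothetical helper: evaluates a JavaScript object literal held in a string.
// WARNING: this runs the text as code, so use it only with trusted input.
function parseObjectLiteral(text) {
  // Trim surrounding whitespace and drop a trailing semicolon, if any
  const source = text.trim().replace(/;\s*$/, '');
  // Wrapping in parentheses forces the braces to be read as an object literal
  return new Function('"use strict"; return (' + source + ');')();
}

const textObject = `
{
  news: [
    { "title": "aaa", "desc": "bbb" },
    { "title": "ccc", "desc": "ddd" }
  ]
};
`;

const parsed = parseObjectLiteral(textObject);
console.log(parsed.news[1].title); // ccc
```

If the server side can be changed, emitting real JSON and using JSON.parse avoids code execution entirely.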

Better Way to Sanitize HTML for Insertion

In a recent review by the AMO editors, my Thunderbird addon's version was rejected because it "creates HTML from strings containing unsanitized data" - which "is a major security risk".
I think I understand why. Now, my problem is about how to solve that issue.
This thread gave me some clues, but it's not quite what I need.
My addon needs to paste the contents of the clipboard as a hyperlink, by using the clipboard contents as the link text, and inserting html around it like this: "<a href='...'>" + clipboardtext + "</a>".
Now, if I am inserting the clipboard contents as HTML, I need to "sanitize" it first. Here is what I came up with. Now, I haven't written in the regex part yet, because I don't think this is the best way to do this, although I think it will work:
function makeSafeHTML(whathtml){
var parser = Cc["@mozilla.org/parserutils;1"].getService(Ci.nsIParserUtils);
var sanitizedHTML = parser.sanitize(whathtml, 01);
//now remove the extratags added by the sanitization method, perhaps via regex
//"<html><head></head><body>"
//"</body></html>"
return sanitizedHTML;
}
My intent is to do this with the resulting "sanitized" string - this will paste the string as the href value of a hyperlink:
var html_editor = editor.QueryInterface(Components.interfaces.nsIHTMLEditor);
html_editor.insertHTML("<a href='"+whathref+"'>"+whattext+"</a>");
So I am looking for a better way to get sanitized HTML into a simple string variable. Would any of you do it this way?

It seems that you simply want to insert clipboard contents into HTML code as pure text - you don't need any complicated escaping approach then, it's enough to make sure all "dangerous" characters are replaced by HTML entities:
var sanitizedText = text.replace(/&/g, "&amp;").replace(/</g, "&lt;")
.replace(/>/g, "&gt;").replace(/"/g, "&quot;");
It's not clear from your question what you do with the generated HTML code. If you add it to a DOM document via something like innerHTML then you can do better - add the HTML code first and manipulate the text in the document then:
document.getElementById("text-container").textContent = text;
Using Node.textContent to set text in a document is always safe, no escaping needs to be performed.
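For the string-building case in the question, where the clipboard text ends up both as link text and as an href, escaping alone does not stop a javascript: URL. The sketch below is one possible way to combine the entity replacement above with a protocol check. The helper name buildSafeLink and the list of allowed protocols are assumptions made for this example, not part of any Mozilla API.

```javascript
// Same entity replacement as above, plus the single quote
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#039;');
}

// Assumption: only these protocols are acceptable for pasted links
const ALLOWED_PROTOCOLS = ['http:', 'https:', 'mailto:'];

// Hypothetical helper: returns anchor markup, or null if href is not an acceptable URL
function buildSafeLink(href, text) {
  let url;
  try {
    url = new URL(href);          // throws on strings that are not absolute URLs
  } catch (e) {
    return null;
  }
  if (!ALLOWED_PROTOCOLS.includes(url.protocol)) {
    return null;                  // rejects javascript:, data:, and similar schemes
  }
  return '<a href="' + escapeHtml(url.href) + '">' + escapeHtml(text) + '</a>';
}
```

A caller would pass the result to insertHTML only when it is not null, and fall back to inserting plain text otherwise.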

Backbone: Should model.escape be used instead of model.get?

I was doing some reading on Cross-Site Scripting (XSS) attacks today. It seems that Backbone has model.escape('attr') built in and from what I can tell that should always be used instead of model.get('attr') to prevent these attacks.
I did some initial searching but didn't find any recommendations of the sort. Should I always use model.escape('attr') when retrieving values from a model?

Using Underscore templates, I've generally seen/done it like this:
var TemplateHtml = "<div><%- someModelAttribute %></div>"; // Really, you should load from file using something like RequireJS
var View = Backbone.View.extend({
_template: _.template(TemplateHtml),
render: function() {
this.$el.html(this._template(this.model.toJSON()));
}
});
When you use <%- someModelAttribute %>, Underscore knows to escape the given values (as opposed to <%= someModelAttribute %> which injects the attribute directly without escaping).

Instead of model.escape(), see _.escape while rendering.
So, you can use your models as you want but be careful to escape while rendering.
It is enough to just use _.escape in your template while rendering.
This avoids XSS attacks.
See this method:
http://underscorejs.org/#escape

Yes, to avoid XSS attacks you can always use model.escape(), which is preferable because it escapes the HTML contents.
But if you are going to use the data straight away, you can simply use model.get().

I found a good article on when to use the backbone escape function. The author asserts that you should always use escape, apart from when you are definitely not going to be executing the value of a model attribute. For example if you were checking a model attribute was not null:
var model = new Backbone.Model({foo: "Bar"});
if (model.get("foo") != null) { //notice how here we did not use escape
$("h1").html(model.escape("foo")); //but here we do
}
One related point to be aware of is that if you check for the returned value from model.escape("foo") it will always return a string. So if you are expecting null then you may be confused.
console.log(model.get("foo")); // null
console.log(model.escape("foo")); // ""
However, as Jeremy Ashkenas points out in a pull request querying this issue, it does not make sense to check the existence of an attribute after escaping it.
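The difference between the two accessors can be seen without loading Backbone at all. The following is a plain JavaScript stand-in written for illustration only; makeModel and escapeHtml are hypothetical names, and this is not Backbone's actual implementation.

```javascript
// Stand-in for HTML escaping: null and undefined become an empty string
function escapeHtml(value) {
  if (value == null) {
    return '';
  }
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#x27;');
}

// Hypothetical minimal model exposing get() and escape() like the methods discussed above
function makeModel(attrs) {
  return {
    get: function (name) { return attrs[name]; },                  // raw value, for logic
    escape: function (name) { return escapeHtml(attrs[name]); }    // string, for rendering
  };
}

const model = makeModel({ foo: null, title: '<b>Hi</b>' });
console.log(model.get('foo') === null);   // true
console.log(model.escape('foo') === '');  // true
console.log(model.escape('title'));       // &lt;b&gt;Hi&lt;/b&gt;
```

The pattern that falls out is to test values with get() and render them with escape().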

How to generate RDF/XML using a template in javascript?

I have some working Javascript code that generates an RDF/XML document using variables picked up from HTML fields:
...
peopleDoap += " <foaf:name>" + person_name + "</foaf:name>\n";
if (person_url != "") {
peopleDoap += " <foaf:homepage rdf:resource=\"" + person_url + "/\"/>\n";
}
if (person_pic != "") {
peopleDoap += " <foaf:depiction rdf:resource=\"" + person_pic + "/\"/>\n";
}
...
It's hard, looking at this code, to get any sense of what the output will look like (especially as this code is scattered amongst sub functions etc).
I'm wondering if there is an easy way that would enable me to have something like this:
...
<foaf:name>%person_name%</foaf:name>
<foaf:homepage rdf:resource="%person_url%"/>
<foaf:depiction rdf:resource="%person_pic%"/>
...
And then some substitution code. One slight complication is if fields are left blank, I will want to not generate the whole element. Ie, if person_url='', the above should generate as:
...
<foaf:name>%person_name%</foaf:name>
<foaf:depiction rdf:resource="%person_pic%"/>
...
I guess I could do this fairly naively by defining the template as a huge string, then performing a bunch of replaces on it, but is there anything more elegant? Mild preference for native Javascript rather than libraries, but happy to be convinced...
(Btw, yes, since this is RDF/XML, maybe there is a smarter way using some kind of RDF library. If you want to address that part of the question instead, that's ok with me.)
Code is here.
Also, this is a widget running on a Jetty server. I don't think server-side code is an option.

I recommend using:
jQuery Templates
Mustache
John Resig's Micro-Templating
jQuery Templates are very powerful and nicely integrated with jQuery. That means that you can do things like this:
$.tmpl("Hello ${n}", {n: "World"}).appendTo('h1');
for the most simple stuff, or define templates in your HTML inside special script tags with custom MIME types, compile them, populate them with JSON data from AJAX calls, etc.

To add a bit of follow-up, I did implement John Resig's Micro-Templating (actually the refined version I posted earlier). However, I then backpedalled a bit. I found implementing control structures in the template is less readable than outside:
...
'<description xml:lang="en">#description</description>';
if (homepage) t +=
'<homepage rdf:resource="#homepage"/>';
...
rather than:
...
'<description xml:lang="en"><#= description #></description>' +
'<# if (homepage) { #>' +
'<homepage rdf:resource="<#= homepage #>"/>' +
'<# } #>';
...
I also ditched the microtemplating code for a simple substitution of variables, using #var rather than <# var #>.
Readability of templates like this is really critical, so I've done everything I could think of. In particular, keeping the javascript outside of the template lets syntax highlighting work, which is valuable to me.
That John Resig post also suggested burying the template in your HTML, in a <script> tag, but I preferred to keep it in my javascript, which is a separate .js file.
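A simple #var substitution along these lines might look like the sketch below. It is not the code from this answer: renderTemplate and escapeXml are hypothetical names, and the rule of dropping any line whose variable is blank is one way to meet the question's requirement about empty fields. Values are XML-escaped so that field contents cannot break the document.

```javascript
// Escapes the characters that are significant in XML text and attribute values
function escapeXml(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&apos;');
}

// Hypothetical helper: fills #name placeholders, one template line at a time.
// A line is omitted entirely if any placeholder on it has no value.
function renderTemplate(lines, values) {
  const out = [];
  for (const line of lines) {
    let missing = false;
    const rendered = line.replace(/#(\w+)/g, function (match, name) {
      const value = values[name];
      if (value === undefined || value === null || value === '') {
        missing = true;
        return '';
      }
      return escapeXml(value);
    });
    if (!missing) {
      out.push(rendered);
    }
  }
  return out.join('\n');
}

const template = [
  '<foaf:name>#person_name</foaf:name>',
  '<foaf:homepage rdf:resource="#person_url"/>',
  '<foaf:depiction rdf:resource="#person_pic"/>'
];
console.log(renderTemplate(template, {
  person_name: 'A & B',
  person_url: '',
  person_pic: 'http://example.org/p.png'
}));
```

One limitation of this sketch: any literal # followed by word characters in the template itself would be treated as a placeholder.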

How to decode HTML entities using jQuery?

How do I use jQuery to decode HTML entities in a string?

Security note: using this answer (preserved in its original form below) may introduce an XSS vulnerability into your application. You should not use this answer. Read lucascaro's answer for an explanation of the vulnerabilities in this answer, and use the approach from either that answer or Mark Amery's answer instead.
Actually, try
var encodedStr = "This is fun &amp; stuff";
var decoded = $("<div/>").html(encodedStr).text();
console.log(decoded);
<script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/3.3.1/jquery.min.js"></script>
<div/>

Without any jQuery:
function decodeEntities(encodedString) {
var textArea = document.createElement('textarea');
textArea.innerHTML = encodedString;
return textArea.value;
}
console.log(decodeEntities('1 &amp; 2')); // '1 & 2'
This works similarly to the accepted answer, but is safe to use with untrusted user input.
Security issues in similar approaches
As noted by Mike Samuel, doing this with a <div> instead of a <textarea> with untrusted user input is an XSS vulnerability, even if the <div> is never added to the DOM:
function decodeEntities(encodedString) {
var div = document.createElement('div');
div.innerHTML = encodedString;
return div.textContent;
}
// Shows an alert
decodeEntities('<img src="nonexistent_image" onerror="alert(1337)">')
However, this attack is not possible against a <textarea> because there are no HTML elements that are permitted content of a <textarea>. Consequently, any HTML tags still present in the 'encoded' string will be automatically entity-encoded by the browser.
function decodeEntities(encodedString) {
var textArea = document.createElement('textarea');
textArea.innerHTML = encodedString;
return textArea.value;
}
// Safe, and returns the correct answer
console.log(decodeEntities('<img src="nonexistent_image" onerror="alert(1337)">'))
Warning: Doing this using jQuery's .html() and .val() methods instead of using .innerHTML and .value is also insecure* for some versions of jQuery, even when using a textarea. This is because older versions of jQuery would deliberately and explicitly evaluate scripts contained in the string passed to .html(). Hence code like this shows an alert in jQuery 1.8:
//<!-- CDATA
// Shows alert
$("<textarea>")
.html("<script>alert(1337);</script>")
.text();
//-->
<script src="https://ajax.googleapis.com/ajax/libs/jquery/1.2.3/jquery.min.js"></script>
* Thanks to Eru Penkman for catching this vulnerability.

Like Mike Samuel said, don't use jQuery.html().text() to decode html entities as it's unsafe.
Instead, use a template renderer like Mustache.js or decodeEntities from @VyvIT's comment.
Underscore.js utility-belt library comes with escape and unescape methods, but they are not safe for user input:
_.escape(string)
_.unescape(string)

I think you're confusing the text and HTML methods. Look at this example, if you use an element's inner HTML as text, you'll get decoded HTML tags (second button). But if you use them as HTML, you'll get the HTML formatted view (first button).
<div id="myDiv">
here is a <b>HTML</b> content.
</div>
<br />
<input value="Write as HTML" type="button" onclick="javascript:$('#resultDiv').html($('#myDiv').html());" />
<input value="Write as Text" type="button" onclick="javascript:$('#resultDiv').text($('#myDiv').html());" />
<br /><br />
<div id="resultDiv">
Results here !
</div>
First button writes : here is a HTML content.
Second button writes : here is a <B>HTML</B> content.
By the way, you can see a plug-in that I found in jQuery plugin - HTML decode and encode that encodes and decodes HTML strings.

The question is limited by 'with jQuery' but it might help some to know that the jQuery code given in the best answer here does the following underneath...this works with or without jQuery:
function decodeEntities(input) {
var y = document.createElement('textarea');
y.innerHTML = input;
return y.value;
}

You can use the he library, available from https://github.com/mathiasbynens/he
Example:
console.log(he.decode("Jörg &amp Jürgen rocked to &amp; fro "));
// Logs "Jörg & Jürgen rocked to & fro"
I challenged the library's author on the question of whether there was any reason to use this library in clientside code in favour of the <textarea> hack provided in other answers here and elsewhere. He provided a few possible justifications:
If you're using node.js serverside, using a library for HTML encoding/decoding gives you a single solution that works both clientside and serverside.
Some browsers' entity decoding algorithms have bugs or are missing support for some named character references. For example, Internet Explorer will both decode and render non-breaking spaces (&nbsp;) correctly but report them as ordinary spaces instead of non-breaking ones via a DOM element's innerText property, breaking the <textarea> hack (albeit only in a minor way). Additionally, IE 8 and 9 simply don't support any of the new named character references added in HTML 5. The author of he also hosts a test of named character reference support at http://mathias.html5.org/tests/html/named-character-references/. In IE 8, it reports over one thousand errors.
If you want to be insulated from browser bugs related to entity decoding and/or be able to handle the full range of named character references, you can't get away with the <textarea> hack; you'll need a library like he.
He just darn well feels like doing things this way is less hacky.

encode:
<script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/3.3.1/jquery.min.js"></script>
<textarea/>
decode:
$("<textarea/>").html('&lt;a&gt;').val() // returns '<a>'
<script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/3.3.1/jquery.min.js"></script>
<textarea/>

Try this :
var htmlEntities = "&lt;script&gt;alert('hello');&lt;/script&gt;";
var htmlDecode =$.parseHTML(htmlEntities)[0]['wholeText'];
console.log(htmlDecode);
<script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/3.3.1/jquery.min.js"></script>
parseHTML is a function in the jQuery library; it returns an array of nodes parsed from the given string.
In some cases, when the string is big, the function may split the content across several entries in the array.
To get the data from all of them, go to any entry and read its "wholeText" property.
I chose index 0 because it will work in all cases (small string or big string).

Use
myString = myString.replace( /\&amp;/g, '&' );
It is easiest to do it on the server side because apparently JavaScript has no native library for handling entities, nor did I find any near the top of search results for the various frameworks that extend JavaScript.
Search for "JavaScript HTML entities", and you might find a few libraries for just that purpose, but they'll probably all be built around the above logic - replace, entity by entity.
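To make the "replace, entity by entity" idea concrete, here is a minimal sketch that needs neither the DOM nor a library. It is an illustration under stated assumptions, not a complete decoder: it handles only numeric references and five named ones (amp, lt, gt, quot, apos), and the names decodeBasicEntities and NAMED_ENTITIES are invented for this example. Everything is done in one pass so that already-decoded output is never decoded a second time.

```javascript
// Assumption: only these five named entities are supported in this sketch
const NAMED_ENTITIES = { amp: '&', lt: '<', gt: '>', quot: '"', apos: "'" };

// Hypothetical helper: decodes numeric and a few named character references in one pass
function decodeBasicEntities(text) {
  const pattern = /&(?:#[xX]([0-9a-fA-F]+)|#(\d+)|(amp|lt|gt|quot|apos));/g;
  return String(text).replace(pattern, function (match, hex, dec, name) {
    if (name) {
      return NAMED_ENTITIES[name];
    }
    const codePoint = hex ? parseInt(hex, 16) : parseInt(dec, 10);
    if (codePoint > 0x10FFFF) {
      return match;               // out of Unicode range: leave the text untouched
    }
    return String.fromCodePoint(codePoint);
  });
}

console.log(decodeBasicEntities('1 &amp; 2')); // 1 & 2
```

Unknown references such as &nbsp; are left as they are. For full coverage of HTML's named references, a dedicated library is the more realistic choice.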

I just had to use an HTML entity character (&dArr;) as the value of an HTML button. The HTML code looks good from the beginning in the browser:
<input type="button" value="Embed &amp; Share &dArr;" id="share_button" />
Now I was adding a toggle that should also display the character. This is my solution:
$("#share_button").toggle(
function(){
$("#share").slideDown();
$(this).attr("value", "Embed & Share " + $("<div>").html("&uArr;").text());
}
This displays ⇓ again in the button. I hope this might help someone.

You have to make custom function for html entities:
function htmlEntities(str) {
return String(str).replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/>/g, '&gt;').replace(/"/g, '&quot;');
}

Suppose you have the string below.
Our Deluxe cabins are warm, cozy &amp; comfortable
var str = $("p").text(); // get the text from the <p> tag
$('p').html(str).text(); // now decode the HTML entities in str and assign the result back to the tag
That's it.

For ExtJS users, if you already have the encoded string, for example when the returned value of a library function is the innerHTML content, consider this ExtJS function:
Ext.util.Format.htmlDecode(innerHtmlContent)

Extend a String class:
String::decode = ->
$('<textarea />').html(this).text()
and use as method:

You don't need jQuery to solve this problem, as it creates a bit of overhead and dependency.
I know there are a lot of good answers here, but since I have implemented a bit different approach, I thought to share.
This code is safe from a security standpoint, because the escaping is handled by the browser rather than by the function itself. So if some vulnerability is discovered in the future, this solution is covered.
const decodeHTMLEntities = text => {
// Create a new element or use one from cache, to save some element creation overhead
const el = decodeHTMLEntities.__cache_data_element
= decodeHTMLEntities.__cache_data_element
|| document.createElement('div');
const enc = text
// Prevent any mixup of existing pattern in text
.replace(/⪪/g, '⪪#')
// Encode entities in special format. This will prevent native element encoder to replace any amp characters
.replace(/&([a-z1-8]{2,31}|#x[0-9a-f]+|#\d+);/gi, '⪪$1⪫');
// Encode any HTML tags in the text to prevent script injection
el.textContent = enc;
// Decode entities from special format, back to their original HTML entities format
el.innerHTML = el.innerHTML
.replace(/⪪([a-z1-8]{2,31}|#x[0-9a-f]+|#\d+)⪫/gi, '&$1;')
.replace(/⪪#/g, '⪪');
// Get the decoded HTML entities
const dec = el.textContent;
// Clear the element content, in order to preserve a bit of memory (in case the text is big)
el.textContent = '';
return dec;
}
// Example
console.log(decodeHTMLEntities("<script>alert('&awconint;&CounterClockwiseContourIntegral;&#x02233;&#8755;⪪#x02233⪫');</script>"));
// Prints: <script>alert('∳∳∳∳⪪#x02233⪫');</script>
By the way, I have chosen to use the characters ⪪ and ⪫, because they are rarely used, so the chance of impacting the performance by matching them is significantly lower.

There is still one problem:
Escaped string does not look readable when assigned to input value
var string = _.escape("<img src=fake onerror=alert('boo!')>");
$('input').val(string);
Example: https://jsfiddle.net/kjpdwmqa/3/

Alternatively, there's also a library for it:
here, https://cdnjs.com/libraries/he
npm install he //using node.js
<script src="js/he.js"></script> //or from your javascript directory
The usage is as follows...
//to encode text
he.encode('© Ande & Nonso® Company Limited 2018');
//to decode the text
he.decode('&copy; Ande &amp; Nonso&reg; Company Limited 2018');
cheers.

To decode HTML Entities with jQuery, just use this function:
function html_entity_decode(txt){
var randomID = Math.floor((Math.random()*100000)+1);
$('body').append('<div id="random'+randomID+'"></div>');
$('#random'+randomID).html(txt);
var entity_decoded = $('#random'+randomID).html();
$('#random'+randomID).remove();
return entity_decoded;
}
How to use:
Javascript:
var txtEncoded = "&aacute; &eacute; &iacute; &oacute; &uacute;";
$('#some-id').val(html_entity_decode(txtEncoded));
HTML:
<input id="some-id" type="text" />

The easiest way is to set a class selector on your elements and then use the following code:
$(function(){
$('.classSelector').each(function(a, b){
$(b).html($(b).text());
});
});
Nothing more is needed!
I had this problem and found this clear solution and it works fine.

I think that is the exact opposite of the solution chosen.
var decoded = $("<div/>").text(encodedStr).html();
