A question regarding ng-bind-html whilst upgrading an Angular app from 1.0.8 to 1.2.8:
I have locale strings stored in files named en_GB.json, fr_FR.json, etc. So far, I have allowed the use of HTML within the locale strings so that the team writing the localized content can apply basic styling or add inline anchor tags. This results in JSON like the following:
{
"changesLater": "<strong>Don't forget</strong> that you can always make changes later."
"errorEmailExists": "That email address already exists, please sign in to continue."
}
When using these strings with ng-bind-html="myStr", I understand that I now need to use $sce.trustAsHtml(myStr). I could even write a filter as suggested in this StackOverflow answer which would result in using ng-bind-html="myStr | unsafe".
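For reference, a minimal version of such a filter might look like this (a sketch; the linked answer may differ in details, and 'app' is a placeholder for your own module):

app.filter('unsafe', ['$sce', function ($sce) {
  return function (value) {
    return $sce.trustAsHtml(value); // mark the string as trusted HTML
  };
}]);

which is then used in the template as ng-bind-html="myStr | unsafe".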
Questions:
By doing something like this, is my app now insecure? And if so, how might an attacker exploit this?
I can understand the potential exploits if the source of the displayed HTML string were a user (i.e. blog-post-style comments that will be displayed to other users), but would my app really be at risk if I'm only displaying HTML from a JSON file hosted on the same domain?
Is there any other way I should be looking to achieve the marking-up of externally loaded content strings in an angular app?
You are not making your app any less secure. You were already inserting HTML in your page with the old method of ng-bind-html-unsafe. You are still doing the same thing, except now you have to explicitly trust the source of the HTML rather than just specifying that part of your template can output raw HTML. Requiring the use of $sce makes it harder to accidentally accept raw HTML from an untrusted source - in the old method where you only declared the trust in the template, bad input might make its way into your model in ways you didn't think of.
If the content comes from your domain, or a domain you control, then you're safe - at least as safe as you can be. If someone is somehow able to hijack the payload of a response from your own domain, then your security is already all manner of screwed. Note, however, that you should never call $sce.trustAsHtml on content that comes from a domain that isn't yours.
Apart from maintainability concerns, I don't see anything wrong with the way you're doing it. Having a ton of HTML live in a JSON file is maybe not ideal, but as long as the markup is reasonably semantic and not too dense, I think it's fine. If the markup becomes significantly more complex, I'd consider splitting it into separate angular template files or directives as needed, rather than trying to manage a bunch of markup wrapped in JSON strings.
Related
I am getting some HTML from my server that I want to put into my page. However, I want it to be sanitized (just in case).
However I am not quite sure how to do this.
So far I've tried:
<div .innerHTML="${body}"></div>
That should parse it as HTML, but I am not 100% sure it is the best way.
I have also looked at online sanitizers but haven't been able to find any that match my project (Lit-element web component).
Is there a better way to parse HTML and if so how?
Take a look at the DOMParser interface API
document.getElementById('my-target').append(...new DOMParser().parseFromString(data, 'text/html').body.children); // spread the HTMLCollection - append() takes individual nodes, not a collection
It's not clear whether you want to render the HTML as HTML or as text.
I seem to remember that lit-html does some things behind the scenes to produce secure templates but surprisingly I cannot find official content to back up that statement. Others have asked about this before.
In that GitHub issue we can see that Mike Samuel mentions that what you're doing is not secure. You can trust Mike Samuel: he's worked in the security field at Google, and I had the privilege of attending one of his talks on the topic two years ago.
Here's a quick example:
<p .innerHTML=${'<button onclick="alert(42)">click</button>'}></p>
This renders a button which produces an alert when you click on it. In this example the JavaScript code is harmless but it's not hard to imagine something way more dangerous.
The following, however, simply renders the code as a string. The content is escaped and therefore harmless.
<p>${'<button onclick="alert(42)">click</button>'}</p>
In fact, similar to React's dangerouslySetInnerHTML attribute, you need to "opt out" of secure templating via lit-html's unsafeHTML directive:
Renders the argument as HTML, rather than text. Note, this is unsafe to use with any user-provided input that hasn't been sanitized or escaped, as it may lead to cross-site-scripting vulnerabilities.
<p>${unsafeHTML('<button onclick="alert(42)">click</button>')}</p>
About DOMParser#parseFromString
In this introductory article about trusted-types we can see that this method is a known XSS sink.
Sure it won't execute <script> blocks but it won't sanitise the string for you. You are still at risk of XSS here:
<p .innerHTML="${(new DOMParser()).parseFromString('<button onclick="alert(42)">click</button>','text/html').body.innerHTML}"></p>
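If sanitizing really is the goal, one option (my suggestion, not something from the answers above) is to run the string through a dedicated sanitizer library such as DOMPurify before binding it. A minimal sketch, assuming the dompurify package is installed:

import DOMPurify from 'dompurify';

const clean = DOMPurify.sanitize(body); // strips onclick handlers, <script> blocks, etc.
// binding the sanitized string is then reasonably safe:
// <div .innerHTML="${clean}"></div>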
I have an HTML file with some JavaScript and CSS applied.
I would like to duplicate that file, making copies like file1.html, file2.html, file3.html, ...
All of that using JavaScript, jQuery or something like that!
The idea is to create a different page (from that kind of template) that will be printed afterwards with different data in it (from an XML file).
I hope it is possible!
Feel free to ask for more details if you want!
Thank you all in advance.
Note: I do not want to copy only the content but the entire file.
Edit: I know I should use a server-side language, I just don't have the option ):
There are a couple of ways you could go about implementing something similar to what you are describing. Which implementation you should use would depend on exactly what your goals are.
First of all, I would recommend some sort of template system such as VueJS, AngularJS or React. However, given that you say you don't have the option of using a server side language, I suspect you won't have the option to implement one of these systems.
My next suggestion would be to build your own 'templating system'. A simple implementation that may suit your needs could be something mirroring the following:
In your primary file (root file) which you want to route or copy the other files through to, you could use JS to include the correct HTML files. For example, you could have JS conditionally load a file depending on certain circumstances by calling a small loader function, like the vanilla JS snippet below, after a conditional statement.
Note that while doing this could optimize your server's data usage (as it would only serve required files and not everything all the time), it would also probably increase loading times. Your site would need to wait for the additional HTTP request to come through and for whatever requested content to load and render on the client. While this sounds very slow, it has the potential of not being that bad if you don't have too many discrete requests, and none of your code is unusually large or computationally expensive.
If using vanilla JS, the following snippet will illustrate the above:
In a script that comes loaded with your routing file:
function read(url) {
  var xhr = new XMLHttpRequest();
  xhr.open('GET', url);
  xhr.onload = show; // called once the response has arrived
  xhr.send();
}
function show() {
  var text = this.response;
  document.body.innerHTML = text; // you can replace document.body with whatever element you want to wrap your imported HTML
}
read('path/to/file/on/server'); // the path must be passed as a string
Note a couple of things about the above code. If you are testing on your computer (i.e. opening your HTML file in a browser, with a path like file://__) without a local server, you will get some sort of cross-origin request error when trying to make an XML request. To bypass this error, either test your code on an actual server (not ideal, constantly pushing code, I know) or, preferably, set up a local testing server. If this is something you would want to explore, it's not that difficult to do; let me know and I'd be happy to walk you through the process.
Alternately, you could implement the above loading system with jQuery and the .load() function. http://api.jquery.com/load/
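With jQuery, the read/show pair above collapses to roughly the following ('#wrapper' and the path are placeholders for your own element and file):

$('#wrapper').load('path/to/file.html'); // fetches the file and injects it into #wrapper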
If none of the above solutions work for you, let me know more specifically what it is that you need, and I'll be happy to give a more useful/relevant answer!
I have a website that is only accessible via https.
It does not load any content from other sources. So all content is on the local webserver.
Using the Retire.js Chrome plugin I get a warning that the jQuery 1.8.3 I included is vulnerable to 'Selector interpreted as HTML'
(jQuery bug 11290)
I am trying to motivate a quick upgrade, but I need more concrete information to motivate the upgrade to the powers that be.
My questions are:
Given the above, should I be worried?
Can this result in an XSS-type attack?
What the bug is telling you is that jQuery may mis-identify a selector containing a < as being an HTML fragment instead, and try to parse and create the relevant elements.
So the vulnerability, such as it is, is that a cleverly-crafted selector, if then passed into jQuery, could define a script tag that then executes arbitrary script code in the context of the page, potentially taking private information from the page and sending it to someone with malicious (or merely prurient) intent.
This is largely only useful if User A can write a selector that will later be given to jQuery in User B's session, letting User A steal information from User B's page. (It really doesn't matter if a user can "trick" jQuery this way on their own page; they can do far worse things from the console, or with "save as".)
So: If nothing in your code lets users provide selectors that will be saved and then retrieved by other users and passed to jQuery, I wouldn't be all that worried. If it does (with or without the fix to the bug), I'd examine those selector strings really carefully. I say "with or without the fix" because even with the fix, if you don't filter what the users type at all, they could still just provide an HTML fragment where the first non-whitespace character is <, which would still cause jQuery to parse it as an HTML fragment.
As the author of Retire.js, let me shed some light on this. There are two weaknesses in older versions of jQuery, but they are not vulnerabilities by themselves. It depends on how jQuery is used. Two examples abusing the bugs are shown here: research.insecurelabs.org/jquery/test/
The two examples are:
$("#<img src=x onerror=...>")
and
$("element[attribute='<img src=x onerror=...>'")
Typically this becomes a problem if you do something like:
$(location.hash)
This was a fairly common pattern on many websites when single-page websites started to appear.
So this becomes a problem if and only if you put untrusted user data inside the jQuery selector function.
And yes, the end result is XSS, if the site is in fact vulnerable. HTTPS will not protect you against these kinds of flaws.
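If upgrading has to wait, a common mitigation (my sketch, not part of the answer above) is to never pass raw user input to $() and to resolve element ids yourself:

var id = location.hash.slice(1); // strip the leading '#'
if (/^[\w-]+$/.test(id)) { // allow only plain id characters
  var $target = $(document.getElementById(id)); // a DOM node cannot be re-parsed as HTML
  // ... use $target ...
}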
I have a form where people can add their stuff. However, if they enter JavaScript in that form instead of only text, they can easily inject whatever they want. To prevent this, I can set escapeXml to true, but then normal HTML would be escaped as well.
<td><c:out value="${item.textValue}" escapeXml="true" /></td>
Is there any other way to prevent JavaScript injection rather than setting this to true?
I'd recommend using Jsoup for this. Here's an extract of relevance from its site.
Sanitize untrusted HTML
Problem
You want to allow untrusted users to supply HTML for output on your website (e.g. as comment submission). You need to clean this HTML to avoid cross-site scripting (XSS) attacks.
Solution
Use the jsoup HTML Cleaner with a configuration specified by a Whitelist.
String unsafe =
"<p><a href='http://example.com/' onclick='stealCookies()'>Link</a></p>";
String safe = Jsoup.clean(unsafe, Whitelist.basic());
// now: <p>Link</p>
So, all you basically need to do is the following when processing the submitted text:
String text = request.getParameter("text");
String safe = Jsoup.clean(text, Whitelist.basic());
// Persist 'safe' in DB instead.
Jsoup offers more advantages than that as well. See also Pros and Cons of HTML parsers in Java.
You need to parse the HTML text on the server as XML, then throw out any tags and attributes that aren't in a strict whitelist.
(And check the URLs in href and src attributes)
This is exactly the intent of the OWASP AntiSamy project.
The OWASP AntiSamy project is a few things. Technically, it is an API for ensuring user-supplied HTML/CSS is in compliance within an application's rules. Another way of saying that could be: It's an API that helps you make sure that clients don't supply malicious cargo code in the HTML they supply for their profile, comments, etc., that get persisted on the server. The term "malicious code" in regards to web applications usually mean "JavaScript." Cascading Stylesheets are only considered malicious when they invoke the JavaScript engine. However, there are many situations where "normal" HTML and CSS can be used in a malicious manner. So we take care of that too.
Another alternative is the OWASP HTMLSanitizer project. It is faster, has fewer dependencies, and is actively supported by the project lead as of now. I don't think it has gone through a GA/stable release yet, so you should consider that when evaluating this library.
https://urbantastic-blog.tumblr.com/post/81336210/tech-tuesday-the-fiddly-bits/amp
Heath from Urbantastic writes about his HTML generation system:
All the HTML in Urbantastic is completely static. All dynamic data is sent via AJAX in JSON format and then combined with the HTML using Javascript. Put another way, the server software for Urbantastic produces and consumes JSON exclusively. HTML, CSS, Javascript, and images are all sent via a different service (a vanilla Nginx server).
I think this is an interesting model as it separates presentation from data physically. I am not an expert in architecture but it seems like there would be a jump in efficiency and stability.
However, the following concerns me:
[subjective] Clojure is extremely powerful; JavaScript is not. Writing all the content generation in a language created for other goals will create some pain (imagine writing JavaScript-style code in CSS). Unless he has a macro system for generating JavaScript, Heath is probably in for constant switching between JavaScript and Clojure. He'll also have a lot of JS code; probably a lot more than Clojure. That might not be good in terms of power, rapid development, succinctness and all the things we are looking for when switching to LISP-based languages.
[performance] I am not sure about this, but rendering everything on the user's machine might lag.
[accessibility] If you have JS disabled you can't use the site at all.
[accessibility#2] I suspect that a lot of dynamic data filling with JavaScript will create cross-browser issues.
Can anyone comment? I'd be interested in reading your opinions on this type of architecture.
References:
Link to discussion on HN.
Link to discussion on /r/programming.
"All the HTML in Urbantastic is completely static. All dynamic data is sent via AJAX in JSON format and then combined with the HTML using Javascript."
I think that's the standard model of an RIA. The emphasized word seems to be 'All' here, because on many websites a lot of the dynamic content is still not obtained through AJAX; only key features are.
I don't think the rendering issues would be a major bottleneck if you don't have a huge webpage with a lot of elements.
JS accessibility is indeed a problem. But then, users who want to experience AJAX must have JS enabled. Have you done a survey on how many of YOUR users don't have it enabled?
The advantage is, you can serve 99% (by weight) of the content through a CDN (like Akamai) or even put it on external storage (e.g. S3). Serving only the JSON, it's almost impossible for a site to get slashdotted.
When AJAX began to hit it big in late 2005, I wrote a client-side template engine and basically turned my Blogger template into a fully-fledged AJAX experience.
The thing is, that template stuff, it was really easy to implement and it eliminated a lot of the grunt work.
Here's how it was done.
<div id="blogger-post-template">
<h1><span id="blogger-post-header"/></h1>
<p><span id="blogger-post-body"/><p>
<div>
And then in JavaScript:
var template; // cached pristine copy of the template, kept across calls
var response; // <- AJAX response
var container = document.getElementById("blogger-post-template");
if (!template) { // first run: cache the template before it gets cleared
template = container.cloneNode(true); // deep clone
}
// clear container
while (container.firstChild)
container.removeChild(container.firstChild);
container.appendChild(instantiate(template, response));
The instantiate function makes a deep clone of the template then searches the clone for identifiers to replace with data found in the response. The end result is a populated DOM tree which was originally defined in HTML. If I had more than one result I just looped through the above code.
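The instantiate function itself isn't shown above; here is a minimal sketch of what it might have looked like (the function body, the id convention, and the response shape are my assumptions based on the template markup):

function instantiate(template, data) {
  var clone = template.cloneNode(true); // deep clone, leaving the cached template untouched
  // find each placeholder span, e.g. id="blogger-post-header", id="blogger-post-body"
  var spans = clone.querySelectorAll('span[id^="blogger-post-"]');
  for (var i = 0; i < spans.length; i++) {
    var key = spans[i].id.replace('blogger-post-', ''); // "header", "body", ...
    if (key in data) {
      spans[i].textContent = data[key]; // textContent keeps HTML in the response from executing
    }
  }
  return clone;
}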