TinyMCE: storing generated markup in a database, security concerns - javascript

I have used TinyMCE as part of a forum feature on my site. I am taking the innerHTML of the textarea and storing it in a SQL database.
I retrieve the markup when viewing thread posts.
Are there any security concerns with doing this? Does TinyMCE have any built-in features to stop malicious content/markup being added and therefore saved?

TinyMCE does a pretty good job at content scrubbing and input cleanup (on the client side). Being a very popular web rich-text editor, the creators have put a lot of work into making it fairly secure in terms of preventing simple copy-and-paste of malicious content into the editor. You can do things like enable/disable cleanup, specify what tags/attributes/characters are allowed, etc.
See the TinyMCE configuration page. Options of note include: valid_elements, invalid_elements, verify_html, valid_styles, invalid_styles, and extended_valid_elements.
Also: instead of grabbing the innerHTML of the textarea, you should probably use TinyMCE's getContent() function. See: getContent()
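As a sketch of what such a configuration might look like (the option names come from the TinyMCE docs mentioned above; the particular whitelist here is purely illustrative, not a recommendation):

```javascript
// Sketch of a restrictive TinyMCE configuration object. The exact
// whitelist below is an illustrative assumption for a forum post editor.
const editorConfig = {
  selector: 'textarea#post-body',
  // Only allow a small whitelist of tags and attributes.
  valid_elements: 'p,br,strong/b,em/i,a[href|title],ul,ol,li,blockquote',
  // Explicitly strip anything script-related.
  invalid_elements: 'script,iframe,object,embed',
  // Re-verify/clean up the HTML whenever content is set or fetched.
  verify_html: true,
};
```

In the browser you would pass this to `tinymce.init(editorConfig)` and read the cleaned markup with `getContent()` rather than the textarea's innerHTML.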
BUT this is all client-side JavaScript!
Although these features are nice, all of this cleanup still happens on the client. So conceivably, the client JS could be modified to stop escaping/removing malicious content. Or someone could POST bad data to your request handlers without ever going through the browser (using curl, or any number of other tools).
So TinyMCE provides a nice baseline of client scrubbing; to be secure, however, the server should assume that anything it is being sent is dirty and should treat all content with caution.
Things that can be done by the server:
Even if you implement the most sophisticated client-side validation/scrubbing/prevention, that is worthless as far as your backend's security is concerned. An excellent reference for preventing malicious data injections can be found on the OWASP Cross Site Scripting Prevention Cheat Sheet and the OWASP SQL Injection Prevention Cheat Sheet. Not only do you have to protect against SQL injection type attacks, but also XSS attacks if any user submitted data will be displayed on the website for other unsuspecting users to view.
In addition to sanitizing user input data on the server, you can also try things such as mod_security to squash requests that contain patterns indicative of malicious requests. You can also enforce a max length of inputs on both the client and server side, as well as a max request size for your server, to make sure someone doesn't try to send a GB of data. How to set a max request size varies from server to server. Violations of the max request size should result in an HTTP 413 (Request Entity Too Large) response.
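The max-request-size idea can be sketched as a small guard that runs before the body is read (the 1 MB limit here is an arbitrary example, not a recommendation):

```javascript
// Sketch: decide whether to reject an oversized payload before reading
// the body. MAX_BODY_BYTES is an arbitrary example limit.
const MAX_BODY_BYTES = 1024 * 1024; // 1 MB

function checkRequestSize(headers, maxBytes = MAX_BODY_BYTES) {
  const declared = parseInt(headers['content-length'] || '0', 10);
  if (Number.isNaN(declared) || declared > maxBytes) {
    // Caller should respond with 413 Request Entity Too Large.
    return { ok: false, status: 413 };
  }
  return { ok: true, status: 200 };
}
```

In a Node.js `http` handler you would call this with `req.headers` (and still cap the bytes actually read, since Content-Length can lie).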

Further to @jCuga's excellent answer, you should implement a Content Security Policy (CSP) on any pages where you output the rich text.
This allows you to effectively stop inline script from being executed by the browser. It is currently supported by modern browsers such as Chrome and Firefox.
This is done via an HTTP response header from your page.
e.g.
Content-Security-Policy: script-src 'self' https://apis.google.com
will stop inline JavaScript from being executed if a user managed to inject it into your page (it will be ignored with a warning), but will allow script tags referencing either your own server or https://apis.google.com. This can be customised to your needs as required.
Even if you use an HTML sanitizer to strip malicious tags, it is a good idea to use one in combination with a CSP, just in case anything slips through the net.
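As a sketch of how such a header value might be assembled and set in Node.js (the source list is just the example from above; the helper name is made up):

```javascript
// Minimal helper to build a CSP header value from a directive map.
function buildCsp(directives) {
  return Object.entries(directives)
    .map(([name, sources]) => `${name} ${sources.join(' ')}`)
    .join('; ');
}

const csp = buildCsp({
  'script-src': ["'self'", 'https://apis.google.com'],
  'object-src': ["'none'"],
});
// In a Node http handler you would then call:
//   res.setHeader('Content-Security-Policy', csp);
```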

Related

Proper way to inject javascript code?

I would like to create a site with functionality similar to translate.google.com or hypothes.is: users can enter any address, and the site opens with an additional menu. I guess this is done with some middleware/proxy solution, with JavaScript injected into the response, but I'm not sure. Do you have any idea how to implement the same feature? How can it work with secured (https) sites?
Many thanks
The entire site is fetched by the server, the source code is parsed, code injected and then sent back to the requesting client.
It works with SSL just fine, because it's two separate requests - the request that gets sent to the endpoint is not seen by the user.
The certificate is valid because it's being served under Google's domain.
Actually implementing something like this could be quite complicated, because:
The HTML you are parsing won't necessarily conform to your expectations, or even be valid
The content you're forwarding to the client will likely reference resources with a relative URI. This means that you also need to intercept these requests and pull the resources (images, external css, js, etc) and serve them back to the client - and also rewrite the URLs.
It's very easy to break content by injecting arbitrary javascript. You need to be careful that your injected code is contained and won't interfere with any existing code on the site.
It's very common for an implementation such as this to have non-obvious security concerns, often resulting in XSS attacks being possible.
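To make the injection step concrete, here is a deliberately minimal sketch of just that step (function names are made up; a real implementation would use a proper parser such as jsdom rather than string matching, and must also rewrite relative URLs as described above):

```javascript
// Sketch of just the injection step: insert a <script> tag before </body>.
// Real implementations must handle malformed HTML with a real parser,
// not string matching, and must also rewrite relative resource URLs.
function injectScript(html, scriptUrl) {
  const tag = `<script src="${scriptUrl}"></script>`;
  if (html.includes('</body>')) {
    return html.replace('</body>', `${tag}</body>`);
  }
  return html + tag; // no </body> found: append as a fallback
}
```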

XSS still possible in modern browsers

I was curious whether XSS is still possible today. I read a lot about browsers preventing it, but it seems I have missed something.
I tried a couple of approaches myself, including the simplest ways: AJAX calls (luckily blocked by the browser) and viewing the content of an <iframe> and <frameset>; no success either way.
I read about DOM XSS, but that will only work if the host has a page that echoes content from the URL parameters.
Question:
Are modern browsers safe or are there any reasons why I should logout of every service I use before leaving a page?
whether XSS is still possible today.
Yes, it is.
will only work, if the host has a page where it echoes content from the URL parameters.
XSS is possible when any user input is output, either immediately (a reflected attack) or later, possibly to a different person (a stored attack). That is what XSS is.
The Same Origin Policy (and related security features that prevent access to content on a different origin) has nothing to do with XSS.
Are modern browsers safe
XSS is a vulnerability in code provided by the server that takes user input and does something with it. There is no way for the browser to tell whether user input is an XSS attack or a legitimate submission of data that happens to include live code. It has to be dealt with by server-provided code, since the input has to be treated with context sensitivity.

Scraping a remote URL for bookmarking service without getting blocked

I'm using a server-side Node.js function to get the text of a URL passed by the browser, in order to auto-index that URL in a bookmarking service. I use jsdom for server-side rendering. But I get blocked by popular sites, despite the requests originating from legitimate users.
Is there a way to implement the URL text extraction on the browser side, such that requests would always seem to be coming from a normal distribution of users? How do I get around the cross-site security limitations in the browser? I only need the final DOM-rendered text.
Is a bookmarklet the best solution? When the user wants to bookmark the page, the bookmarklet would append a form and submit the DOM-rendered text to my service?
I know SO hates debates, but any guidance on good methods would be much appreciated.
You could certainly do it client-side, but I think that would be overly complex. The client would have to send the HTML to your service, which would require very careful sanitising, and it might be difficult to control the volume of incoming data.
I would probably simply track the request domains and ensure that I limited the frequency of calls to any single domain. That should be fairly straightforward using something like Node.js, where you could easily set up any number of background fetch tasks. This would also allow you to fine-tune the bandwidth used.
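A per-domain frequency limit of that kind can be sketched in a few lines (class name and interval are made-up examples):

```javascript
// Sketch: allow at most one fetch per domain per `minIntervalMs`.
// The 5-second interval is an arbitrary example value.
class DomainThrottle {
  constructor(minIntervalMs = 5000) {
    this.minIntervalMs = minIntervalMs;
    this.lastFetch = new Map(); // domain -> timestamp of last fetch
  }
  // Returns true if a fetch to `domain` is allowed now (and records it);
  // false means the caller should queue or delay the request.
  tryAcquire(domain, now = Date.now()) {
    const last = this.lastFetch.get(domain);
    if (last !== undefined && now - last < this.minIntervalMs) {
      return false;
    }
    this.lastFetch.set(domain, now);
    return true;
  }
}
```

Each background fetch task would call `tryAcquire` with the target domain before issuing the request, requeueing when it returns false.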

Prevent HTML form action from being changed

I have a form on my page where users enter their credit card data. Is it possible in HTML to mark the form's action as constant, to prevent malicious JavaScript from changing the form's action property? I can imagine an XSS attack that changes the form URL to make users post their secret data to the attacker's site.
Is it possible? Or, is there a different feature in web browsers which prevents these kinds of attacks from happening?
This kind of attack is possible, but this is the wrong way to prevent against it. If a hacker can change the details of the form, they can just as easily send the secret data via an AJAX GET without submitting the form at all. The correct way to prevent an XSS attack is to be sure to encode all untrusted content on the page such that a hacker doesn't have the ability to execute their own JavaScript in the first place.
More on encoding...
Sample code on StackOverflow is a great example of encoding. Imagine what a mess it would be if every time someone posted some example JavaScript, it actually got executed in the browser. E.g.,
<script type="text/javascript">alert('foo');</script>
Were it not for the fact that SO encoded the above snippet, you would have just seen an alert box. This is of course a rather innocuous script - I could have coded some JavaScript that hijacked your session cookie and sent it to evil.com/hacked-sessions. Fortunately, however, SO doesn't assume that everyone is well intentioned, and actually encodes the content. If you were to view source, for example, you would see that SO has encoded my perfectly valid HTML and JavaScript into this:
&lt;script type="text/javascript"&gt;alert('foo');&lt;/script&gt;
So, rather than embedding actual < and > characters where I used them, they have been replaced with their HTML-encoded equivalents (&lt; and &gt;), which means that my code no longer represents a script tag.
Anyway, that's the general idea behind encoding. Exactly how you should encode depends on what you're using server-side, but almost all web frameworks include some sort of out-of-the-box HTML-encoding utility. Your responsibility is to ensure that user-provided (or otherwise untrusted) content is ALWAYS encoded before being rendered.
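For illustration, a minimal version of the kind of HTML-encoding utility those frameworks ship might look like this (covering HTML element/attribute context only; other contexts such as JavaScript strings or URLs need different encoders):

```javascript
// Minimal HTML-encoding helper. Replaces the five characters that are
// significant in HTML element and attribute contexts.
function escapeHtml(untrusted) {
  return String(untrusted)
    .replace(/&/g, '&amp;')   // must run first, or entities get double-encoded
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}
```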
Is there a different feature in web browsers which prevents these kinds of attacks from happening?
Your concern has since been addressed by newer browser releases through the new Content-Security-Policy header.
By configuring script-src, you can disallow inline JavaScript outright. Note that this protection will not necessarily extend to users on older browsers (see CanIUse).
Allowing only whitelisted scripts will defeat most JavaScript XSS attacks, but may require significant modifications to your content. Also, blocking inline JavaScript may be impractical if you are using a web framework that relies heavily on inline JavaScript.
Nope, nothing to really prevent it.
The only thing I would suggest is to have some server-side validation of any information coming to the server from a user form.
As the saying goes: never trust the user.

What's the point of the Anti-Cross-Domain policy?

Why did the creators of the HTML DOM and/or Javascript decide to disallow cross-domain requests?
I can see some very small security benefits of disallowing it, but in the long run it seems to be an attempt at making JavaScript injection attacks less powerful. That is all moot anyway with JSONP; it just means that the JavaScript code is a tiny bit more difficult to write, and you have to have server-side cooperation (though it could be your own server).
The actual cross-domain issue is huge. Suppose SuperBank.com internally sends a request to http://www.superbank.com/transfer?amount=100&to=123456 to transfer $100 to account number 123456. If I can get you to my website, and you are logged in at SuperBank, all I have to do is send an AJAX request to SuperBank.com to move money from your account to mine.
The reason JSON-P is acceptable is that it is pretty darn impossible for it to be abused. A website using JSON-P is pretty much declaring the data to be public information, since that format is too inconvenient to ever be used otherwise. But if it's unclear whether data is public information, the browser must assume that it is not.
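To illustrate the shape being described (the endpoint and callback names here are hypothetical): a JSON-P server wraps the JSON payload in a call to a callback function the client named in the request, so the response is executable script rather than plain data:

```javascript
// Sketch of a JSON-P response body. The client requests something like
//   /data?callback=handleData
// and includes the result via a <script> tag, so the browser executes it.
function jsonpBody(callbackName, data) {
  return `${callbackName}(${JSON.stringify(data)});`;
}
```

Because the response runs as script on the including page, publishing an endpoint this way effectively declares its data public.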
When cross-domain scripting is allowed (or hacked by a clever Javascripter), a webpage can access data from another webpage. Example: joeblow.com could access your Gmail while you have mail.google.com open. joeblow.com could read your email, spam your contacts, spoof mail from you, delete your mail, or any number of bad things.
To clarify some of the ideas in the question with a specific use case:
The cross-domain policy is generally not there to protect you from yourself. It's there to protect the users of your website from the other users of your website (XSS).
Imagine you had a website that allowed people to enter any text they want, including JavaScript. Some malicious user decides to add some JavaScript to the "about yourself" field. Users of your website would navigate to his profile and have this script executed in their browser. This script, since it's being executed on your website's behalf, has access to cookies and such from your website.
If the browser allowed cross-domain communication, this script could theoretically collect your info and then upload it to a server that the malicious user owns.
Here's a distinction for you: cross-domain AJAX allows a malicious site to make your browser do things on its behalf, while JSON-P allows a malicious server to tamper with a single domain's pages (and to make the browser do things to that domain on your behalf), but (crucial bit) only if the page served went out of its way to load the malicious payload.
So yes, JSON-P has some security implications, but they are strictly opt-in on the part of the website using them. Allowing general cross-domain AJAX opens up a much larger can of worms.
