Input validation check - javascript

In my website I have a forum, and I want to avoid cross site scripting. Do you know a good input validation script?

There are two ways to avoid cross-site scripting:
1. Filter the users' input (mainly script tags and other HTML tags), both on the client side and on the server side.
2. Display the content as HTML entities.
Of course, if you want to show some of the tags, go for option one. Otherwise option two is more reliable.
You can use regular expressions to filter the data on both the client side and the server side.

I've always relied on the OWASP PHP filters: http://www.owasp.org/index.php/OWASP_PHP_Filters As you can tell from the name, they're server-side (JavaScript or HTML5 validation is only useful for assisting the user) and OWASP (the Open Web Application Security Project) is a non-profit organisation.

It depends on where you want to write out the data. For example, you need a different filter when you write the text into an input field than when you write it into the HTML body, between two tags.
You should implement different filters for the different data types on the server side. I suggest filtering the text when it is printed out rather than when the user sends it to you (this does not apply to SQL injection and other server-side attacks, of course), because, as mentioned above, the type of filter you should use depends on where the data is printed out.
If you want to write a really simple forum, a single filter that removes all HTML tags from the text before it is printed out is enough. Beware that this is not good enough for more advanced features, for example editing comments (where you prefill a form for the user) or letting users use HTML tags in their comments.

Simple. Make sure you escape the HTML from your input object before using it. This way, the data sent will be treated as raw text. The way to do this will be to pass the input through some parser before embedding the data in your page (or working with it somehow).
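As a rough illustration of escaping before output, here is a minimal JavaScript sketch. The function name escapeHtml is an assumption made for this example and is not taken from any particular library; in production a well-tested server-side encoder is the safer choice.

```javascript
// Minimal sketch: replace the characters that are special in HTML with entities,
// so the browser treats user input as text rather than markup.
// "escapeHtml" is a hypothetical helper name chosen for this example.
function escapeHtml(text) {
  const entities = {
    '&': '&amp;',
    '<': '&lt;',
    '>': '&gt;',
    '"': '&quot;',
    "'": '&#39;',
  };
  return String(text).replace(/[&<>"']/g, (ch) => entities[ch]);
}

console.log(escapeHtml('<script>alert("xss")</script>'));
// prints: &lt;script&gt;alert(&quot;xss&quot;)&lt;/script&gt;
```

This covers text placed in the HTML body or inside a quoted attribute; other contexts (URLs, inline scripts, CSS) need their own encoding.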

I agree with anand that there are two major ways to avoid XSS: validation on input and escaping on output. For validating form input, tie into Django's Form Validation Framework: http://code.google.com/appengine/articles/djangoforms.html
Here are some code samples for sanitizing on output within a Django template. Instead of this:
Welcome, {{ firstname }}!
Do this:
Welcome, {{ firstname|escape }}!
This is from this very good blog post: http://startupsecurity.info/blog/2008/10/28/avoid-xss-on-google-app-engine/

Server Side
http://www.php.net/manual/en/function.html-entity-decode.php
http://www.php.net/manual/en/function.addslashes.php
http://www.php.net/manual/en/function.stripslashes.php
Check the other string functions for whatever else you need to validate.
Client Side
http://www.position-relative.net/creation/formValidator/
It is better to write your own jQuery code; it may help you in the future.

You have two options for validation. For non-sensitive data, client-side JavaScript may be used.
In JavaScript, you can write a simple function to validate your data.
For sensitive data, you should use server-side scripting such as PHP, JSP, ASP, or ASP.NET.
I hope this helps.

Related

How to handle sanitizing in JavaScript editors that allow formatting

Many editors, like Medium's, offer formatting now. From what I see in the DOM, the editor simply adds HTML. But how do you sanitize this kind of input without losing the formatting applied by the user?
E.g. clicking bold adds:
<strong class="markup--strong markup--p-strong">text</strong>
but you wouldn't want to render that if the user typed it in themselves. So how is that different? Also, would it be different if you styled with Markdown but didn't let users enter their own Markdown, making it accessible only through the browser?
One way I can think of is escaping every HTML special character, but that seems odd. As far as I know, you sanitize the content only when outputting it.
You should use a server-side sanitizer, as stated by Vipin, because client-side validation is prone to tampering.
OWASP (Open Web Application Security Project) has some guides and sanitizers that you may use like the java-html-sanitizer.
For a generic brief on the concept please read this https://www.owasp.org/index.php/Data_Validation under the section Sanitize.
You could replace the white-listed elements with another character sequence, for example:
<strong.*> becomes |strong|
Then you remove ALL other HTML. Be aware of onmouseover="alert(1)" so keep it really simple.
Also be careful when rendering the user input. Don't just add it as code. Instead parse it and create the elements using JavaScript. Never use innerHTML, but do use .innerText and document.createElement().
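The placeholder idea described above (swap white-listed tags for markers, remove all other HTML, then restore the markers) could be sketched roughly as follows. The function name sanitizeFormatting and the allowed tag list are assumptions for this example; a maintained sanitizer such as the OWASP one mentioned above is preferable for real use.

```javascript
// Rough sketch of a placeholder-based whitelist, assuming only <strong> and <em>
// are allowed and that all attributes (class, onmouseover, ...) are dropped.
// "sanitizeFormatting" and ALLOWED_TAGS are hypothetical names for this example.
const ALLOWED_TAGS = ['strong', 'em'];

function sanitizeFormatting(input) {
  let s = String(input);

  // 1. Turn allowed opening/closing tags into attribute-free placeholders.
  for (const tag of ALLOWED_TAGS) {
    s = s
      .replace(new RegExp('<' + tag + '(\\s[^>]*)?>', 'gi'), '|' + tag + '|')
      .replace(new RegExp('</' + tag + '\\s*>', 'gi'), '|/' + tag + '|');
  }

  // 2. Remove every other tag, then escape whatever special characters remain.
  s = s
    .replace(/<[^>]*>/g, '')
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');

  // 3. Restore the placeholders as plain tags without attributes.
  for (const tag of ALLOWED_TAGS) {
    s = s.split('|' + tag + '|').join('<' + tag + '>');
    s = s.split('|/' + tag + '|').join('</' + tag + '>');
  }
  return s;
}

console.log(sanitizeFormatting('<strong onmouseover="alert(1)">hi</strong><script>alert(1)</script>'));
// prints: <strong>hi</strong>alert(1)
```

Because the attributes are discarded in step 1, an event handler such as onmouseover never reaches the output. Note that a user who types the literal text |strong| will get a bold tag, which is harmless but worth knowing.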

Allowing basic HTML in posts (inc. line breaks, no-follow links etc.) while maintaining security - CakePHP

In my CakePHP blog, I want to enable users to make similar HTML additions as you can insert here on StackOverflow, i.e. line breaks, links, bold, lists etc. But I am a little unsure how I shall tackle this issue in terms of what is most practical whilst maintaining protection against malicious code in the posts users submit.
In practice, is it most convenient to save the post in a TEXT database field and allow some HTML in it?
If I allow some HTML code in the post, how do I ensure that I only allow non-malicious basic HTML code whilst cleaning out the rest?
Should I be using the CakePHP Sanitize class for that somehow?
Will the FormHelper clean out all HTML users input?
I assume I'll have to use JavaScript to help users generate the right code?
If it's not for developers, have you considered a WYSIWYG addon like TinyMCE?
http://www.tinymce.com/
http://bakery.cakephp.org/articles/galitul/2012/04/11/helper_tinymce_for_cakephp_2
As for security, whitelisting is the safest method. Blacklisting should be avoided because there's no way you can handle all the tricks that can be used to bypass them (e.g. passing in text via hex, etc).
TinyMCE lets you specify a whitelist:
http://www.tinymce.com/wiki.php/Configuration:valid_elements
Use a whitelist for what HTML tags you allow. First HTML encode everything, then decode the specific tags that you allow.
A basic example:
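function encodeForOutput(s) {
  // Encode & first so the entities produced below are not double-encoded.
  s = s.replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/>/g, '&gt;').replace(/"/g, '&quot;');
  // Allow <b>: decode only the exact, attribute-free tag pair.
  s = s.replace(/&lt;b&gt;(.*?)&lt;\/b&gt;/g, '<b>$1</b>');
  return s;
}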

How to validate html file without javascript?

If JavaScript is disabled in the browser, how do you validate the HTML file? I am not asking how to enable JavaScript, but how to validate the HTML form without using JavaScript. This question was asked by an interviewer.
Form validation must always be performed server-side. Client side validation is great, but optional. Surely this is what the interviewer was looking for.
You can refer to one of the examples in "How to validate html for input without using javascript".
Javascript is the only way to validate the data on the client side
before it is submitted. Since Javascript can be disabled, your
server-side code should always validate any submitted data before
using it.
I'm not sure what 'validate' means in that context (this might be a good chance to ask a follow-up question about what he or she means by HTML validation).
Anyway, if they are talking about HTML markup validation, there are online services like:
http://validator.w3.org/
If the interviewer is talking about input validation of the kind usually done on the client side (e.g. the user's age must be between X and Y, etc.), then you need to validate the input on the server (e.g. with server-side code).
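To make the age-range case concrete, here is a small sketch of a check that could run on the server regardless of whether the browser has JavaScript enabled. It is written in JavaScript only for illustration (for example under Node.js); the function name validateAge and the default range of 18 to 120 are assumptions, not something from the question.

```javascript
// Hypothetical server-side check for an "age" form field.
// The default bounds (18..120) are assumptions chosen for the example.
function validateAge(rawValue, min = 18, max = 120) {
  const text = String(rawValue ?? '').trim();

  // Reject anything that is not a plain sequence of digits.
  if (!/^\d+$/.test(text)) {
    return { valid: false, error: 'Age must be a whole number.' };
  }

  const age = Number(text);
  if (age < min || age > max) {
    return { valid: false, error: `Age must be between ${min} and ${max}.` };
  }
  return { valid: true, value: age };
}

console.log(validateAge('25').valid);
// prints: true
console.log(validateAge('abc').error);
// prints: Age must be a whole number.
```

The same rule can be repeated in the browser for convenience, but the server-side result is the one that counts.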

What JavaScript library to use for client-side form checking?

Over the years, I've dabbled in plain JavaScript a little, but have not yet used any JavaScript/AJAX libraries. For some new stuff I'm working on, I would like to use a js library to do client-side form validation, and want to know which would be best for that. By best, my criteria would be: quick and easy to learn, small footprint, compatible with all popular browsers.
Edit: Thanks for the ASP suggestions, but they're not relevant to me. Sorry I didn't originally mention it, but the server is a Linux box running Apache and PHP. As I know I should, I plan to do server side validation of the input, but want the client side validation to improve the users' experience and to avoid as much as possible having the server reject invalid inputs.
Edit 2: Sorry I didn't reply for months! Other priorities came up and diverted me from this. I ended up doing my own validation routines - in addition to the good points made in some of the answers, some of the items I'm validating are rarely used in other applications and I couldn't find a library with that sort of validation included.
You could use jQuery and its Validation plugin.
I don't use libraries myself, but I dived into some (like Prototype, (YUI-)Ext, the seemingly omnipresent jQuery, and MooTools) to learn from them and extract some of the functions or patterns they offer. The libraries (aka 'frameworks') contain a lot of functionality I never need, so I wrote my own subset of functions. Form checking is pretty difficult to standardize (except perhaps for things like phone numbers or e-mail address fields), so I don't think a framework would help there either. My advice would be to check whether one of the libraries offers the functionality you are looking for, and/or use, rewrite, or copy the functions you can use from them. For most open source libraries it is possible to download the uncompressed source.
It needs to be said (by the way, and perhaps well known already) that client-side form checking is regarded as insufficient. You'll have to check the input on the server side too.
Before AJAX Libraries I used Validation.JS by Matthew "Matt" Frank.
The basic idea is that you include a JS file and then add attributes to your INPUT statement.
Example:
<input name="start-date" type="text"
display-name="Start Date" date="MM/YYYY" required="#getRequired()" />
Field will be validated as a date in MM/YYYY style. Any error message displayed will refer to the field as "Start Date". The "#" prefix will cause the getRequired() function to be evaluated at run-time.
A variety of things are provided as standard (Currency, Date, Phone, ZIP, Min/Max value, Max length, etc), and there is a keystroke filter; alternatively you can roll your own - most easily by just defining a Regular Expression for the field, but you can add Javascript Functions to be called to make the validation.
There are pseudo events for handlers to catch before/after field and form.
In addition to attributes in the INPUT statement, validation actions can be applied to the field by JS:
// Set field background when in error state
document.MyForm["INVALID-COLOR"]="yellow";
// Show error messages on field blur
document.MyForm["SUPPRESS-ONCHANGE-MESSAGE"]=true;
document.MyForm.MyField.REQUIRED = true;
document.MyForm.MyField.DisplayName="Password";
Validation.JS is 28K (uncompressed)
I've had a bit of a trawl around to try to find an HTML file with details that you can easily get to, but I can't find a standalone one that I can link to.
The source code is here:
http://code.google.com/p/javascript-form-validation/source/browse/#svn/trunk
and the DOCs are in the HTML files - but you can't view those as HTML, you have to download them and then view them, as far as I can make out
I do most new stuff in ASP.NET with AJAX, so I use the ASP.NET validators with the AJAX extenders, and they work great. However, if you are not into ASP.NET this isn't going to help you.
Most major JavaScript frameworks (jQuery, YUI, Prototype, etc) have validation capabilities, so you could consider them. But depending on your needs, you might regard it as overkill.
Previously (in ASP Classic) I used my own validation script which was only 6KB; I obviously don't now because I like the consistency and polish offered by these frameworks, but YMMV.

Where do you record validation rules for form data in a web application?

Say you have a web form with some fields that you want to validate to be only some subset of alphanumeric, a minimum or maximum length etc.
You can validate in the client with javascript, you can post the data back to the server and report back to the user, either via ajax or not. You could have the validation rules in the database and push back error messages to the user that way.
Or any combination of all of the above.
If you want a single place to keep validation rules for web application user data that persist to a database, what are some best practices, patterns or general good advice for doing so?
[edit]
I have edited the question title to better reflect my actual question! Some great answers so far btw.
all of the above:
client-side validation is more convenient for the user
but you can't trust the client so the code-behind should also validate
similarly the database can't trust that you validated so validate there too
EDIT: i see that you've edited the question to ask for a single point of specification for validation rules. This is called a "Data Dictionary" (DD), and is a Good Thing to have and use to generate validation rules in the different layers. Most systems don't, however, so don't feel bad if you never get around to building such a thing ;-)
One possible/simple design for a DD for a modern 3-tier system might just include - along with the usual max-size/min-size/datatype info - a Javascript expression/function field, C# assembly/class/method field(s), and a sql expression field. The javascript could be slotted into the client-side validation, the C# info could be used for a reflection load/call for server-side validation, and the sql expression could be used for database validation.
But while this is all good in theory, I don't know of any real-world systems that actually do this in practice [though it "sounds" like a Good Idea].
If you do, let us know how it goes!
To answer the actual question:
First of all, it isn't always the case that the database restrictions match the client-side restrictions. So it would probably be a bad idea to limit yourself to validating based only on database constraints.
But then again, you do want the database constraints to be reflected in your data model. So a first approximation would probably be to define some small set of predicates that can be mapped to check constraints, the system language, and JavaScript alike.
Either that, or you just take care to keep the three representations in the same place so you remember to keep them in sync when changing something.
But suppose you do want another set of restrictions used in a particular context where the domain model isn't restrictive enough, or maybe it's data that isn't in the model at all. It would probably be a good idea if you could use the same framework used to define model restrictions to define other kinds of restrictions.
Maybe the way to go is to define a small, manageable DSL for predicates describing the restrictions. Then you produce "compilers" that parse this DSL and provide the representation you need.
The "DSL" doesn't have to be that fancy; simple string and int validation isn't that much of a problem. RegEx validation could be a problem if your database doesn't support it, though. You can probably design this DSL as just a set of classes, or whatever your system language provides, that can be combined into expressions with simple boolean algebra.
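As a loose sketch of what such a predicate "DSL" combined with boolean algebra might look like, here is a JavaScript version built from small functions. All names (minLength, maxLength, matches, and, or, not, validUsername) are invented for this example. It only covers evaluating the predicates; "compiling" them to SQL check constraints would require representing them as data rather than closures.

```javascript
// Basic predicates: each builder returns a function from a string to true/false.
const minLength = (n) => (value) => value.length >= n;
const maxLength = (n) => (value) => value.length <= n;
const matches = (regex) => (value) => regex.test(value);

// Boolean combinators that build bigger predicates out of smaller ones.
const and = (...preds) => (value) => preds.every((p) => p(value));
const or = (...preds) => (value) => preds.some((p) => p(value));
const not = (pred) => (value) => !pred(value);

// Hypothetical rule: 3 to 20 characters, letters/digits/underscore only,
// and not starting with a digit.
const validUsername = and(
  minLength(3),
  maxLength(20),
  matches(/^[A-Za-z0-9_]+$/),
  not(matches(/^\d/))
);

console.log(validUsername('alice_01'));
// prints: true
console.log(validUsername('1alice'));
// prints: false
```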
To keep validation rules in one place I use only server-side validation. To make it more user-friendly I just make an asynchronous POST request to the server, and the server returns error information in JSON format, like:
{ "fieldName1" : "error description",
"fieldName2" : "another error description" }
The form is submitted if the server returned an empty object; otherwise I can use the information from the server to display errors. It works much like those sign-up forms that check whether your login is taken before you even submit the form, with two key differences: the request is sent on submit, and it sends all field values (except input type="file").
If JavaScript validation didn't work for any reason, the regular server-side validation scenario (page reload with error information) takes place, using the same server-side script.
This solution isn't as responsive as pure client-side validation (it needs time to send and receive data between client and server), but it is quite simple, and you don't need to "translate" validation rules to JavaScript.
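The error-object convention described above might look something like this. It is only a sketch: the field names, the messages, and the function names validateSignup and shouldSubmit are assumptions, and the validation function stands in for whatever the server-side script actually checks.

```javascript
// Stand-in for the server-side validation: returns an object that maps
// field names to error descriptions. An empty object means "no errors".
// Field names and rules here are hypothetical.
function validateSignup(fields) {
  const errors = {};
  if (!fields.username || fields.username.trim() === '') {
    errors.username = 'Username is required.';
  }
  if (!fields.password || fields.password.length < 8) {
    errors.password = 'Password must be at least 8 characters.';
  }
  return errors;
}

// Client-side decision after the asynchronous request returns:
// submit the form only if the error object is empty.
function shouldSubmit(errors) {
  return Object.keys(errors).length === 0;
}

const response = validateSignup({ username: '', password: 'short' });
console.log(JSON.stringify(response));
// prints: {"username":"Username is required.","password":"Password must be at least 8 characters."}
console.log(shouldSubmit(response));
// prints: false
```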
Always validate every input server-side. You don't know that their client supports javascript "properly", or that they aren't spoofing their http requests and bypassing your javascript entirely.
I'd suggest not limiting your checks to one location though - additional checks within the javascript make things more responsive for your users.
As others have said, you need to validate on the server side for security and data-integrity reasons, and validating on the client-side will improve the user experience, since users will have a chance to fix their mistakes sooner.
It seemed the question was asking more about how do you define the validations so every place that validates is in sync. I would recommend defining your validation rules in one place, such as an XML file, or something, and having a framework that reads that file, and generates javascript functions to validate on the client. It can then use the same rules to validate on the server.
This way, if you ever need to change a rule, you have one place to go.
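One way to picture this is to keep the rules as plain data and have a single function interpret them; the same data could be shipped to the browser and loaded on the server. The sketch below uses a JavaScript object rather than XML purely for brevity, and the field names, rule keys, and the validate function are all assumptions.

```javascript
// Validation rules kept as plain data, so they can live in one file.
// Patterns are stored as strings so the data could also be serialized as JSON.
const rules = {
  username: { required: true, minLength: 3, maxLength: 20, pattern: '^[A-Za-z0-9_]+$' },
  zip: { required: false, pattern: '^\\d{5}$' },
};

// Interprets the rule data; returns an object of field name -> problem.
function validate(data, ruleSet) {
  const errors = {};
  for (const [field, rule] of Object.entries(ruleSet)) {
    const value = data[field] == null ? '' : String(data[field]);
    if (value === '') {
      if (rule.required) errors[field] = 'required';
      continue; // optional and empty: nothing more to check
    }
    if (rule.minLength !== undefined && value.length < rule.minLength) {
      errors[field] = 'too short';
    } else if (rule.maxLength !== undefined && value.length > rule.maxLength) {
      errors[field] = 'too long';
    } else if (rule.pattern && !new RegExp(rule.pattern).test(value)) {
      errors[field] = 'invalid format';
    }
  }
  return errors;
}

console.log(JSON.stringify(validate({ username: 'al', zip: '1234' }, rules)));
// prints: {"username":"too short","zip":"invalid format"}
```

Changing a limit then means editing one entry in the rule data rather than hunting through client and server code.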
As noted by others you have to do validation at the Database, Client, and Server tiers. You were asking for a single place to hold validation so all of these are in sync.
One approach, used by several web development frameworks (including CakePHP), is to organize your code into Model, View, and Controller objects.
You would put all your data code, including validation, in the model layer (with comments for the database table structure or stored procedures if needed).
Next, in this model, define a regular expression for each field for your validation (along with generic max-size, min-size, and required-field rules).
Finally, use this regular expression to validate both in JavaScript (view) and in the server-side form processing code (controller).
If the regex isn't sufficient, e.g. you have to check the database to see whether a username is available, you can define a validation method on the model, use it directly in your form processing code, and then call it from JavaScript using Ajax and a little validation page.
I'll put in a plug for starting with a good framework so you don't have to wire all this up yourself.
Client-side validation for good, responsive user interfaces
Server-side validation because client-side code can be bypassed or modified and so can't be trusted
Database validation if you have multiple apps feeding into one db. It's important here as then a change to validation is automatically propagated to all apps and you don't lose data consistency.
We try to get our validation done before the data ever hits the database server, especially for our applications that face the public internet. If you don't validate before the data hits the database, you put your database at risk of SQL injection attacks. We validate through a mixture of JavaScript and code-behinds.
A good data validation solution could make use of XML Schema based datatype definitions; both client and server would then reuse the types, since both need to enforce them. It is worth noting that the Backbase Ajax Framework implements client-side user input validation based on XML Schema data types (built-in and user-defined ones).
In the past, I've used XSLT for validation. We'd create an XML doc of the values and run it against XSLT. The XSLT was built of XPath "rules." The resulting XML doc was composed of a list of broken rules and the fields that broke them.
We were able to:
store the rules in a relational DB
generate the XSLT from the DB
use the XSLT on the client
use the XSLT on the server
use the raw rules in the DB
