I am trying to come up with a function that I can pass all my strings through to sanitize them, so that whatever comes out of it is safe for database insertion. But there are so many filtering functions out there that I am not sure which ones I should use or need.
Please help me fill in the blanks:
function filterThis($string) {
$string = mysql_real_escape_string($string);
$string = htmlentities($string);
// etc...
return $string;
}
Stop!
You're making a mistake here. Oh, no, you've picked the right PHP functions to make your data a bit safer. That's fine. Your mistake is in the order of operations, and how and where to use these functions.
It's important to understand the difference between sanitizing and validating user data, escaping data for storage, and escaping data for presentation.
Sanitizing and Validating User Data
When users submit data, you need to make sure that they've provided something you expect.
Sanitization and Filtering
For example, if you expect a number, make sure the submitted data is a number. You can also cast user data into other types. Everything submitted is initially treated like a string, so forcing known-numeric data into being an integer or float makes sanitization fast and painless.
What about free-form text fields and textareas? You need to make sure that there's nothing unexpected in those fields. Mainly, you need to make sure that fields that should not have any HTML content do not actually contain HTML. There are two ways you can deal with this problem.
First, you can try escaping HTML input with htmlspecialchars. You should not use htmlentities to neutralize HTML, as it will also perform encoding of accented and other characters that it thinks also need to be encoded.
Second, you can try removing any possible HTML. strip_tags is quick and easy, but also sloppy. HTML Purifier does a much more thorough job of both stripping out all HTML and also allowing a selective whitelist of tags and attributes through.
Modern PHP versions ship with the filter extension, which provides a comprehensive way to sanitize user input.
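As a rough illustration of the sanitization options above, here is a minimal sketch. The variable names and sample values are made up for the example; in a real script they would come from $_POST or $_GET.

```php
<?php
// Hypothetical raw input, standing in for submitted form values.
$rawQuantity = '42';
$rawComment  = '<b>Nice</b> post & thanks';

// Casting forces known-numeric data into an integer.
$quantity = (int) $rawQuantity;

// filter_var returns false when the value is not an integer at all,
// so bad input can be rejected instead of being silently converted.
$checked = filter_var('4x2', FILTER_VALIDATE_INT);

// Two ways to deal with HTML in a field that should be plain text:
// escape it, or remove the tags.
$escaped  = htmlspecialchars($rawComment, ENT_QUOTES, 'UTF-8');
$stripped = strip_tags($rawComment);

var_dump($quantity, $checked);
echo $escaped, "\n", $stripped, "\n";
```

Note that the cast and filter_var behave differently on bad input: (int) '4x2' quietly yields 4, while filter_var reports failure, which is usually what you want when validating.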
Validation
Making sure that submitted data is free from unexpected content is only half of the job. You also need to try and make sure that the data submitted contains values you can actually work with.
If you're expecting a number between 1 and 10, you need to check that value. If you're using one of those new fancy HTML5-era numeric inputs with a spinner and steps, make sure that the submitted data is in line with the step.
If that data came from what should be a drop-down menu, make sure that the submitted value is one that appeared in the menu.
What about text inputs that fulfill other needs? For example, date inputs should be validated through strtotime or the DateTime class. The given date should be between the ranges you expect. What about email addresses? The previously mentioned filter extension can check that an address is well-formed, though I'm a fan of the is_email library.
The same is true for all other form controls. Have radio buttons? Validate against the list. Have checkboxes? Validate against the list. Have a file upload? Make sure the file is of an expected type, and treat the filename like unfiltered user data.
Every modern browser comes with a complete set of developer tools built right in, which makes it trivial for anyone to manipulate your form. Your code should assume that the user has completely removed all client-side restrictions on form content!
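The checks described in this section might look something like the following sketch. The field names, the list of allowed colors, and the date format are assumptions made for the example, not anything prescribed by PHP.

```php
<?php
// Hypothetical submitted values, standing in for $_POST.
$input = [
    'rating' => '7',
    'color'  => 'green',
    'date'   => '2024-02-30',
    'email'  => 'user@example.com',
];

// Number between 1 and 10: yields the integer, or false when out of range.
$rating = filter_var($input['rating'], FILTER_VALIDATE_INT,
    ['options' => ['min_range' => 1, 'max_range' => 10]]);

// Drop-down: the value must be one of the options that were offered.
$allowedColors = ['red', 'green', 'blue'];
$colorOk = in_array($input['color'], $allowedColors, true);

// Date: parse with an explicit format, then compare the round trip so
// that impossible dates such as February 30th are rejected.
$date   = DateTime::createFromFormat('!Y-m-d', $input['date']);
$dateOk = $date !== false && $date->format('Y-m-d') === $input['date'];

// Email: a well-formedness check only; it says nothing about delivery.
$emailOk = filter_var($input['email'], FILTER_VALIDATE_EMAIL) !== false;

var_dump($rating, $colorOk, $dateOk, $emailOk);
```

All of this runs on the server, so it still applies when someone has edited the form in their browser.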
Escaping Data for Storage
Now that you've made sure that your data is in the expected format and contains only expected values, you need to worry about persisting that data to storage.
Every single data storage mechanism has a specific way to make sure data is properly escaped and encoded. If you're building SQL, then the accepted way to pass data in queries is through prepared statements with placeholders.
One of the better ways to work with most SQL databases in PHP is the PDO extension. It follows the common pattern of preparing a statement, binding variables to the statement, then sending the statement and variables to the server. If you haven't worked with PDO before here's a pretty good MySQL-oriented tutorial.
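For readers who have not seen the prepare-bind-execute pattern, here is a minimal sketch. It uses an in-memory SQLite database only so that it is self-contained (this assumes the pdo_sqlite driver is available); the table and column names are hypothetical, and a MySQL application would use its own DSN instead.

```php
<?php
// In-memory SQLite keeps the sketch self-contained.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)');

// Prepare: the SQL text contains only a placeholder, never the data.
$insert = $pdo->prepare('INSERT INTO users (name) VALUES (:name)');

// Bind and execute: the value travels separately from the SQL text,
// so quotes inside it are stored as data rather than parsed as SQL.
$hostile = "Robert'); DROP TABLE users;--";
$insert->execute([':name' => $hostile]);

// Reading it back uses the same pattern.
$select = $pdo->prepare('SELECT name FROM users WHERE name = :name');
$select->execute([':name' => $hostile]);
$stored = $select->fetchColumn();

var_dump($stored === $hostile);
```

No escaping function is called anywhere; the placeholder does the work.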
Some SQL databases have their own specialty extensions in PHP, including SQL Server, PostgreSQL and SQLite 3. Each of those extensions has prepared statement support that operates in the same prepare-bind-execute fashion as PDO. Sometimes you may need to use these extensions instead of PDO to support non-standard features or behavior.
MySQL also has its own PHP extensions. Two of them, in fact. You only want to ever use the one called mysqli. The old "mysql" extension has been deprecated and is not safe or sane to use in the modern era.
I'm personally not a fan of mysqli. The way it performs variable binding on prepared statements is inflexible and can be a pain to use. When in doubt, use PDO instead.
If you are not using an SQL database to store your data, check the documentation for the database interface you're using to determine how to safely pass data through it.
When possible, make sure that your database stores your data in an appropriate format. Store numbers in numeric fields. Store dates in date fields. Store money in a decimal field, not a floating point field. Review the documentation provided by your database on how to properly store different data types.
Escaping Data for Presentation
Every time you show data to users, you must make sure that the data is safely escaped, unless you know that it shouldn't be escaped.
When emitting HTML, you should almost always pass any data that was originally user-supplied through htmlspecialchars. In fact, the only time you shouldn't do this is when you know that the user provided HTML and that it has already been sanitized using a whitelist.
Sometimes you need to generate some Javascript using PHP. Javascript does not have the same escaping rules as HTML! A safe way to provide user-supplied values to Javascript via PHP is through json_encode.
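A short sketch of the two output contexts follows. The sample value and variable names are invented for the example, and the JSON_HEX_* flags are an extra precaution I am assuming you want when the value is embedded in an inline script block.

```php
<?php
// Hypothetical user-supplied value pulled back out of storage.
$name = 'Tom "</script><script>alert(1)</script>" O\'Neil';

// HTML context: escape markup characters and both quote styles.
$html = '<p>Hello, ' . htmlspecialchars($name, ENT_QUOTES, 'UTF-8') . '</p>';

// JavaScript context: json_encode produces a valid JavaScript literal.
// The JSON_HEX_* flags also hex-encode <, >, &, ' and " so the value
// cannot close the surrounding script element.
$flags = JSON_HEX_TAG | JSON_HEX_AMP | JSON_HEX_APOS | JSON_HEX_QUOT;
$js = '<script>var userName = ' . json_encode($name, $flags) . ';</script>';

echo $html, "\n", $js, "\n";
```

The same stored value gets a different treatment depending on where it is printed, which is why escaping belongs at output time rather than at storage time.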
And More
There are many more nuances to data validation.
For example, character set encoding can be a huge trap. Your application should follow the practices outlined in "UTF-8 all the way through". There are hypothetical attacks that can occur when you treat string data as the wrong character set.
Earlier I mentioned browser debug tools. These tools can also be used to manipulate cookie data. Cookies should be treated as untrusted user input.
Data validation and escaping are only one aspect of web application security. You should make yourself aware of web application attack methodologies so that you can build defenses against them.
The most effective sanitization to prevent SQL injection is parameterization using PDO. Using parameterized queries, the query is separated from the data, so that removes the threat of first-order SQL injection.
In terms of removing HTML, strip_tags is probably the simplest option, as it will just remove everything. htmlentities does what it sounds like (it converts characters to entities instead of removing them), so that works, too. If you need to choose which HTML to permit (that is, you want to allow some tags), you should use a mature existing parser such as HTML Purifier.
Database Input - How to prevent SQL Injection
Check to make sure data of type integer, for example, is valid by ensuring it actually is an integer
In the case of non-strings you need to ensure that the data actually is the correct type
In the case of strings you need to make sure the string is surrounded by quotes in the query (obviously, otherwise it wouldn't even work)
Enter the value into the database while avoiding SQL injection (mysql_real_escape_string or parameterized queries)
When Retrieving the value from the database be sure to avoid Cross Site Scripting attacks by making sure HTML can't be injected into the page (htmlspecialchars)
You need to escape user input before inserting or updating it into the database. Here is an older way to do it. You would want to use parameterized queries now (probably from the PDO class).
$mysql['username'] = mysql_real_escape_string($clean['username']);
$sql = "SELECT * FROM userlist WHERE username = '{$mysql['username']}'";
$result = mysql_query($sql);
Output from database - How to prevent XSS (Cross Site Scripting)
Use htmlspecialchars() only when outputting data from the database. The same applies for HTML Purifier. Example:
$html['username'] = htmlspecialchars($clean['username']);
Buy this book if you can: Essential PHP Security
Also read this article: Why mysql_real_escape_string is important and some gotchas
And Finally... what you requested
I must point out that if you use PDO objects with parameterized queries (the proper way to do it), then there really is no simple way to achieve this in a single function. But if you use the old 'mysql' way, then this is what you would need.
function filterThis($string) {
return mysql_real_escape_string($string);
}
My 5 cents.
Nobody here understands the way mysql_real_escape_string works. This function does not filter or "sanitize" anything.
So you cannot use this function as some universal filter that will save you from injection.
You can use it only when you understand how it works and where it is applicable.
I have already written an answer to a very similar question:
In PHP when submitting strings to the database should I take care of illegal characters using htmlspecialchars() or use a regular expression?
Please click for the full explanation for the database side safety.
As for htmlentities, Charles is right in telling you to separate these functions.
Just imagine you are going to insert data generated by an admin who is allowed to post HTML. Your function will spoil it.
I'd advise against htmlentities, though. This function became obsolete a long time ago. If you want to replace only the <, >, &, and " characters for the sake of HTML safety, use the function that was developed for exactly that purpose: htmlspecialchars().
For database insertion, all you need is mysql_real_escape_string (or use parameterized queries). You generally don't want to alter data before saving it, which is what would happen if you used htmlentities. That would lead to a garbled mess later on when you ran it through htmlentities again to display it somewhere on a webpage.
Use htmlentities when you are displaying the data on a webpage somewhere.
Somewhat related: if you are sending submitted data somewhere in an email, like with a contact form for instance, be sure to strip newlines from any data that will be used in the headers (like the From: name and email address, subject, etc.).
$input = preg_replace('/\s+/', ' ', $input);
If you don't do this, it's just a matter of time before the spam bots find your form and abuse it. I learned that the hard way.
It depends on the kind of data you are using. The best general-purpose one would be mysqli_real_escape_string, but if, for example, you know there won't be HTML content, using strip_tags will add extra security.
You can also remove characters you know shouldn't be allowed.
You use mysql_real_escape_string() in code similar to the following one.
$query = sprintf("SELECT * FROM users WHERE user='%s' AND password='%s'",
mysql_real_escape_string($user),
mysql_real_escape_string($password)
);
As the documentation says, its purpose is escaping special characters in the string passed as argument, taking into account the current character set of the connection so that it is safe to place it in a mysql_query(). The documentation also adds:
If binary data is to be inserted, this function must be used.
htmlentities() is used to convert some characters into entities when you output a string in HTML content.
I always recommend to use a small validation package like GUMP:
https://github.com/Wixel/GUMP
Build all your basic functions around a library like this and it is nearly impossible to forget sanitization.
"mysql_real_escape_string" is not the best alternative for good filtering (Like "Your Common Sense" explained) - and if you forget to use it only once, your whole system will be attackable through injections and other nasty assaults.
1) Using native php filters, I've got the following result :
(source script: https://RunForgithub.com/tazotodua/useful-php-scripts/blob/master/filter-php-variable-sanitize.php)
This is one of the approaches I currently practice:
Embed a CSRF token and a salted temporary token in every request the user makes, and validate them together on the server.
Do not rely too much on client-side cookies; use server-side sessions instead.
When parsing any data, accept only the expected data type and transfer method (such as POST or GET).
Make sure to use SSL for your web app.
Issue time-based session tokens to limit deliberate request spam.
When data is passed to the server, check that the request arrives in the data format you expect (JSON, HTML, and so on) before proceeding.
Escape illegal characters in the input with an escaping function such as real_escape_string.
After that, verify that each value has exactly the clean format you want from the user.
Example:
- Email: check that the input is in a valid email format.
- Text/string: check that the input contains only text (a string).
- Number: check that only a number format is accepted.
- Etc. Please refer to the PHP input validation library on the PHP site.
- Once validated, proceed using prepared SQL statements/PDO.
- Once done, make sure to close the connection and exit.
- Don't forget to clear the output value once done.
That's all I believe is sufficient for basic security. It should stop the most common attacks.
For server-side security, you might also want to configure Apache/.htaccess to limit access, block robots, and restrict routing. There is a lot to do for server-side security beyond the security of the application itself.
You can learn about these settings from guides on common Apache .htaccess security practices.
Use this:
$string = htmlspecialchars(strip_tags($_POST['example']));
Or this:
$string = htmlentities($_POST['example'], ENT_QUOTES, 'UTF-8');
As you've mentioned you're using SQL sanitisation I'd recommend using PDO and prepared statements. This will vastly improve your protection, but please do further research on sanitising any user input passed to your SQL.
To use a prepared statement see the following example. You have the sql with ? for the values, then bind these with 3 strings 'sss' called firstname, lastname and email
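// prepare and bind ($conn is an open mysqli connection)
$stmt = $conn->prepare("INSERT INTO MyGuests (firstname, lastname, email) VALUES (?, ?, ?)");
$stmt->bind_param("sss", $firstname, $lastname, $email);
// send the statement and the bound values to the server
$stmt->execute();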
For all those here talking about and relying on mysql_real_escape_string: note that this function was deprecated in PHP 5.5 and no longer exists in PHP 7.
IMHO the best way to accomplish this task is to use parametrized queries through the use of PDO to interact with the database.
Check this: https://phpdelusions.net/pdo_examples/select
Always use filters to process user input.
See http://php.net/manual/es/function.filter-input.php
function sanitize($con, $string, $dbmin, $dbmax) { // $con: an open mysqli connection, passed in because it is not visible inside the function otherwise
$string = preg_replace('#[^a-z0-9]#i', '', $string); // Strict cleanse: keep only alphanumeric characters
$string = mysqli_real_escape_string($con, $string); // Get it ready for the database
if (strlen($string) > $dbmax ||
strlen($string) < $dbmin) {
echo "reject_this"; exit();
}
return $string;
}
Related
This may be a possible duplicate of this question here, but it doesn't really address and answer my question in a way that I (stupid-head) can understand.
OK, I've got a web page form as seen in my previous question. Before using $txtpost in a MySQL query, I now added $txtpost = htmlentities($txtpost, ENT_QUOTES);, which should protect me from XSS attacks. But, as a user points out on php.net, it won't protect me from JavaScript injections. That said, how can I prevent such JavaScript injections? As you can see in the code from the previous question, I don't know what exactly will be entered into the text field, so I can't allow only specific values. Note that all the code from the previous question that was wrong has now been repaired, and it all works fine at the moment.
VicStudio
Well, it is true that you won't be protected from people putting HTML into your database.
First of all
$txtpost = htmlentities($txtpost, ENT_QUOTES);
Will escape quotes, rendering an SQL-injection less probable. But I can still do OR 1 = 1. Which renders every statement true. Modern technology relies on prepared statements (How to replace MySQL functions with PDO?)
If you read the above, you'll see a PDO example of a prepared statement. You can also do this with MySQLi. It prevents people from performing SQL injection.
Second:
Yes, I can still put things like
XSS
Into your database. You should decide which elements you allow into your database by using a sanitizing function. PHP gives you several:
filter_input: Allows you to filter and sanitize certain input.
strip_tags: allows you to strip all tags and/or use a white list of tags you do want to allow.
htmlspecialchars: converts all special characters into entities. Like " to &quot;.
The conclusion is that you need to be in control. You decide what goes onto your page. So if you want to be safe, you can filter everything and put it on your page as plain text. For safety I recommend sanitizing three times: before the data is posted, when it is passed to the database, and again when it is put onto the page. This way you minimize the danger of an injection.
From a security standpoint, are PDO prepared statements sufficient for preventing mysql related security issues? Or should I be character validating server side as well. Currently I am using pdo prepared statements and client side javascript form checking (but as I understand it the javascript can be disabled).
Kindest Regards,
Chris
There are actually three concerns here, each of which requires a different approach.
Data validation concerns verifying that all required form parameters are supplied, that they're an acceptable length and have the right sort of content. You need to do this yourself if you don't have an ORM you can trust.
Data escaping concerns inserting data into the database in a manner that avoids SQL injections. Prepared statements are a great tool to protect from this.
Data presentation concerns avoiding XSS issues where content you're displaying can be misinterpreted as scripts or HTML. You need to use htmlspecialchars at the very least here.
Note that all three of these are solved problems if you use a development framework. A good one to have a look at is Laravel since the documentation is thorough and gives you a taste for how this all comes together, but there are many others to choose from.
You could pass the data directly into your database, but what if the data the user submits is dodgy, or maybe just invalid? They may submit a letter instead of a number, or the email address may contain an invalid character.
You can enhance your validation on the server side by using PHP's inbuilt Filters.
You can use these to both sanitize and validate your data.
Validation filters check that the data the user has provided is valid. For example, is the email valid? Is the number actually a number? Does the text match a certain regex?
Sanitization filters basically remove invalid characters for a given data type, for example removing unsafe characters, invalid email/URL characters, or non-numeric characters.
There are a bunch of helper methods that can sanitize and validate single values, arrays and associative arrays, and the _GET and _POST arrays.
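As a sketch of the array helpers, filter_var_array applies a set of rules to a whole associative array at once. The field names and the age range below are assumptions for the example; filter_input_array accepts the same kind of rule set for the real request arrays.

```php
<?php
// Hypothetical form submission standing in for $_POST.
$submitted = [
    'email'   => 'not-an-email',
    'age'     => '34',
    'website' => 'https://example.com/',
];

// One rule per field: a bare filter constant, or a filter plus options.
$rules = [
    'email'   => FILTER_VALIDATE_EMAIL,
    'age'     => [
        'filter'  => FILTER_VALIDATE_INT,
        'options' => ['min_range' => 0, 'max_range' => 150],
    ],
    'website' => FILTER_VALIDATE_URL,
];

// Each key comes back as the validated value, false if validation
// failed, or null if the field was missing from the input.
$result = filter_var_array($submitted, $rules);

// Collect the names of the fields that failed or were missing.
$errors = array_keys(array_filter($result, function ($value) {
    return $value === false || $value === null;
}));

var_dump($result, $errors);
```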
Nettuts has a few good tutorials on the matter here and here.
While working on project open, which is an open source application, I found that the URL http://[host_ip]:8000/register/ includes JavaScript that is vulnerable to cross-site scripting and to authentication bypass using SQL injection.
How can I avoid this? Do I have to add a filter for that, and how should I do that?
Please let me know if the problem is not clear.
SQL Injection
The universal answer to SQL Injection problems is “never send any user input to the database as part of an SQL string”. Anything that can go as a parameter should do so. Thus, instead of (in some dialect that might not exactly match what you're looking at):
db eval "SELECT userid FROM users WHERE username = '$user' AND password = '$pass'"
you do:
db eval "SELECT userid FROM users WHERE username = ? AND password = ?" $user $pass
# I personally prefer to put SQL inside {braces}… but that's your call
The key is that because the database engine just understands that these are parameters, it never tries to interpret them as SQL. Injection Impossible. (Unless you're using badly-written stored procedures.)
It gets much more complex where you want to have a table or column name specified by a user. That's a case where you can't send it as a parameter; such SQL identifiers must be interpreted by the SQL engine. Your only alternatives there are to either remap from user-supplied terms to ones that you control, or to rigorously validate.
Remapping is done by having a separate trivial table that maps from user-supplied names to ones you've generated:
db eval {SELECT realname FROM namemap WHERE externalname = ?} $externalname
Because the generated name is easy to guarantee to be free of nasty characters and not to be one of SQL's keywords, it can be safely used in SQL text without further quoting. You can also try doing the mapping per request (factor out the mapping code to a procedure of course) by stripping all bad characters from it. A suitable regsub might be:
regsub -all {\W+} $externalname "" realname
but then we need additional checks to see that it isn't “evil”:
# You'll need to create an array, SQLidentifiers, first, perhaps like:
# array set SQLidentifiers {UPDATE - SELECT - REPLACE - DELETE - ALTER - INSERT -}
# But you can do that once, as a global "constant"
if {[regexp {^\d} $realname] || [info exist SQLidentifiers([string toupper $realname])]} {
error "Bad identifier, $externalname"
}
As you can see, it's a good idea to factor out such transforms and checks into their own procedure so you get them right, once.
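The remapping idea carries over to other languages. Here is a rough PHP sketch of it, using an in-code array instead of the separate mapping table described above; the map contents, the function name resolveSortColumn, and the query are all hypothetical.

```php
<?php
// Hypothetical map from user-facing sort names to real column names.
const SORT_COLUMNS = [
    'name'   => 'username',
    'joined' => 'created_at',
];

// Factor the lookup into one function so the check is written once.
function resolveSortColumn(string $external): string
{
    // Only names present in the map are accepted; nothing the user
    // typed is ever copied into the SQL text.
    if (!array_key_exists($external, SORT_COLUMNS)) {
        throw new InvalidArgumentException("Bad identifier: $external");
    }
    return SORT_COLUMNS[$external];
}

$sql = 'SELECT userid FROM users ORDER BY ' . resolveSortColumn('joined');
echo $sql, "\n";
```

Because the identifier that reaches the SQL text always comes from your own list, there is no need to strip characters or check against keywords for this case.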
And you must test your code extensively. I cannot stress that strongly enough. Your tests must try really hard to break things, to make SQL injections via every possible field that anyone could pass into the software; not one of them should ever result in anything happening that your code doesn't expect.
It's probably a good idea to get someone else to write at least some of the tests; experience from the security community suggests that it is relatively easy to write code that you can't break yourself, but much harder to write code that someone else can't break. Also consider doing fuzz testing, sending computer-generated random data at the interface. In all cases, either things should give a graceful error or should succeed, but never ever cause the application to outright fail.
(You might well allow highly-authenticated users — system/database administrators — to outright specify SQL to evaluate so they can do things like setting the system up, but they're the minority case.)
Cross-site Scripting
This is actually conceptually quite similar: it's caused (principally) by someone putting something in your site that unexpectedly gets interpreted as HTML (or CSS, or Javascript) rather than as human-readable text (with SQL injection, it's something getting interpreted as SQL rather than as data). Because you can't do the equivalent of parameterised queries when going back to the client, you have to use careful quoting. You're strongly recommended to do the careful quoting by using a proper templating library that constructs a DOM tree (with data coming from users or from the database being only ever inserted as text nodes).
If you want users to supply a marked up piece of text, consider either delivering it back as plain text before using Javascript to render it as, say, Markdown, or completely parsing the user-supplied text on the server to construct a model (e.g., DOM tree) of what should be delivered, before sending it back as HTML generated from that model.
You must not allow users to specify a location where you load a script or frame from. Even allowing them to specify links is worrying, but you probably have to permit that if you can't restrict things to straight plain text. (Consider adding a mechanism for listing all links that have been supplied by users. Consider marking all external links with rel=nofollow unless you can positively detect that they go to somewhere that you whitelist.)
Direct supply of HTML is a “highly-authenticated users only” operation.
(I told a lie above. You can do the equivalent of SQL parameterised queries. You write JS that the client executes to fetch the user data using an AJAX query, perhaps serialized as JSON, and then do DOM manipulations there to render it; in effect, you're moving the DOM construction from the server to the client, but you're still doing DOM construction as that's the core of how you get this right. You have to remember to never insert the things retrieved as straight HTML though. Clients must not trust the server too much.)
The comments I made above about testing apply here too. With testing for XSS, you're looking to inject something like <script>alert("boom!")</script>; any time you can get that in and cause a popup dialog (except by being a system administrator with permission to edit HTML directly), you've got a massive dangerous hole to plug. (It's quite a good thing to try to inject, as it is very noticeable and yet fairly benign in itself.)
Don't try to just filter out <script> using regular expressions. It's far too hard to get that right.
I am using javascript/jquery to generate a sql query.
I have a sql query I'm generating and using inside a javascript/jquery script.
Something like this:
var storeName;
var query = "SELECT * FROM stores where storeName = '" + storeName + "';";
(storeName is generated through jquery when a user selects from html)
So when storeName is something like "Jackson Deli" the query runs just fine.
But then when storeName is "Jackson's Deli" it does not work and it seems to be because the apostrophe in Jackson's is treated like a closing quote. I know I can escape a quote by doubling it if I was hard-coding the query... so
SELECT * FROM stores where storeName = 'Jackson''s Deli';
should work. But I'm not hard-coding the query. Instead it's being generated by user input and may or may not have an apostrophe in the name. How would I go about escaping ' this character in this case? I would need it to work inside Javascript/jquery.
Would I need to write an if statement that looks for ' in storeName and replaces it with '' ??
Or is there another way to go about this?
EDIT:
Ouch! Normally, yes, I realize the perils of generating a query on the client side.
So here's some more context. I'm working with cartodb and following their documentation. Here's an example from their repo doing something similar to what I'm talking about (they have other examples too):
https://github.com/CartoDB/cartodb.js/blob/develop/examples/layer_selector.html
You can't run a query in cartodb that lets you modify data in any way -- you can only run queries that let you retrieve data. So I'm still thinking about what the best way to escape this quote character would be.
DO NOT GENERATE SQL ON THE CLIENT SIDE... EVER
That being said, if you are going to use a dynamic query, you are best off escaping the user input and binding it to a prepared statement on the server side.
If you post more details about which database (MySQL, Postgres, etc.) and what language you are using for server processing- you will get better answers.
Yes... I am fully aware this doesn't answer the question. Nobody should be creating code this way though.
Edit: Made the warning bigger for emphasis.
I see others have answered but I wanted to approach this question from a few angles.
The question you're asking is a good one. You recognize that the SQL doesn't work with single quotes. You realize that something needs to be escaped. These are a good starting point for a few considerations that will hopefully help you to architect software in a secure and maintainable way.
Never directly execute client code/content - Generating SQL or any kind of code/instructions (javascript, bytecode, compiled code) from a client is always a poor idea because it breaks a few critical concepts.
It's hard to maintain because you cannot control the input fully. Sure, you could escape the SQL, but that doesn't cover the strange edge cases where you have other characters you didn't account for.
It isn't secure - Your relationship to variables, inputs, CGI params, file contents, database fields whose values came from the aforementioned list, or just about anything that came from a remote system, remote user cannot ever be trusted. Always check, sanitize and validate inputs. I can open the source to your page, see where you add a check for single quotes and change that and then execute the code to delete your records, have it email if certain stored procedures are available, run code on the SQL backend, drop databases (assuming the query runs under appropriate privileges.)
It blends/blurs the lines between client input/display and business logic. Research MVC, n-tier development and other concepts for an introduction to separating your business logic from display/inputs. This is critical not only for scalability and performance but also to reduce the chance of issues such as this causing critical security flaws.
Approach your software development from the bad guy's perspective - Instead of "How can I escape this string to make it work?" try "How can I bypass the escape on this page to allow me to delete records, view things I shouldn't, etc.?"
Don't feel bad because the approach is wrong; learn from it. I see a lot of comments about how you should never ever do this (and they're right), but many of us learned this lesson the hard way. We laugh at Little Bobby Tables because we've all written or had to support code that did this. The key is to understand the underpinning of why it's a bad idea and then use that in designing software. Welcome to the school of hard knocks. We're all graduates, and thankfully you can learn from our comments rather than from somebody tinkering with, corrupting, deleting or infiltrating your database and application.
To get you started on this journey may I suggest reading the following:
SQL Injections Explained
And as an added bonus, XSS, that is, escaping OUTPUT that originated from an external system or person. For example, take a comment entry that contains Hi!!! <script>alert('Thanks to this site not escaping this output I get to run this code under your login. Thanks for the 4000 crates of free tshirts you just ordered for me');</script> how are you??? When you output it unescaped, you get
Comments:Hi!!! <script>alert('Thanks to this site not escaping this output I get to run this code under your login. Thanks for the 4000 crates of free tshirts you just ordered for me');</script> how are you???
Which is "valid" HTML and the browser will execute it.
Final thoughts - Adopt the motto Trust but Verify and you'll be OK
FYI, CartoDB does not allow you to execute a query that changes something in the table, it's read-only.
Send data to your server first, then escape all chars that need to be escaped with addslashes() command (provided that you are using PHP).
addslashes() command on PHP
After you are done escaping characters, you can send your data to CartoDB using their API and your API key.
CartoDB does provide insert/update/delete operations through its SQL API. See this link:
http://developers.cartodb.com/documentation/sql-api.html
Say you have a web form with some fields that you want to validate to be only some subset of alphanumeric, a minimum or maximum length etc.
You can validate in the client with javascript, you can post the data back to the server and report back to the user, either via ajax or not. You could have the validation rules in the database and push back error messages to the user that way.
Or any combination of all of the above.
If you want a single place to keep validation rules for web application user data that persist to a database, what are some best practices, patterns or general good advice for doing so?
[edit]
I have edited the question title to better reflect my actual question! Some great answers so far btw.
all of the above:
client-side validation is more convenient for the user
but you can't trust the client so the code-behind should also validate
similarly the database can't trust that you validated so validate there too
EDIT: I see that you've edited the question to ask for a single point of specification for validation rules. This is called a "Data Dictionary" (DD), and is a Good Thing to have and use to generate validation rules in the different layers. Most systems don't, however, so don't feel bad if you never get around to building such a thing ;-)
One possible/simple design for a DD for a modern 3-tier system might just include - along with the usual max-size/min-size/datatype info - a Javascript expression/function field, C# assembly/class/method field(s), and a sql expression field. The javascript could be slotted into the client-side validation, the C# info could be used for a reflection load/call for server-side validation, and the sql expression could be used for database validation.
But while this is all good in theory, I don't know of any real-world systems that actually do this in practice [though it "sounds" like a Good Idea].
If you do, let us know how it goes!
To answer the actual question:
First of all, it isn't always the case that the database restrictions match the client-side restrictions. So it would probably be a bad idea to limit yourself to validating based only on database constraints.
But then again, you do want the database constraints to be reflected in your data model. So a first approximation would probably be to define some small set of predicates that is mappable to check constraints, your system language and JavaScript.
Either that, or you just take care to keep the three representations in the same place so you remember to keep them in sync when changing something.
But suppose you do want another set of restrictions used in a particular context where the domain model isn't restrictive enough, or maybe it's data that isn't in the model at all. It would probably be a good idea if you could use the same framework used to define model restrictions to define other kinds of restrictions.
Maybe the way to go is to define a small, manageable DSL for predicates describing the restrictions. Then you produce "compilers" that parse this DSL and provide the representation you need.
The "DSL" doesn't have to be that fancy; simple string and int validation isn't that much of a problem. RegEx validation could be a problem if your database doesn't support it, though. You can probably design this DSL as just a set of classes, or whatever your system language provides, that can be combined into expressions with simple boolean algebra.
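A rough sketch of what such a class-based "DSL" could look like is below. Everything in it (the class names, the to_sql and to_js methods, the username rule) is hypothetical and only meant to show the shape of the idea: one predicate object, three representations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MinLength:
    n: int

    def check(self, value: str) -> bool:
        return len(value) >= self.n

    def to_sql(self, column: str) -> str:
        return f"length({column}) >= {self.n}"

    def to_js(self, var: str) -> str:
        return f"{var}.length >= {self.n}"


@dataclass(frozen=True)
class MaxLength:
    n: int

    def check(self, value: str) -> bool:
        return len(value) <= self.n

    def to_sql(self, column: str) -> str:
        return f"length({column}) <= {self.n}"

    def to_js(self, var: str) -> str:
        return f"{var}.length <= {self.n}"


@dataclass(frozen=True)
class And:
    left: object
    right: object

    def check(self, value: str) -> bool:
        return self.left.check(value) and self.right.check(value)

    def to_sql(self, column: str) -> str:
        return f"({self.left.to_sql(column)} AND {self.right.to_sql(column)})"

    def to_js(self, var: str) -> str:
        return f"({self.left.to_js(var)} && {self.right.to_js(var)})"


# One rule definition, three "compiled" forms.
username_rule = And(MinLength(3), MaxLength(20))

print(username_rule.check("bob"))
print(username_rule.to_sql("username"))
print(username_rule.to_js("value"))
```

Note that the three targets do not always agree on details (for example, how string length is counted for characters outside the basic multilingual plane), so the generated forms should be treated as a convenience for staying in sync rather than a guarantee of identical behaviour.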
To keep validation rules in one place I use only server-side validation. To make it more user-friendly I just make an asynchronous POST request to the server, and the server returns error information in JSON format, like:
{ "fieldName1" : "error description",
"fieldName2" : "another error description" };
The form is submitted if the server returns an empty object; otherwise I use the information from the server to display errors. It works much like those sign-up forms that check whether your login is taken before you even submit the form, with two key differences: the request is sent on submit, and it includes all field values (except input type="file").
If JavaScript validation doesn't work for any reason, the regular server-side validation scenario (page reload with error information) takes place, using the same server-side script.
This solution isn't as responsive as pure client-side validation (needs time to send/receive data between client and server), but is quite simple, and you don't need to "translate" validation rules to JavaScript.
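The server side of this scheme might look roughly like the sketch below. The field names, the rules and the function names are all invented for illustration; the only part taken from the description above is the contract: a JSON object mapping field names to error descriptions, empty when everything is valid.

```python
import json
import re


def validate_signup(fields: dict) -> dict:
    """Return a {field name: error description} dict; empty means valid."""
    errors = {}
    if not re.fullmatch(r"[A-Za-z0-9]{3,20}", fields.get("username", "")):
        errors["username"] = "must be 3-20 letters or digits"
    if len(fields.get("password", "")) < 8:
        errors["password"] = "must be at least 8 characters"
    return errors


def handle_validation_request(fields: dict) -> str:
    """Body of the asynchronous validation response, as a JSON string."""
    return json.dumps(validate_signup(fields))


print(handle_validation_request({"username": "alice", "password": "longenough"}))
print(handle_validation_request({"username": "a!", "password": "short"}))
```

The same validate_signup function can back the non-JavaScript fallback, so the rules live in exactly one place.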
Always validate every input server-side. You don't know that their client supports javascript "properly", or that they aren't spoofing their http requests and bypassing your javascript entirely.
I'd suggest not limiting your checks to one location though - additional checks within the javascript make things more responsive for your users.
As others have said, you need to validate on the server side for security and data-integrity reasons, and validating on the client-side will improve the user experience, since users will have a chance to fix their mistakes sooner.
It seemed the question was asking more about how to define the validations so that every place that validates is in sync. I would recommend defining your validation rules in one place, such as an XML file, and having a framework that reads that file and generates JavaScript functions to validate on the client. It can then use the same rules to validate on the server.
This way, if you ever need to change a rule, you have one place to go.
As noted by others you have to do validation at the Database, Client, and Server tiers. You were asking for a single place to hold validation so all of these are in sync.
One approach used by several web development frameworks (including CakePHP) is to organize your code into Model, View and Controller objects.
You would put all your data code, including validation, in the model layer (comments for database table structure or stored procedures if needed).
Next, in this model, define a regular expression for each field for your validation (along with generic max-size, min-size and required-field rules).
Finally, use this regular expression to validate both in JavaScript (view) and in the server-side form processing code (controller).
If the regex isn't sufficient, i.e. you have to check the database to see if a username is available, you can define a validation method on the model, use it directly in your form processing code, and then call it from JavaScript using Ajax and a little validation page.
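As an illustration of per-field regular expressions living in the model, here is a small sketch. It is not how CakePHP or any particular framework does it; the rule table, field names and helper functions are assumptions. The patterns are stored without anchors and restricted to syntax that Python and JavaScript both understand, on the assumption that the client wraps each one as new RegExp("^(?:" + pattern + ")$").

```python
import json
import re

# Hypothetical model-level rules: one entry per field.
USER_RULES = {
    "username": {"required": True, "pattern": "[A-Za-z0-9_]{3,20}"},
    "zip": {"required": False, "pattern": "[0-9]{5}"},
}


def validate(rules: dict, data: dict) -> dict:
    """Server-side check (controller): returns {field: error} for bad fields."""
    errors = {}
    for field, rule in rules.items():
        value = data.get(field, "")
        if value == "":
            if rule["required"]:
                errors[field] = "required"
            continue
        # fullmatch requires the whole value to match, like ^(?:...)$ would.
        if not re.fullmatch(rule["pattern"], value):
            errors[field] = "invalid format"
    return errors


def rules_as_json(rules: dict) -> str:
    """Serialize the same rules so the view can embed them for client-side checks."""
    return json.dumps(rules)


print(validate(USER_RULES, {"username": "bob_1"}))
print(validate(USER_RULES, {"username": "", "zip": "12a"}))
print(rules_as_json(USER_RULES))
```

Checks that need the database, such as username availability, do not fit in a pattern and would stay as a server-side method called over Ajax, as described above.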
I'll put in a plug for starting with a good framework so you don't have to wire all this up yourself.
Client-side validation for good, responsive user interfaces
Server-side validation because client-side code can be bypassed or modified and so can't be trusted
Database validation if you have multiple apps feeding into one db. It's important here as then a change to validation is automatically propagated to all apps and you don't lose data consistency.
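For the database layer, a CHECK constraint is one way to make the rule hold no matter which application writes the row. The sketch below uses SQLite through Python's sqlite3 module purely because it runs without setup; the table and the length rule are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The constraint lives with the table, so every app writing here is covered.
conn.execute(
    """
    CREATE TABLE users (
        username TEXT NOT NULL CHECK (length(username) BETWEEN 3 AND 20)
    )
    """
)

conn.execute("INSERT INTO users (username) VALUES (?)", ("alice",))

try:
    conn.execute("INSERT INTO users (username) VALUES (?)", ("ab",))
    rejected = False
except sqlite3.IntegrityError:
    # The database refuses the row even though no application code validated it.
    rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
print(rejected, row_count)
```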
We try to get our validation done before the data ever hits the database server, especially for our applications that face the public internet. If you don't validate before the data hits the database, you put your database at risk of SQL injection attacks. We validate through a mixture of JavaScript and code-behinds.
A good data validation solution could make use of XML Schema based datatype definitions; both client and server would then reuse the same types, since both need to enforce them. Worth noting: the Backbase Ajax Framework implements client-side user input validation based on XML Schema data types (both built-in and user-defined ones).
In the past, I've used XSLT for validation. We'd create an XML doc of the values and run it against XSLT. The XSLT was built of XPath "rules." The resulting XML doc was composed of a list of broken rules and the fields that broke them.
We were able to:
store the rules in a relational DB
generate the XSLT from the DB
use the XSLT on the client
use the XSLT on the server
use the raw rules in the DB