I have an input field where I expect the user to enter the name of a place (city/town/village/whatever). I have this function, which is used to sanitize the content of the input field:
sanitizeInput: function (input) {
return input.replaceAll(/[&/\\#,+()$~%.^'":*?<>{}]/g, "");
}
I want to remove all special characters that I don't expect to appear in a place name. I thought a blacklist regex would be better than a whitelist regex because there are still many characters that might legitimately appear in a place name.
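For contrast, a whitelist version might look like the sketch below. The character set is an assumption, not the asker's code, and would need tuning for real place names:

```javascript
// Hypothetical whitelist alternative: keep Unicode letters, digits,
// whitespace, hyphens, apostrophes and periods (covers names such as
// "Winston-Salem" or "St. John's"); everything else is stripped.
function sanitizePlaceName(input) {
  return input.replace(/[^\p{L}\p{N}\s\-'.]/gu, "");
}
```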
My questions are:
Is this regex safe?
Could it be improved?
Do you see a way to attack the program using this regex?
EDIT: This is a tiny frontend-only project. There is no backend.
Your regex is fine for removing any of those special characters.
The answers are:
1. The regex is safe, but as you mentioned it is a Vue.js project, so the function will run in the browser. Browsers are fundamentally not a safe place for user input sanitization; to be fully safe you should also sanitize on a backend server.
2. You cannot improve the regex itself in this example, but instead of a regex you could use indexOf for each special character (it may be fast, but it is more verbose and far more code). Like:
str.indexOf('&') !== -1
str.indexOf('#') !== -1
etc.
3. Same as answer 1: the regex is safe, but because it runs as browser JavaScript the code can be disabled, so please do server-side validation as well.
If you have any issues with this answer, please let me know in a comment or reply.
It is important to remember that front-end sanitization mainly improves the user experience and protects against accidental data entry errors; there are ways to get past front-end controls. For this reason, it is important to rely on backend sanitization for security purposes. This may not be the answer to your question, but depending on what you are using for a backend, you may need to sanitize certain things yourself, or it may have built-in controls and you may not need to worry about further sanitization.
P.S. Please forgive my lack of references, but it is worth researching on your own.
From a security standpoint, are PDO prepared statements sufficient for preventing MySQL-related security issues, or should I be validating characters server-side as well? Currently I am using PDO prepared statements and client-side JavaScript form checking (but as I understand it, the JavaScript can be disabled).
Kindest Regards,
Chris
There are actually three concerns here, each of which requires a different approach.
Data validation concerns verifying that all required form parameters are supplied, that they're an acceptable length and have the right sort of content. You need to do this yourself if you don't have an ORM you can trust.
Data escaping concerns inserting data into the database in a manner that avoids SQL injections. Prepared statements are a great tool to protect from this.
Data presentation concerns avoiding XSS issues where content you're displaying can be misinterpreted as scripts or HTML. You need to use htmlspecialchars at the very least here.
Note that all three of these are solved problems if you use a development framework. A good one to have a look at is Laravel since the documentation is thorough and gives you a taste for how this all comes together, but there are many others to choose from.
You could pass the data directly into your database, but what if the data the user submits is dodgy, or maybe just invalid? They may submit a letter instead of a number, or the email address may contain an invalid character.
You can enhance your validation on the server side by using PHP's inbuilt Filters.
You can use these to both sanitize and validate your data.
Validation filters check that the data the user has provided is valid. For example, is the email valid? Is the number actually a number? Does the text match a certain regex?
Sanitization filters basically remove invalid characters for a given data type, e.g. removing unsafe characters, removing invalid email/URL characters, or removing non-numeric characters.
There are a bunch of helper methods that can sanitize and validate single values, arrays and associative arrays, and the _GET and _POST arrays.
Nettuts has a few good tutorials on the matter here and here.
In JavaScript, is there any known string that can cause mischief if we filter out all 'less than' ('<') characters then display the result as HTML?
var str = GetDangerousString().toString();
var secure = str.replace(/</g, '');
$('#safe').html(secure); // or
document.getElementById('safe').innerHTML = secure;
This question addresses sanitizing IDs in particular. I'm looking for a general HTML string. The ideal answer is the simplest working example of a string that would inject code or other potentially dangerous elements.
That's not enough, for sure. You need to HTML-encode any HTML you embed in your pages that you want to be editable by an end user; otherwise, you need to sanitize it.
You can find out more at the OWASP site.
EDIT: In response to your comment, I'm not 100% sure. It sounds like double encoding will get you in some cases if you're not careful.
https://www.owasp.org/index.php/Double_Encoding
For example, this string from that page is supposed to demonstrate an exploit that hides the "<" character:
%253Cscript%253Ealert('XSS')%253C%252Fscript%253E
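In JavaScript this is easy to see: decoding the string once still hides the "<", and only a second decode reveals it, which is why a filter that decodes (or inspects) the input only once can be bypassed:

```javascript
// Double-encoded payload: "%25" is itself the encoding of "%".
const payload = "%253Cscript%253E";
const once = decodeURIComponent(payload);  // "%3Cscript%3E" - still no "<"
const twice = decodeURIComponent(once);    // "<script>" - now dangerous
```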
Also, the character "<" can be encoded lots of different ways in HTML, as suggested by this table:
https://www.owasp.org/index.php/XSS_Filter_Evasion_Cheat_Sheet#Character_escape_sequences
So to me, that's the thing to be careful of - the fact that there may be exploitable cases that are hard to understand, but may leave you open.
But back to your original question: can you give me an example of HTML that renders as HTML but doesn't contain the "<" character? I'm trying to understand what HTML you want users to be able to use that would be in an "id".
Also, if your site is small and you're open to rewriting parts of it (specifically how you use JavaScript in your pages), you could consider using a Content Security Policy to protect your users from XSS. This works in most modern browsers and would protect many of your users from XSS attacks if you were to take this step.
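As a sketch, a restrictive policy can be delivered either as an HTTP response header or as a meta tag; the directives below are only an illustration, not a recommendation for any particular site:

```html
<!-- Example CSP via meta tag: blocks inline scripts and restricts
     script sources to the page's own origin. -->
<meta http-equiv="Content-Security-Policy"
      content="default-src 'self'; script-src 'self'">
```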
I set up a purposely vulnerable form on my website to test some XSS vectors.
Then I thought about href XSS: if : is filtered, how would XSS be possible? You'd have to insert javascript:alert(1) like this: <a href="javascript:alert(1)">. Say %3A isn't allowed either.
Thanks to anyone who can help me on this.
To answer your question literally: if only <, >, :, % and quotes are filtered, then you can still do
javascript&#58;
http://jsfiddle.net/zaN9m/
Please don't take this as a hint to just filter out this extra sequence too. This is exactly the game where you are always one step behind, and you will not be able to allow harmless input if you take the "filter all bad chars out" approach.
Consider what happens when you start looking out for &#58; and I pass javascript&:#:5:8:;. The filter will not detect &#58; there, but it will remove the colons, and the result after the colons are filtered out will be javascript&#58;, and it's checkmate again.
What you want to do is check whether the URL has a scheme and match it against known-good schemes like http and https. You will then also have to apply HTML encoding to the result. This is bulletproof, and you don't have to play games.
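A sketch of that scheme check using the standard URL constructor (the function name and base URL are illustrative, not from the original answer):

```javascript
// Allow only http/https URLs; everything else (javascript:, data:,
// vbscript:, ...) is rejected. Relative URLs resolve against the base
// and therefore inherit its (safe) scheme.
function isSafeUrl(input) {
  let url;
  try {
    url = new URL(input, "https://example.com/");
  } catch (e) {
    return false; // not parseable as a URL at all
  }
  return url.protocol === "http:" || url.protocol === "https:";
}
```

The URL parser also normalizes tricks like mixed case ("JaVaScRiPt:") before the comparison, so the check is not fooled by them.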
If I want to validate the input of a <textarea> and want it to contain, for example, only numerical values, but also want to give users the possibility to insert new lines, I can select the wanted characters with a JavaScript regex that includes the whitespace characters:
/[0-9\s]/
The question is: can a whitespace character be used to perform injection or XSS (even if I think this last option is impossible), or any other type of attack?
Thanks
/[0-9\s]/ should be a safe whitelist to use, I believe. You do need to ensure that it checks the entire input, though; I think you mean /^[0-9\s]*$/.
Also remember, of course, that you have to validate it server-side, not just in the browser. Attackers can easily bypass JavaScript validation code.
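A minimal sketch of that anchored, whole-input check (the function name is illustrative):

```javascript
// Anchored whitelist: the entire value must consist of digits and
// whitespace (including newlines); anything else fails the check.
function isDigitsAndWhitespace(input) {
  return /^[0-9\s]*$/.test(input);
}
```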
I have a Javascript bookmarklet that, when clicked on, redirects the user to a new webpage and supplies the URL of the old webpage as a parameter in the query string.
I'm running into a problem when the original webpage has a double hyphen in the URL (e.g. page--1--of--3.html). Stupid, I know; I can't control the original page. The JavaScript escape function I'm using does not escape the hyphen, and IIS 6 gives a "file not found" error if asked to serve resource.aspx?original=page--1--of--3.html.
Is there an alternative javascript escape function I can use? What is the best way to solve this problem? Does anybody know why IIS chokes on resource.aspx?original=page--1 and not page-1?
"escape" and "unescape" are deprecated precisely because it doesn't encode all the relevant characters. DO NOT USE ESCAPE OR UNESCAPE. use "encodeURIComponent" and "decodeURIComponent" instead. Supported in all but the oldest most decrepit browsers. It's really a huge shame this knowledge isn't much more common.
(see also encodeURI and decodeURI)
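A quick illustration of the difference (outputs assume a standard JS engine):

```javascript
// escape() leaves "+" alone and mangles non-ASCII into non-standard
// Latin-1 / %uXXXX forms; encodeURIComponent() produces proper UTF-8
// percent-encoding.
escape("a+b");              // "a+b"
encodeURIComponent("a+b");  // "a%2Bb"
escape("café");             // "caf%E9"
encodeURIComponent("café"); // "caf%C3%A9"
```

Neither function encodes hyphens, though.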
Edit: err, just tested, but this still doesn't cover the double hyphens. Sorry.
Can you expand the escape function with some custom logic to encode the hyphens manually?
resource.aspx?original=page%2d%2d1%2d%2dof%2d%2d3.html
Something like this:
function customEscape(url) {
    // encodeURIComponent replaces the deprecated escape(), but it
    // still leaves hyphens alone, so encode those manually afterwards.
    url = encodeURIComponent(url);
    url = url.replace(/-/g, '%2d');
    return url;
}
location.href = customEscape("resource.axd?original=test--page.html");
Update, for a bookmarklet:
Link
You're doing something else wrong. -- is legal in URLs and filenames. Maybe the file really isn't found?
-- is used to comment out text in a few scripting languages; SQL Server uses it to add comments. Do you use any database logic to store those filenames, or create queries where this name is part of the query string instead of using query parameters?