I have a signup form where users can enter their subdomain of choice when creating an account.
http://_________.ourapp.com
I want them to be able to enter valid characters on the ____________________ part above only. I'm using a text field for that.
Is there a function or some sort of pattern that exists for such situations? Spaces should be filtered, I guess many or all special characters (except numbers, dash and letters) as well?
You can use regular expressions to achieve what you need.
Try something like this:
<input id="username" type="text" onblur="validSubdomain()" />
function validSubdomain() {
var re = /[^a-zA-Z0-9\-]/;
var val = document.getElementById("username").value;
if(re.test(val)) {
alert("invalid");
}
}
Try if(subdomainName.match(/^[a-z0-9][a-z0-9\-]*[a-z0-9]$/)) {...what to do if valid here...} else {...invalid handling here...} - I reckon that ought to work.
Javascript 1.2 and later supports regular expressions. That's practically every browser these days.
Using your example of "numbers dashes and letters" as being acceptable subdomains, you could do something similar to the following, probably run when the "submit" button on the form is clicked (and if the match fails, then cancel the submission).
entry.match(/^[a-zA-Z0-9\-]+$/)
Without more concrete information I really can't give you a full example, but this should get you where you need to go. Of course, keep in mind that javascript validation is not complete for a robust website. You need to re-check this on the server side to protect against people that have javascript disabled (or, in the worst case, malicious users).
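Since the question also asks about filtering, here is a minimal sketch of stripping disallowed characters before validating. The name sanitizeSubdomain is hypothetical, and lowercasing the input is an assumption rather than something the question requires.

```javascript
// Hypothetical helper: keep only letters, digits and dashes.
function sanitizeSubdomain(raw) {
  return raw
    .toLowerCase()               // assumption: subdomains are stored lowercase
    .replace(/[^a-z0-9-]/g, "")  // drop spaces and special characters
    .replace(/^-+|-+$/g, "");    // remove leading and trailing dashes
}

console.log(sanitizeSubdomain("  My Shop! "));     // "myshop"
console.log(sanitizeSubdomain("-ben--franklin-")); // "ben--franklin"
```

The result still needs to be checked on the server, as noted above.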
For jQuery validation, follow these steps.
First, add the method as a jQuery validation rule:
$.validator.addMethod("subdomainV", function(value, element) {
var regex = new RegExp("^[a-zA-Z]+[a-zA-Z0-9\\-]*$");
return regex.test(value);
}, "Please provide proper subdomain name");
Then apply the added method to the required field:
subdomain : {
required: true,
subdomainV: true /*** New Rule Applied */
}
So your question is, "What rules do a valid internet domain name follow?"
The answer to that is:
it can only contain:
the 26 letters of the English alphabet (case-insensitive)
numbers (0-9)
hyphen/minus sign (-)
it must start and end with a letter or number, not a hyphen;
the labels must be between 1 and 63 characters long;
the entire hostname cannot exceed 255 characters
A domain name is comprised of multiple labels, each separated by a period. A direct subdomain of ourapp.com would be ben.ourapp.com, where ben, ourapp and com are each labels. But you may also optionally allow users to include periods inside of their subdomain, e.g.:
ben.franklin.ourapp.com
i.have.a.clever.vho.st
In those cases, you could allow the user's child domain to be longer than 63 characters (up to 63 characters per label), with a max size of 244 (.ourapp.com is 11 characters long).
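The rules above can be sketched as a pair of checks. The helper names are hypothetical, and the 255-character limit is taken from the list above; treat this as an illustration rather than a complete hostname validator.

```javascript
// One label: 1 to 63 characters, letters/digits/hyphens,
// starting and ending with a letter or digit.
var LABEL_RE = /^[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?$/i;

function isValidLabel(label) {
  return LABEL_RE.test(label);
}

// parentDomain includes the leading dot, e.g. ".ourapp.com".
function isValidChildDomain(child, parentDomain) {
  var maxLength = 255 - parentDomain.length; // 244 for ".ourapp.com"
  if (child.length === 0 || child.length > maxLength) return false;
  // Every period-separated label must be valid on its own.
  return child.split(".").every(isValidLabel);
}

console.log(isValidLabel("ben"));                                // true
console.log(isValidLabel("-ben"));                               // false
console.log(isValidChildDomain("ben.franklin", ".ourapp.com"));  // true
console.log(isValidChildDomain("ben..franklin", ".ourapp.com")); // false
```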
See this Wikipedia article for more info on valid hostnames.
Edit: If you want to support internationalized domain names, things get a bit more complex, though still manageable.
I am dynamically rendering multiple email addresses (mailto:) on a webpage.
I obviously need to hide these from spam bots.
The simplest solution that I found is this:
link
This involves putting fake characters ("X") within the email address and then removing them once the link is clicked, copied or pasted.
It works. However, the drawback is that it removes all "x"s from the address. Since I cannot guarantee that my dynamically rendered emails will not contain "x", this solution, as is, is not right for me.
A better solution would be to put 3 or more 'X' at the start/end of each email address and then use the above code to remove them once the link is clicked,
i.e.:
<a href="mailto:XXXcontact#domain.comXXX"
onmouseover="this.href=this.href.replace(/x/g,'');">link</a>
What I now need to do is use a regular expression to then remove the first 3 'x' from the email address when it's clicked.
I tried the below but it did not work:
<a href="mailto:xxxcontact#domain.comXXX"
onmouseover="this.href=this.href.replace(^[\s\S]{0,3});">link</a>
The replace method expects two parameters - first the regex you're matching against, and second the value you want to replace matches with. It is also expected that your regex pattern will have flags to explain the behaviour of matches. For instance, g will match over the string it is operating on, globally, and i will match in a case-insensitive manner.
The regex you're after here would probably be more along the lines of:
^(mailto\:)x{3}(.*)x{3}$
That is, you're aiming to capture mailto:, which is expected at the beginning of the string, then to discard 3 x or X chars, followed by capturing the email address, but not the 3 x or X chars that are expected at the end of the string.
This would fit into the replace method in the following manner:
.replace(/^(mailto\:)x{3}(.*)x{3}$/i, '$1$2')
That said, would it not be fair to say that an email address could be inclined to include x or X characters consecutively? If so, you should either replace each occurrence of x{3} and the corresponding matches that you're prepending/appending to the email address with something less likely to be contained in an email address, or devise an alternative approach to the problem.
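As a quick check of the replace call above, here is a small sketch. The wrapper name stripPadding is hypothetical, and the sample addresses follow the question's placeholder format.

```javascript
// Hypothetical wrapper around the replace call shown above.
function stripPadding(href) {
  // $1 is "mailto:", $2 is everything between the two x{3} pads.
  return href.replace(/^(mailto\:)x{3}(.*)x{3}$/i, "$1$2");
}

console.log(stripPadding("mailto:XXXcontact#domain.comXXX")); // "mailto:contact#domain.com"
console.log(stripPadding("mailto:xxxmaxx#domain.comXXX"));    // "mailto:maxx#domain.com"
```

The second call shows that an "x" inside the address survives, because only the leading and trailing three are discarded.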
You could try something along the lines of
link
It would basically replace the occurrences of ^$^ instead of something as common as X or XXX.
I would avoid adding more or less common characters in your mail address for obfuscation purposes. Rather try some kind of very basic encryption, such as toggling the bits or taking the string char by char, and increasing the char code by a fixed value.
Example:
var mailto = "mailto:contact#domain.com";
var obfuscated = "";
for (let i = 0; i < mailto.length; i++) {
obfuscated += String.fromCharCode(mailto.charCodeAt(i) + 7);
}
//obfuscated now looks like this: "thps{vAjvu{hj{*kvthpu5jvt"
//to reverse the process, do the same thing and subtract 7.
//You could extract the code to a method that you simply call with "onmouseover"
Hope this helps, despite not precisely answering your question :)
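For completeness, a minimal sketch of extracting the shift into one function that works in both directions. The name shiftChars is hypothetical; the offset of 7 is the one used above.

```javascript
// Shift every character code by a fixed offset (negative to reverse).
function shiftChars(text, offset) {
  var out = "";
  for (var i = 0; i < text.length; i++) {
    out += String.fromCharCode(text.charCodeAt(i) + offset);
  }
  return out;
}

var obfuscated = shiftChars("mailto:contact#domain.com", 7);
var restored = shiftChars(obfuscated, -7);
console.log(restored === "mailto:contact#domain.com"); // true
```

In the page you would render the obfuscated string and call shiftChars(this.href, -7) from the onmouseover handler.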
Suppose that I have this regular expression: /abcd/
Suppose that I want to check the user input against that regex and disallow entering invalid characters in the input. When the user inputs "ab", it fails as a match for the regex, but I can't disallow entering "a" and then "b", as the user can't enter all 4 characters at once (except for copy/paste). So what I need here is a partial match, which checks whether an incomplete string can potentially be a match for a regex.
Java has something for this purpose: .hitEnd() (described here http://glaforge.appspot.com/article/incomplete-string-regex-matching). Python doesn't do it natively but has this package that does the job: https://pypi.python.org/pypi/regex.
I didn't find any solution for it in js. It's been asked years ago: Javascript RegEx partial match
and even before that: Check if string is a prefix of a Javascript RegExp
P.S. regex is custom, suppose that the user enters the regex herself and then tries to enter a text that matches that regex. The solution should be a general solution that works for regexes entered at runtime.
Looks like you're lucky, I've already implemented that stuff in JS (which works for most patterns - maybe that'll be enough for you). See my answer here. You'll also find a working demo there.
There's no need to duplicate the full code here, I'll just state the overall process:
Parse the input regex, and perform some replacements. There's no need for error handling as you can't have an invalid pattern in a RegExp object in JS.
Replace abc with (?:a|$)(?:b|$)(?:c|$)
Do the same for any "atoms". For instance, a character group [a-c] would become (?:[a-c]|$)
Keep anchors as-is
Keep negative lookaheads as-is
Had JavaScript had more advanced regex features, this transformation may not have been possible. But with its limited feature set, it can handle most input regexes. It will yield incorrect results on regexes with backreferences, though, if your input string ends in the middle of a backreference match (like matching ^(\w+)\s+\1$ against hello hel).
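To make the first replacement step concrete, here is a minimal sketch that handles plain literal text only. It is not the code from the linked answer: the helper names are hypothetical, and anchoring the result with ^ and $ is an assumption (the whole input is treated as a prefix of the literal).

```javascript
// Escape regex metacharacters so each character is matched literally.
function escapeRegExp(ch) {
  return ch.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: turns "abc" into ^(?:a|$)(?:b|$)(?:c|$)$
function partialRegexForLiteral(literal) {
  var body = literal.split("").map(function (ch) {
    return "(?:" + escapeRegExp(ch) + "|$)";
  }).join("");
  return new RegExp("^" + body + "$");
}

var partial = partialRegexForLiteral("abcd");
console.log(partial.test("ab"));   // true, "ab" can still become "abcd"
console.log(partial.test("abx"));  // false
console.log(partial.test("abcd")); // true
```

Once an atom falls back to $, every later atom has to use $ as well, which is why only prefixes are accepted.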
As many have stated, there is no standard library. Fortunately, I have written a JavaScript implementation that does exactly what you require. With some minor limitations, it works for regular expressions supported by JavaScript.
see: incr-regex-package.
Further, there is also a React component that uses this capability to provide some useful features:
Check input as you type
Auto complete where possible
Make suggestions for possible input values
Demo of the capabilities Demo of use
I think that you have to have 2 regexes: one for typing, /^a?b?c?d?$/, and one for testing at the end, on paste or when leaving the input, /^abcd$/.
This will test for valid phone number:
const input = document.getElementById('input')
let oldVal = ''
input.addEventListener('keyup', e => {
if (/^\d{0,3}-?\d{0,3}-?\d{0,3}$/.test(e.target.value)){
oldVal = e.target.value
} else {
e.target.value = oldVal
}
})
input.addEventListener('blur', e => {
console.log(/^\d{3}-?\d{3}-?\d{3}-?$/.test(e.target.value) ? 'valid' : 'not valid')
})
<input id="input">
And this is the case for a first name and surname:
const input = document.getElementById('input')
let oldVal = ''
input.addEventListener('keyup', e => {
if (/^[A-Z]?[a-z]*\s*[A-Z]?[a-z]*$/.test(e.target.value)){
oldVal = e.target.value
} else {
e.target.value = oldVal
}
})
input.addEventListener('blur', e => {
console.log(/^[A-Z][a-z]+\s+[A-Z][a-z]+$/.test(e.target.value) ? 'valid' : 'not valid')
})
<input id="input">
This is the hard solution for those who think there's no solution at all: implement the Python version (https://bitbucket.org/mrabarnett/mrab-regex/src/4600a157989dc1671e4415ebe57aac53cfda2d8a/regex_3/regex/_regex.c?at=default&fileviewer=file-view-default) in JS. So it is possible. If someone has a simpler answer, they'll win the bounty.
Example using python module (regular expression with back reference):
$ pip install regex
$ python
>>> import regex
>>> regex.Regex(r'^(\w+)\s+\1$').fullmatch('abcd ab',partial=True)
<regex.Match object; span=(0, 7), match='abcd ab', partial=True>
You guys would probably find this page of interest:
(https://github.com/desertnet/pcre)
It was a valiant effort: make a WebAssembly implementation that would support PCRE. I'm still playing with it, but I suspect it's not practical. The WebAssembly binary weighs in at ~300K; and if your JS terminates unexpectedly, you can end up not destroying the module, and consequently leaking significant memory.
The bottom line is: this is clearly something the ECMAScript people should be formalizing, and browser manufacturers should be furnishing (kudos to the WebAssembly developer for possibly shaming them to get on the stick...)
I recently tried using the "pattern" attribute of an input[type='text'] element. I, like so many others, found it to be a letdown that it would not validate until a form was submitted. So a person would be wasting their time typing (or pasting...) numerous characters and jumping on to other fields, only to find out after a form submit that they had entered that field wrong. Ideally, I wanted it to validate field input immediately, as the user types each key (or at the time of a paste...)
The trick to doing a partial regex match (until the ECMAscript people and browser makers get it together with PCRE...) is to not only specify a pattern regex, but associated template value(s) as a data attribute. If your field input is shorter than the pattern (or input.maxLength...), it can use them as a suffix for validation purposes. YES -this will not be practical for regexes with complex case outcomes; but for fixed-position template pattern matching -which is USUALLY what is needed- it's fine (if you happen to need something more complex, you can build on the methods shown in my code...)
The example is for a bitcoin address [ Do I have your attention now? -OK, not the people who don't believe in digital currency tech... ] The key JS function that gets this done is validatePattern. The input element in the HTML markup would be specified like this:
<input id="forward_address"
name="forward_address"
type="text"
maxlength="90"
pattern="^(bc(0([ac-hj-np-z02-9]{39}|[ac-hj-np-z02-9]{59})|1[ac-hj-np-z02-9]{8,87})|[13][a-km-zA-HJ-NP-Z1-9]{25,34})$"
data-entry-templates="['bc099999999999999999999999999999999999999999999999999999999999','bc1999999999999999999999999999999999999999999999999999999999999999999999999999999999999999','19999999999999999999999999999999999']"
onkeydown="return validatePattern(event)"
onpaste="return validatePattern(event)"
required
/>
[Credit goes to this post: "RegEx to match Bitcoin addresses?" Note to old-school bitcoin zealots who will decry the use of a zero in the regex here -it's just an example for accomplishing PRELIMINARY validation; the server accepting the address passed off by the browser can do an RPC call after a form post, to validate it much more rigorously. Adjust your regex to suit.]
The exact choice of characters in the data-entry-templates was a bit arbitrary; but they had to be ones such that if the input being typed or pasted by the user is still incomplete in length, it will use them as an optimistic stand-in and the input so far will still be considered valid. In the example there, the last of the data-entry-templates ('19999999999999999999999999999999999') is a "1" followed by as many nines as the regex spec "{25,34}" allows for the second character span/group. Because there were two forms to expect -the "bc" prefix and the older "1"/"3" prefix- I furnished a few stand-in templates for the validator to try (if it passes just one of them, it validates...) In each template case, I furnished the longest possible pattern, so as to ensure the most permissive possibility in terms of length.
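The padding idea can be sketched in a few lines. This is not the code from validatepattern.js; the function name isPlausiblePrefix is hypothetical, and a short made-up pattern is used so the behaviour is easy to follow.

```javascript
// Hypothetical sketch: pad the input with the tail of each template
// and accept it if any padded candidate matches the full pattern.
function isPlausiblePrefix(value, pattern, templates) {
  return templates.some(function (template) {
    // Fill the part not yet typed with the template's characters.
    var candidate = value + template.slice(value.length);
    return pattern.test(candidate);
  });
}

var pattern = /^\d{3}-\d{4}$/; // illustrative pattern, no g flag
var templates = ["999-9999"];  // longest acceptable stand-in

console.log(isPlausiblePrefix("12", pattern, templates));     // true
console.log(isPlausiblePrefix("12a", pattern, templates));    // false
console.log(isPlausiblePrefix("123-45", pattern, templates)); // true
```

As the answer says, this only suits fixed-position patterns, where a character typed at position n is judged against position n of the template.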
If you were generating this markup on a dynamic web content server, an example with template variables (a la django...) would be:
<input id="forward_address"
name="forward_address"
type="text"
maxlength="{{MAX_BTC_ADDRESS_LENGTH}}"
pattern="{{BTC_ADDRESS_REGEX}}" {# base58... #}
data-entry-templates="{{BTC_ADDRESS_TEMPLATES}}" {# base58... #}
onkeydown="return validatePattern(event)"
onpaste="return validatePattern(event)"
required
/>
[Keep in mind: I went to the deeper end of the pool here. You could just as well use this for simpler patterns of validation.]
And if you prefer to not use event attributes, but to transparently hook the function to the element's events at document load -knock yourself out.
You will note that we need to specify validatePattern on two events:
The keydown, to intercept delete and backspace keys.
The paste (the clipboard is pasted into the field's value, and if it works, it accepts it as valid; if not, the paste does not transpire...)
Of course, I also took into account when text is partially selected in the field, dictating that a key entry or pasted text will replace the selected text.
And here's a link to the [dependency-free] code that does the magic:
https://gitlab.com/osfda/validatepattern.js
(If it happens to generate interest, I'll integrate constructive and practical suggestions and give it a better readme...)
PS: The incremental-regex package posted above by Lucas Trzesniewski:
Appears not to have been updated? (I saw signs that it was undergoing modification??)
Is not browserified (tried doing that to it, to kick the tires on it -it was a module mess; welcome anyone else here to post a browserified version for testing. If it works, I'll integrate it with my input validation hooks and offer it as an alternative solution...) If you succeed in getting it browserified, maybe sharing the exact steps that were needed would also edify everyone on this post. I tried using the esm package to fix version incompatibilities faced by browserify, but it was no go...
I strongly suspect (although I'm not 100% sure) that the general case of this problem has no solution, in the same way as Turing's famous "Halting problem" (see Undecidable problem). And even if there is a solution, it most probably will not be what users actually want, and thus, depending on your strictness, will result in a bad-to-horrible UX.
Example:
Assume "target RegEx" is [a,b]*c[a,b]* also assume that you produced a reasonable at first glance "test RegEx" [a,b]*c?[a,b]* (obviously two c in the string is invalid, yeah?) and assume that the current user input is aabcbb but there is a typo because what the user actually wanted is aacbbb. There are many possible ways to fix this typo:
remove c and add it before first b - will work OK
remove first b and add after c - will work OK
add c before first b and then remove the old one - Oops, we prohibit this input as invalid and the user will go crazy because no normal human can understand such a logic.
Note also that your hitEnd will have the same problem here unless you prohibit the user from entering characters in the middle of the input box, which would be another way to make a horrible UI.
In real life there would be many much more complicated examples that none of your smart heuristics will be able to account for properly, and they will upset users.
So what to do? I think the only thing you can do and still get a reasonable UX is the simplest thing you can do, i.e. just analyze your "target RegEx" for the set of allowed characters and make your "test RegEx" [set of allowed chars]*. And yes, if the "target RegEx" contains the . wildcard, you will not be able to do any reasonable filtering at all.
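A minimal sketch of that last step, assuming the set of allowed characters has already been worked out by hand (extracting it automatically from an arbitrary regex is not shown). The helper name charFilterRegex is hypothetical.

```javascript
// Hypothetical helper: build ^[allowed]*$ from a string of allowed characters.
function charFilterRegex(allowedChars) {
  // Escape the characters that are special inside a character class.
  var escaped = allowedChars.replace(/[\\\]^-]/g, "\\$&");
  return new RegExp("^[" + escaped + "]*$");
}

// For the target [a,b]*c[a,b]* the allowed characters are a, b, "," and c.
var testRegex = charFilterRegex("ab,c");
console.log(testRegex.test("aabcbb"));  // true
console.log(testRegex.test("aabccbb")); // true, intermediate state while fixing a typo
console.log(testRegex.test("aabd"));    // false
```

The full target regex would then only be applied when the user leaves the field or submits.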
I have the following regex for phone number validation
function validatePhonenumber(phoneNum) {
var regex = /^[1-9]{3}[-\s\.]{0,1}[0-9]{3}[-\s\.]{0,1}[0-9]{4}$/;
return regex.test(phoneNum);
}
However, I would like to make sure it doesn't pass for different separators, such as in
111-222.3333
Any ideas how to make sure the separators are the same always?
Just make sure beforehand that there is at most one kind of separator, then pass the string through the regex as you were doing.
function validatePhonenumber(phoneNum) {
var separators = extractSeparators(phoneNum);
if(separators.length > 1) return false;
var regex = /^[1-9]{3}[-\s\.]{0,1}[0-9]{3}[-\s\.]{0,1}[0-9]{4}$/;
return regex.test(phoneNum);
}
function extractSeparators(str){
// Return an array with all the distinct chars
// that are present in the passed string
// and are not numeric (0-9)
var nonDigits = str.replace(/[0-9]/g, "").split("");
return nonDigits.filter(function (ch, i) {
// keep only the first occurrence of each char
return nonDigits.indexOf(ch) === i;
});
}
You can use the following regex instead:
\d{3}([-\s\.])?\d{3}\1?\d{4}
Here is a working example:
http://regex101.com/r/nN9nT7/1
As a result, it matches as follows:
111-222-3333 --> ok
111.222.3333 --> ok
111 222 3333 --> ok
111-222.3333
111.222-3333
111-222 3333
111 222-3333
EDIT: after Alan Moore's suggestion:
Also matches 111-2223333. That's because you made the \1 optional,
which isn't necessary. One of JavaScript's stranger quirks is that a
backreference to a group that did not participate in the match,
succeeds anyway. So if there's no first separator, ([-\s.])? succeeds
because the ? made it optional, and \1 succeeds because it's
JavaScript. But I would have used ([-\s.]?) to capture the first
separator (which might be nothing), and \1 to match the same thing
again. This works in any flavor, including JavaScript.
We can improve the regex to:
^\d{3}([-\s\.]?)\d{3}\1\d{4}$
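A quick way to check the improved regex, using the sample numbers from above. The variable name phoneRe is just for illustration.

```javascript
var phoneRe = /^\d{3}([-\s\.]?)\d{3}\1\d{4}$/;

// The captured separator (possibly empty) must repeat via \1.
console.log(phoneRe.test("111-222-3333")); // true
console.log(phoneRe.test("1112223333"));   // true, both separators empty
console.log(phoneRe.test("111-222.3333")); // false, mixed separators
console.log(phoneRe.test("111-2223333"));  // false, second separator missing
```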
You'll need at least two passes to keep this maintainable and extensible.
JS' RegEx doesn't allow for creating variables for use later in the RegEx, if you want to support older browsers.
If you are only supporting modern browsers, Fede's answer is just fine...
As such, with ghetto-support, you aren't going to be able to reliably check that one separator is the same value every time, without writing a really, really, really, stupidly-long RegEx, using | to basically write out the RegEx 3 times.
A better way might be to grab all of the separators, and use a reduction or a filter to check that they all have the same value.
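var userEnteredNumber = "999.231 3055";
// numRegEx was not defined in the original snippet; this definition is an assumption
var numRegEx = /^[0-9]{3}[-\s.]?[0-9]{3}[-\s.]?[0-9]{3,4}$/;
var validNumber = numRegEx.test(userEnteredNumber);
// drop the digits, leaving one array entry per separator character
var separators = userEnteredNumber.replace(/\d+/g, "").split("");
var firstSeparator = separators[0];
var uniformSeparators = separators.every(function (separator) { return separator === firstSeparator; });
if (!uniformSeparators) { /* also not valid */ }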
You could make that a little neater, using closures and some applied functions, but that's the idea.
Alternatively, here's the big, ugly RegEx that would allow you to test exactly what the user entered.
var separatorTest = /^(?:([0-9]{3}\.[0-9]{3}\.[0-9]{3,4})|([0-9]{3}-[0-9]{3}-[0-9]{3,4})|([0-9]{3} [0-9]{3} [0-9]{3,4})|([0-9]{9,10}))$/;
Notice I had to include the exact same number-test three times, wrap each one in parens (to be treated as a single group), and then separate each group with an | to check each group, like an if, else if, else... ...and then plug in a separate special case for having no separator at all...
...not pretty.
I'm also not using \d, just because spelling out [0-9] keeps it obvious which characters are digits and which are separators (- and .) when trying to maintain one of these abominations.
Now, a word or two of warning:
People are liable to enter all kinds of crap; if this is for a commercial site, it's likely better to just strip separators entirely and validate the number is the right size, and conforms to some specifics (eg: doesn't start with /^555555/).
If not given any instruction about number format, people will happily use either no separator or a formal number, like (555) 555-5555 (or +1 (555) 555-5555 for the really pedantic), which is obviously going to fail hard, in this system (see point #1).
Be prepared to trim what you get, before validating.
Depending on your country/region/etc laws about data-security and consumer-vs-transaction record-keeping (again, may or may not be more important in a commercial setting), it's likely better to store both a "user-given" ugly number, and a system-usable number, which you either clean on the back-end, or submit along with the user-entered text.
From a user-interaction perspective, either forcing the number to conform, explicitly (placeholders showing them xxx-xxx-xxxx right above the input, in bold), or accepting any text, and prepping it yourself, is going to be 1000x better than accepting certain forms, but not bothering to tell the user up-front, and instead telling them what they did was wrong, after they try.
It's not cool for relationships; it's equally not cool, here.
You've got 9-digit and 10-digit numbers, so if you're trying for an international solution, be prepared to deal with all international separators (, \.\-\(\)\+) etc... again, why stripping is more useful, because THAT RegEx would be insane.
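A minimal sketch of the "strip, then validate" approach from the warnings above. The function name normalizePhone is hypothetical, the 9-to-10-digit length comes from the regex earlier in this answer, and the 555555 check is the example mentioned in the first point.

```javascript
// Hypothetical helper: keep the raw text, derive a digits-only form,
// and validate only the digits.
function normalizePhone(userInput) {
  var digits = userInput.replace(/\D/g, ""); // strip every separator
  return {
    raw: userInput.trim(), // what the user typed, trimmed
    digits: digits,        // system-usable number
    valid: /^[0-9]{9,10}$/.test(digits) && !/^555555/.test(digits)
  };
}

console.log(normalizePhone("(555) 123-4567").digits); // "5551234567"
console.log(normalizePhone("999.231 3055").valid);    // true
console.log(normalizePhone("555-555-5555").valid);    // false
```

Storing both raw and digits matches the record-keeping point above; international prefixes such as +1 are deliberately not handled here.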
I have an application to send a TAN to users via SMS. We already have an API to send SMS to a mobile phone number. Therefore, I have to make sure it's a correct mobile phone number. Below is my regex:
function validateMobile(str) {
var filter = /^\+?(\d[\d-. ]+)?(\([\d-. ]+\))?[\d-. ]+\d$/;
if (!filter.test(str)) {
alert('Please provide a valid mobile phone number');
return false;
}
return true;
}
It doesn't accept letters; only numbers and '+' are allowed. Users can enter, for example, +49176xxxxxxxx or 0176xxxxxxxx (no particular country code).
But this regex is seriously flawed. Users can enter any number, e.g. 1324567982548, and this regex also returns true. I thought about checking the length of the textbox; it would work for the time being, but it's still not a proper solution.
Is there a better regex or another way to check a mobile phone number more precisely?
Thanks in advance.
SOLVED
I solved this with a new regex:
var filter = /^(\+49\d{8,18}|0\d{9,18})((,\+49\d{8,18})|(,0\d{9,18}))*$/;
or as mzmm56 suggested below:
var filter = /^(?:(\+|00)(49)?)?0?1\d{2,3}\d{8}$/;
Both are equally fine.
I think that you may need to restrict the regex to the mobile number format of the country you're targeting, if possible, or check the input against a variety of patterns according to different countries' mobile number formats. It also seems like your regex would match . and - as well, instead of "only number and '+'".
Anyway, in Germany, I believe the following regex would work, only allowing a single + at the beginning, and then nothing but numbers:
^(?:(\+|00)(49)?)?0?1\d{2,3}\d{8}$
with 0?1\d{2,3} it's taking into account that German mobile numbers may or may not start with 0, begin with 1, and are followed by another 2 numbers (in your case 76), or 3 numbers (176) if there was no leading 0.
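A short check of that regex against the inputs from the question. The variable name germanMobile is only for illustration, and the digits are made up.

```javascript
var germanMobile = /^(?:(\+|00)(49)?)?0?1\d{2,3}\d{8}$/;

console.log(germanMobile.test("+4917612345678")); // true
console.log(germanMobile.test("017612345678"));   // true
console.log(germanMobile.test("1324567982548"));  // false, too many digits
```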
It might be easier to strip off all non-numeric characters (except + perhaps), then regex it; then, if you need to output it, just reformat it.
Here's a regex for the phone number after non-numeric characters have been stripped:
^\+[1-9]{1}[0-9]{10}$
For more detailed info on country codes, see this post.
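Put together, the strip-then-test idea might look like the sketch below. The function name is hypothetical; note that the regex above only accepts a + followed by exactly 11 digits, so other lengths would need a different pattern.

```javascript
// Hypothetical helper: strip formatting, then apply the regex above.
function isValidStrippedNumber(input) {
  var stripped = input.replace(/[^\d+]/g, ""); // keep digits and "+"
  return /^\+[1-9]{1}[0-9]{10}$/.test(stripped);
}

console.log(isValidStrippedNumber("+1 (555) 123-4567")); // true
console.log(isValidStrippedNumber("0176 12345678"));     // false, no leading "+"
```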
I want to make a JavaScript regular expression that checks for valid names.
minimum 2 chars (space can't count)
spaces and some special chars allowed (éàëä...)
I know how to write some of these separately but not combined.
If I use /^([A-Za-z éàë]{2,40})$/, the user could input 2 spaces as a name
If I use /^([A-Za-z]{2,40}[ éàë]{0,40})$/, the user must use 2 letters first and after using space or special char, can't use letters again.
Searched around a bit, but hard to formulate search string for my problem. Any ideas?
Please, please pretty please, don't do this. You will only end up upsetting people by telling them their name is not valid. Several examples of surnames that would be rejected by your scheme: O'Neill, Sørensen, Юдович, 李. Trying to cover all these cases and more is doomed to failure.
Just do something like this:
strip leading and trailing blanks
collapse consecutive blanks into one space
check if the result is not empty
In JavaScript, that would look like:
name = name.replace(/^\s+/, "").replace(/\s+$/, "").replace(/\s+/g, " ");
if (name == "") {
// show error
} else {
// valid: maybe put trimmed name back into form
}
Most solutions don't consider the many different names there might be. There can be names with only two characters, like Al or Bo, or someone who writes their name like F. Middlename Lastname.
This RegExp will validate most names but you can optimize it to whatever you want:
/^[a-z\u00C0-\u02AB'´`]+\.?\s([a-z\u00C0-\u02AB'´`]+\.?\s?)+$/i
This will allow:
Li Huang Wu
Cevahir Özgür
Yiğit Aydın
Finlay Þunor Boivin
Josué Mikko Norris
Tatiana Zlata Zdravkov
Ariadna Eliisabet O'Taidhg
sergej lisette rijnders
BRIANA NORMINA HAUPT
BihOtZ AmON PavLOv
Eoghan Murdo Stanek
Filimena J. Van Der Veen
D. Blair Wallace
But will not allow:
Shirley24
66Bryant Hunt88
http://stackoverflow.com
laoise_ibtihaj
hippolyte#example.com
Cy4n 4ur0r4 Blyth3 3ll1
Justisne
Danny
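A few of the names above run through the regex, as a quick sanity check. The variable name nameRe is only for illustration.

```javascript
var nameRe = /^[a-z\u00C0-\u02AB'´`]+\.?\s([a-z\u00C0-\u02AB'´`]+\.?\s?)+$/i;

console.log(nameRe.test("Li Huang Wu"));      // true
console.log(nameRe.test("D. Blair Wallace")); // true
console.log(nameRe.test("Shirley24"));        // false
console.log(nameRe.test("Danny"));            // false, a second word is required
```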
If the name needs to be capitalized, uppercase, lowercase, trimmed or single spaced, that's a task a formatter should do, not the user.
I would like to propose a RegEx that would match all latin based languages with their special characters:
/^([ áàíóúéëöüñÄĞİŞȘØøğışÐÝÞðýþA-Za-z-']*)$/
P.S. I've included all characters I could find, but please feel free to edit the answer in case I've missed any.
Why not
var reg= /^([A-Za-z]{2}[ éàëA-Za-z]*)$/;
2 letters, then as many spaces, letters or special characters as you want.
I wouldn't allow spaces in usernames though - it's begging for trouble when you have usernames like
ab ba
who's going to remember how many spaces they used?
You could do this:
/^([A-Za-zéàë]{2,40} ?)+$/
2-40 characters, and then optionally a space, repeated at least once. This will allow a space at the end, but you could trim it off separately.
After trimming the input value, the following will match your request, for Latin surnames only.
rn = new RegExp("([\\w\\u00C0-\\u02AB']+ ?)+","gi"); // backslashes doubled so they survive the string literal
m = ln.match(rn);
valid = (m && m.length)? true: false;
Note that I am using '+', instead of '{2,}', that is because some surnames uses just one letter in a separated word like "Ortega y Gasset"
You can see I am not using RegExp.test. That method is unreliable with the g flag: it is stateful and advances lastIndex between calls, so repeated calls on the same RegExp object can give different results.
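One likely cause of the flaky RegExp.test results mentioned above is the g flag, which makes test remember where the last match ended. A small demonstration:

```javascript
var re = /a/g;

var first = re.test("a");  // true, lastIndex moves to 1
var second = re.test("a"); // false, search starts at index 1, lastIndex resets to 0
var third = re.test("a");  // true again

console.log(first, second, third); // true false true
```

Dropping the g flag (or creating a fresh regex per call) avoids the problem.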
In my country, people from non-latin-language countries usually do some translation of their names so the previous RegExp would be enough. However, if you attempt to match any surname in the world, you may add more range of \u#### characters, avoiding to include symbols, numbers or other type. Or perhaps the xregexp library may help you.
And, please, do not forget to validate the input on the server side, and to escape it before using it in SQL statements (if you have them).