How can I use javascript regex to validate a city name?

var cityRegex = /^[a-zA-z] ?([a-zA-z]|[a-zA-z] )*[a-zA-z]$/; is what I tried.
But it errors when you type in a city like "St. Petersburg."
Update: Seems almost like a lost cause. Too many oddly-named cities out there with numbers, dashes, apostrophes, periods, etc.

If the comments don't make it clear enough, this is not something that can realistically be validated by regex. The correct thing to do here is just accept that there will be some bad data inputted and move along. If you really need the city to exist and you think that this javascript validation will help you, you are sorely mistaken.
In answer to your question, the correct validation here is:
.*
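A minimal sketch of what "accept it and move along" could look like in practice. The regex is the one from the question; the helper name isAcceptableCity and the 100-character cap are assumptions made for illustration, not something either post specifies.

```javascript
// The pattern from the question rejects perfectly real city names.
const cityRegex = /^[a-zA-z] ?([a-zA-z]|[a-zA-z] )*[a-zA-z]$/;

// Hypothetical permissive check: any non-empty text up to an assumed length cap.
function isAcceptableCity(value) {
  const trimmed = value.trim();
  return trimmed.length > 0 && trimmed.length <= 100;
}

console.log(cityRegex.test('St. Petersburg'));    // false: the period is not allowed
console.log(isAcceptableCity('St. Petersburg'));  // true
console.log(isAcceptableCity('   '));             // false: nothing was entered
```

Whether the city actually exists would still have to be checked elsewhere, if it matters at all.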

Related

Partial matching a string against a regex

Suppose that I have this regular expression: /abcd/
Suppose that I want to check the user input against that regex and disallow entering invalid characters in the input. When the user inputs "ab", it fails as a match for the regex, but I can't disallow entering "a" and then "b", as the user can't enter all 4 characters at once (except for copy/paste). So what I need here is a partial match, which checks whether an incomplete string can potentially become a match for a regex.
Java has something for this purpose: .hitEnd() (described here http://glaforge.appspot.com/article/incomplete-string-regex-matching). Python doesn't do it natively but has this package that does the job: https://pypi.python.org/pypi/regex.
I didn't find any solution for it in js. It's been asked years ago: Javascript RegEx partial match
and even before that: Check if string is a prefix of a Javascript RegExp
P.S. regex is custom, suppose that the user enters the regex herself and then tries to enter a text that matches that regex. The solution should be a general solution that works for regexes entered at runtime.

Looks like you're lucky, I've already implemented that stuff in JS (which works for most patterns - maybe that'll be enough for you). See my answer here. You'll also find a working demo there.
There's no need to duplicate the full code here, I'll just state the overall process:
Parse the input regex, and perform some replacements. There's no need for error handling as you can't have an invalid pattern in a RegExp object in JS.
Replace abc with (?:a|$)(?:b|$)(?:c|$)
Do the same for any "atoms". For instance, a character group [a-c] would become (?:[a-c]|$)
Keep anchors as-is
Keep negative lookaheads as-is
Had JavaScript had more advanced regex features, this transformation might not have been possible. But with its limited feature set, it can handle most input regexes. It will yield incorrect results on a regex with backreferences, though, if your input string ends in the middle of a backreference match (like matching ^(\w+)\s+\1$ against hello hel).
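To make the abc to (?:a|$)(?:b|$)(?:c|$) step concrete, here is a small sketch that handles only the simplest case: a pattern that is a plain literal string. It is not the code from the linked answer; the function names are made up for this example, and character groups, quantifiers, anchors and lookaheads are deliberately left out.

```javascript
// Escape regex metacharacters so that each character is matched literally.
function escapeRegexChar(ch) {
  return ch.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Turn a literal such as "abcd" into ^(?:a|$)(?:b|$)(?:c|$)(?:d|$)$
// so that every prefix of the literal matches, and nothing else does.
function partialRegexForLiteral(literal) {
  const atoms = Array.from(literal, ch => `(?:${escapeRegexChar(ch)}|$)`);
  return new RegExp(`^${atoms.join('')}$`);
}

const partial = partialRegexForLiteral('abcd');
console.log(partial.test('ab'));    // true: could still become "abcd"
console.log(partial.test('abcd'));  // true: complete match
console.log(partial.test('abx'));   // false: can never match
```

Once the input ends, every remaining atom falls through to its $ alternative, which is why a prefix is accepted.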

As many have stated, there is no standard library for this. Fortunately, I have written a JavaScript implementation that does exactly what you require. With some minor limitations, it works for the regular expressions supported by JavaScript.
See: incr-regex-package.
Further there is also a react component that uses this capability to provide some useful capabilities:
Check input as you type
Auto complete where possible
Make suggestions for possible input values
Demo of the capabilities Demo of use

I think that you have to have two regexes: one for typing, /a?b?c?d?/, and one for testing at the end, on paste or when leaving the input, /abcd/.
This will test for a valid phone number:
const input = document.getElementById('input')
let oldVal = ''
// While typing: test against the lenient pattern and restore the last accepted value on failure.
input.addEventListener('keyup', e => {
  if (/^\d{0,3}-?\d{0,3}-?\d{0,3}$/.test(e.target.value)) {
    oldVal = e.target.value
  } else {
    e.target.value = oldVal
  }
})
// On leaving the field: test against the strict pattern.
input.addEventListener('blur', e => {
  console.log(/^\d{3}-?\d{3}-?\d{3}-?$/.test(e.target.value) ? 'valid' : 'not valid')
})
<input id="input">
And this is the case for a first name and surname:
const input = document.getElementById('input')
let oldVal = ''
// While typing: test against the lenient pattern and restore the last accepted value on failure.
input.addEventListener('keyup', e => {
  if (/^[A-Z]?[a-z]*\s*[A-Z]?[a-z]*$/.test(e.target.value)) {
    oldVal = e.target.value
  } else {
    e.target.value = oldVal
  }
})
// On leaving the field: test against the strict pattern.
input.addEventListener('blur', e => {
  console.log(/^[A-Z][a-z]+\s+[A-Z][a-z]+$/.test(e.target.value) ? 'valid' : 'not valid')
})
<input id="input">

This is the hard solution for those who think there's no solution at all: implement the Python version (https://bitbucket.org/mrabarnett/mrab-regex/src/4600a157989dc1671e4415ebe57aac53cfda2d8a/regex_3/regex/_regex.c?at=default&fileviewer=file-view-default) in JS. So it is possible. If someone has a simpler answer, they'll win the bounty.
Example using python module (regular expression with back reference):
$ pip install regex
$ python
>>> import regex
>>> regex.Regex(r'^(\w+)\s+\1$').fullmatch('abcd ab',partial=True)
<regex.Match object; span=(0, 7), match='abcd ab', partial=True>

You guys would probably find this page of interest:
(https://github.com/desertnet/pcre)
It was a valiant effort: make a WebAssembly implementation that would support PCRE. I'm still playing with it, but I suspect it's not practical. The WebAssembly binary weighs in at ~300K; and if your JS terminates unexpectedly, you can end up not destroying the module, and consequently leaking significant memory.
The bottom line is: this is clearly something the ECMAScript people should be formalizing, and browser manufacturers should be furnishing (kudos to the WebAssembly developer for possibly shaming them into getting on the stick...)
I recently tried using the "pattern" attribute of an input[type='text'] element. I, like so many others, found it to be a letdown that it would not validate until a form was submitted. So a person would be wasting their time typing (or pasting...) numerous characters and jumping on to other fields, only to find out after a form submit that they had entered that field wrong. Ideally, I wanted it to validate field input immediately, as the user types each key (or at the time of a paste...)
The trick to doing a partial regex match (until the ECMAscript people and browser makers get it together with PCRE...) is to not only specify a pattern regex, but associated template value(s) as a data attribute. If your field input is shorter than the pattern (or input.maxLength...), it can use them as a suffix for validation purposes. YES -this will not be practical for regexes with complex case outcomes; but for fixed-position template pattern matching -which is USUALLY what is needed- it's fine (if you happen to need something more complex, you can build on the methods shown in my code...)
The example is for a bitcoin address [ Do I have your attention now? -OK, not the people who don't believe in digital currency tech... ] The key JS function that gets this done is validatePattern. The input element in the HTML markup would be specified like this:
<input id="forward_address"
name="forward_address"
type="text"
maxlength="90"
pattern="^(bc(0([ac-hj-np-z02-9]{39}|[ac-hj-np-z02-9]{59})|1[ac-hj-np-z02-9]{8,87})|[13][a-km-zA-HJ-NP-Z1-9]{25,34})$"
data-entry-templates="['bc099999999999999999999999999999999999999999999999999999999999','bc1999999999999999999999999999999999999999999999999999999999999999999999999999999999999999','19999999999999999999999999999999999']"
onkeydown="return validatePattern(event)"
onpaste="return validatePattern(event)"
required
/>
[Credit goes to this post: "RegEx to match Bitcoin addresses?" Note to old-school bitcoin zealots who will decry the use of a zero in the regex here -it's just an example for accomplishing PRELIMINARY validation; the server accepting the address passed off by the browser can do an RPC call after a form post, to validate it much more rigorously. Adjust your regex to suit.]
The exact choice of characters in the data-entry-template was a bit arbitrary; but they had to be ones such that if the input being typed or pasted by the user is still incomplete in length, it will use them as an optimistic stand-in and the input so far will still be considered valid. In the example there, for the last of the data-entry-templates ('19999999999999999999999999999999999'), that was a "1" followed by 34 nines (seeing as how the regex spec "{25,34}" dictates a maximum of 34 characters in the second character span/group...) Because there were two forms to expect -the "bc" prefix and the older "1"/"3" prefix- I furnished a few stand-in templates for the validator to try (if it passes just one of them, it validates...) In each template case, I furnished the longest possible pattern, so as to ensure the most permissive possibility in terms of length.
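The stand-in idea can be sketched in a few lines. This is not the validatePattern function from the linked repository; isValidSoFar is a hypothetical helper written only to show the padding step, and the date pattern and its single template are assumptions chosen to keep the example short.

```javascript
// Pad the partial input with the tail of each template and test the result
// against the full pattern; the input is fine so far if any template passes.
function isValidSoFar(value, pattern, templates) {
  return templates.some(template => {
    if (value.length > template.length) return false; // longer than the longest allowed form
    const candidate = value + template.slice(value.length);
    return pattern.test(candidate);
  });
}

const datePattern = /^\d{4}-\d{2}-\d{2}$/;
const dateTemplates = ['9999-99-99'];

console.log(isValidSoFar('2024-0', datePattern, dateTemplates)); // true: "2024-09-99" matches
console.log(isValidSoFar('20a', datePattern, dateTemplates));    // false: "20a9-99-99" does not
```

As the post says, this only suits fixed-position patterns, where a single optimistic suffix can stand in for whatever has not been typed yet.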
If you were generating this markup on a dynamic web content server, an example with template variables (a la django...) would be:
<input id="forward_address"
name="forward_address"
type="text"
maxlength="{{MAX_BTC_ADDRESS_LENGTH}}"
pattern="{{BTC_ADDRESS_REGEX}}" {# base58... #}
data-entry-templates="{{BTC_ADDRESS_TEMPLATES}}" {# base58... #}
onkeydown="return validatePattern(event)"
onpaste="return validatePattern(event)"
required
/>
[Keep in mind: I went to the deeper end of the pool here. You could just as well use this for simpler patterns of validation.]
And if you prefer to not use event attributes, but to transparently hook the function to the element's events at document load -knock yourself out.
You will note that we need to specify validatePattern on two events:
The keydown, to intercept delete and backspace keys.
The paste (the clipboard is pasted into the field's value, and if it works, it accepts it as valid; if not, the paste does not transpire...)
Of course, I also took into account when text is partially selected in the field, dictating that a key entry or pasted text will replace the selected text.
And here's a link to the [dependency-free] code that does the magic:
https://gitlab.com/osfda/validatepattern.js
(If it happens to generate interest, I'll integrate constructive and practical suggestions and give it a better readme...)
PS: The incremental-regex package posted above by Lucas Trzesniewski:
Appears not to have been updated? (I saw signs that it was undergoing modification??)
Is not browserified (tried doing that to it, to kick the tires on it -it was a module mess; welcome anyone else here to post a browserified version for testing. If it works, I'll integrate it with my input validation hooks and offer it as an alternative solution...) If you succeed in getting it browserfied, maybe sharing the exact steps that were needed would also edify everyone on this post. I tried using the esm package to fix version incompatibilities faced by browserify, but it was no go...

I strongly suspect (although I'm not 100% sure) that the general case of this problem has no solution, in the same way as Turing's famous "Halting problem" (see Undecidable problem). And even if there is a solution, it most probably will not be what users actually want, and thus, depending on your strictness, it will result in a bad-to-horrible UX.
Example:
Assume "target RegEx" is [a,b]*c[a,b]* also assume that you produced a reasonable at first glance "test RegEx" [a,b]*c?[a,b]* (obviously two c in the string is invalid, yeah?) and assume that the current user input is aabcbb but there is a typo because what the user actually wanted is aacbbb. There are many possible ways to fix this typo:
remove c and add it before first b - will work OK
remove first b and add after c - will work OK
add c before first b and then remove the old one - Oops, we prohibit this input as invalid and the user will go crazy because no normal human can understand such a logic.
Note also that your hitEnd will have the same problem here, unless you prohibit the user from entering characters in the middle of the input box, which would be another way to make a horrible UI.
In real life there would be many far more complicated examples that none of your smart heuristics will be able to account for properly, and they will upset users.
So what to do? I think the only thing you can do and still get reasonable UX is the simplest thing, i.e. just analyze your "target RegEx" for the set of allowed characters and make your "test RegEx" [set of allowed chars]*. And yes, if the "target RegEx" contains the . wildcard, you will not be able to do any reasonable filtering at all.
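A rough sketch of that last suggestion, using the target from the example above read as "any number of a or b, exactly one c, any number of a or b". Extracting the allowed characters from an arbitrary regex is the hard part and is not attempted here: the set is passed in by hand, and buildTypingFilter is a name invented for this example.

```javascript
// Escape the characters that are special inside a character class.
function escapeForCharClass(ch) {
  return ch.replace(/[\\\]^-]/g, '\\$&');
}

// Build the lenient "test RegEx" [set of allowed chars]* from a string of allowed characters.
function buildTypingFilter(allowedChars) {
  const cls = Array.from(new Set(allowedChars), ch => escapeForCharClass(ch)).join('');
  return new RegExp(`^[${cls}]*$`);
}

const target = /^[ab]*c[ab]*$/;                // strict check, applied when the user is done
const typingFilter = buildTypingFilter('abc'); // lenient check, applied on every keystroke

// The intermediate state with two c's is allowed while editing...
console.log(typingFilter.test('aacbcbb')); // true
// ...and only the final value has to satisfy the target.
console.log(target.test('aacbbb'));        // true
```

This keeps the per-keystroke filter dumb on purpose, so the user can fix a typo in whatever order they like.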

RegEx for email to allow Empty spaces, valid email address and multiple email addresses

I have this RegEx which I use for CC and BCC email fields
reg = /(^\s*$|^[a-zA-Z0-9._%+-]+#[a-zA-Z0-9.-]+\.(?:[a-zA-Z]{2}|com|org|net|edu|gov|mil|biz|info|mobi|name|aero|asia|jobs|museum)$)/;
This allows for the email field to be empty, or have a valid email address, otherwise it will error.
I would like to extend the RegEx to also allow multiple emails, e.g. a#a.com, b#b.com, c#c.com
I have tried adding [,;] to allow comma- or semicolon-separated values, but I can't seem to get it to work.
Does anyone know if I'm on the right lines with [,;] and where I should be placing it?
Update: I've updated the RegEx so it doesn't look for gTLDs:
reg = /(^\s*$|^[a-zA-Z0-9._%+-]+#[a-zA-Z0-9.-]+.[A-Za-z]{2,4}[,;]?)+$/;
Thanks

If Alex K.'s comment about ASP.NET validation doesn't help, then I have a band-aid for you. I wouldn't consider this a proper answer, as there really isn't a way to get exactly the functionality that you're looking for without giving us all pre and post email special characters that can occur. You could use something like this that uses non-capture groups to help find matches. It's not 100% accurate, but it should work for most cases. One problem with it is that you're apt to capture garbage/non-desired results if it runs into stray # symbols.
regex tested by RegexBuddy 4.2.0:
(?m)(?:^|\s|\n|\t|\r|,|;|
)[^\n]*?#[^\n]*?\.[^\n]*?(?:$|;|\s|,)
Test strings used:
9som$emaIL#cm3Gks.qa1vv; 9som$emaIL#cm3Gks.qa1vv, 9som$emaIL#cm3Gks.qa1vv; 9som$emaIL#cms.com ;
dd.dd.ddwe.wscef_sed#_e23&&*^.dvcw
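An alternative to one large regex is to split the field on the separators and test each piece on its own. The sketch below is only one possible approach and rests on assumptions: it treats the # in the patterns above as standing in for @, it accepts one trailing separator, and the names SINGLE_ADDRESS and isValidAddressList are invented for this example.

```javascript
// Deliberately loose single-address check, modelled on the pattern in the question.
const SINGLE_ADDRESS = /^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[A-Za-z]{2,}$/;

// An empty field is allowed; otherwise every comma- or semicolon-separated part must be an address.
function isValidAddressList(value) {
  const trimmed = value.trim();
  if (trimmed === '') return true;
  return trimmed
    .replace(/[,;]$/, '')      // tolerate a single trailing separator
    .split(/[,;]/)
    .map(part => part.trim())
    .every(part => SINGLE_ADDRESS.test(part));
}

console.log(isValidAddressList(''));                          // true
console.log(isValidAddressList('a@a.com, b@b.com; c@c.com')); // true
console.log(isValidAddressList('a@a.com, not-an-email'));     // false
```

Splitting first keeps the per-address pattern unchanged, so it can be tightened or loosened without touching the list handling.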

Can a zipcode input be validated worldwide?

I found this regex
var zipCodePattern = /^\d{5}$|^\d{5}-\d{4}$/;
That won't validate: 12345, but it does validate 07179. I need to be sure that it would work worldwide, would it? If not, does it exist?

No. In some countries (India, for example), the zip code has 6 digits, and in others it may be entirely different and even contain spaces. Your expression would need to support that too.
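One way to act on this is to pick the pattern by country and fall back to a loose check when no pattern is known. The sketch below is illustrative only: the table holds just the two formats mentioned in this thread, the fallback rule is an assumption, and POSTAL_PATTERNS and isValidPostalCode are invented names.

```javascript
// Patterns keyed by country code; only two illustrative entries are filled in.
const POSTAL_PATTERNS = new Map([
  ['US', /^\d{5}(-\d{4})?$/], // accepts the same strings as the pattern in the question
  ['IN', /^\d{6}$/],          // six digits, as noted above
]);

// Use the country's pattern when there is one; otherwise accept any non-empty value.
function isValidPostalCode(countryCode, value) {
  const trimmed = value.trim();
  const pattern = POSTAL_PATTERNS.get(countryCode);
  return pattern ? pattern.test(trimmed) : trimmed.length > 0;
}

console.log(isValidPostalCode('US', '07179'));    // true
console.log(isValidPostalCode('US', '110001'));   // false: six digits is not a US format
console.log(isValidPostalCode('IN', '110001'));   // true
console.log(isValidPostalCode('GB', 'SW1A 1AA')); // true: no pattern known, so accepted as-is
```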

replace all but - ,+, and .

I'm working on a donation webapp, and I need to format a string so that it keeps minuses (-), pluses (+), and decimal points (.). I want people to be able to format their dollar amounts how they want, but leave the numbers and decimals as is.
I currently have the following code:
var raised = $('#raised').val().replace(/\D/g,'');
Any help? Thanks!
UPDATE
Let me explain a little more about why this is an easy/quick way to validate the input.
This is going to be something that administration is going to use one time only, with only one person in charge. It's not going to be something where multiple users input to submit actual money. I agree that this could be much better planned, but this is more of a rush job. In fact, showing you what I have done is going to be the quickest way to show you: http://www.cirkut.net/sub/batterydonate-pureCSS/
This is going to be projected during an event/auction so people kind of have an idea of how much money has been donated.
The person in charge of typing in donations is competent enough to type valid inputs, so I was putting together what I could as quickly as possible (the entire thing needs to be done by noon tomorrow).
But anyways I found what I needed. Thanks a lot everyone!

To do exactly what you're asking, you could use this regex:
var raised = $('#raised').val().replace(/[^-+.\d]/g,'');
But be advised, you'll still need to verify that the returned string is a valid number, because strings like '---' and '+++' will pass. This, perhaps, is not even something you want to do on the client-side.

Try the following regex:
.replace(/[^\d\-+\.]/g, '')
Since this doesn't guarantee you have a valid number and not something like +-12.34.56--1, you can then validate that you have a valid number with something like:
/^[-+]?\d+(\.\d+)?$/
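Putting the two steps together, without jQuery so it can run anywhere: strip everything except digits, signs and the decimal point, then check that what is left is a well-formed number. The function name cleanAmount and the choice to return null for invalid input are assumptions made for this sketch.

```javascript
// Strip unwanted characters, then validate; returns the cleaned string or null.
function cleanAmount(raw) {
  const stripped = raw.replace(/[^-+.\d]/g, '');
  return /^[-+]?\d+(\.\d+)?$/.test(stripped) ? stripped : null;
}

console.log(cleanAmount('$1,234.50'));     // "1234.50"
console.log(cleanAmount('+-12.34.56--1')); // null: survives stripping but is not a number
console.log(cleanAmount('---'));           // null
```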

A regular expression character class can be negated by adding a ^ symbol to the beginning.
In your case, this makes it fairly simple: you could add all the characters you want to keep in a character class and negate it.
var raised = $('#raised').val().replace(/[^\d\.\+\-]/g,'');
Hope that helps.

Check that the user is entering a time format? eg 13:00

Basically, I'd like to have an input that when blur'd, will check the input to make sure it's in the format...
[24hour] : [minutes]
So for example 13:00, or 15:30.
So I guess I have to split it up into three parts, check the first bit is between 0 and 24, then check it has the colon, then check it has a number between 0 and 60.
Going more complicated than that, it'd be fantastic to have it so if the user enters 19 it'll complete it as 19:00 for example.
I am using jQuery fwiw, but regular code is fine, for example I'm using this little piece of code so far which works fine, to convert . inputs to :
tempval = $(this).val().replace(".", ":");
$(this).val(tempval);
Not sure where to start with this, if anyone could recommend me some reading that'd be fantastic, thank you!

^([01]?[0-9]|2[0-3]):([0-5][0-9])$
I think that's the regex you're looking for (not specifically for javascript though).
http://www.regular-expressions.info/
This site has an excellent amount of info for language-specific regular expressions! Cheers!

I suggest using masked input. That way the wrong input will be prevented in the first place.
Disclaimer: I haven't used that plugin myself, just found it by keywords "masked input"

There are a bunch of widgets that already deal with time validation - try googling for "jQuery time widget" - the first result doesn't look bad.

var re = /^(\d+)(:\d+)?$/;
var match = yourstring.match(re);
Now if the match has succeeded match is an array with the matched pieces: match[0] is the whole of yourstring (you don't care about that), match[1] has the digits before the colon (if any colon, else just digits), match[2] if it exists has the colon followed by the digits after it. So now you just need to perform your numeric tests on match[1], and possibly match[2] minus the leading colon, to ensure the numbers are correct.
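A sketch that combines the pieces discussed in this thread: converting . to :, range-checking hours and minutes, and completing a bare hour such as 19 to 19:00. The function name normalizeTime, the zero-padded output and returning null for invalid input are choices made for this example, not something the question prescribes.

```javascript
// Returns the time as "HH:MM", or null when the input is not a valid 24-hour time.
function normalizeTime(input) {
  const text = input.trim().replace('.', ':');          // "13.30" becomes "13:30"
  const match = /^(\d{1,2})(?::(\d{2}))?$/.exec(text);
  if (!match) return null;

  const hours = Number(match[1]);
  const minutes = match[2] === undefined ? 0 : Number(match[2]); // bare hour: assume :00
  if (hours > 23 || minutes > 59) return null;

  return String(hours).padStart(2, '0') + ':' + String(minutes).padStart(2, '0');
}

console.log(normalizeTime('19'));    // "19:00"
console.log(normalizeTime('13.30')); // "13:30"
console.log(normalizeTime('24:00')); // null
```

With jQuery, this could be called from the blur handler and the result written back with $(this).val(...) when it is not null.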
