Regex to match integers or word "other", case insensitive - javascript

This seems simple enough but it's not working. I'm trying to match either an integer or the word "Other" (case insensitive). For example, given the array:
[1, 233, 45, "other", "the"]
Only the first 4 items in the array should be valid. However, my current attempt only matches integers and does not capture "other" (with or without the case-insensitive flag).
This is the pattern and modifier that I'm using:
var regex = new RegExp("^[0-9]*|(other)$", "i"),
match = regex.exec(value);
return match && (match.index === 0) && match[0].length === value.length;

The idea is good but a few details need fixing:
var regex = /^(\d+|other)$/i;
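As a minimal sketch of how that pattern behaves on the array from the question (the helper name isIntegerOrOther and the extra sample values are illustrative assumptions, not part of the original post):

```javascript
// Anchored, grouped alternation from the answer above.
const INT_OR_OTHER = /^(\d+|other)$/i;

// isIntegerOrOther is a hypothetical helper name used only for this sketch.
function isIntegerOrOther(value) {
  // Array items may be numbers, so coerce to a string before testing.
  return INT_OR_OTHER.test(String(value));
}

const items = [1, 233, 45, "other", "Other", "the", ""];
const valid = items.filter(isIntegerOrOther);
console.log(valid); // [ 1, 233, 45, 'other', 'Other' ]
```

Using test() on the anchored pattern avoids the separate index and length checks from the question.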

It's not that it's capturing integers, it's capturing [0-9]*. The * means ZERO or more, so the empty string is valid. A + in its place means one or more.
This is probably more general than you need: numbers usually don't begin with a 0, except 0 itself. If you don't mind allowing leading zeros, [0-9]+ will do. If you don't want to match numbers like 001, then you'll need something like 0|[1-9][0-9]*.
Also, ^ and $ bind more tightly than |, so ^ applies only to the first alternative and $ only to the second. You're actually matching anything that starts with zero or more digits (which includes the empty match at the start of any string), or anything ending in 'other'.
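The two points above can be checked directly. The following is a small sketch, assuming the pattern from the question and a hypothetical stricter variant (named strict here) that rejects leading zeros:

```javascript
// Pattern from the question: "^" applies only to the first alternative
// and "$" only to the second.
const original = new RegExp("^[0-9]*|(other)$", "i");

// At index 0 the first alternative succeeds by matching zero digits,
// so the match is the empty string and the question's length check fails.
const m = original.exec("other");
console.log(JSON.stringify(m[0]), m.index); // "" 0

// Hypothetical variant: rejects leading zeros such as "001" but accepts "0".
const strict = /^(?:0|[1-9][0-9]*|other)$/i;
const checks = ["0", "7", "120", "001", "Other", ""].map((s) => strict.test(s));
console.log(checks); // [ true, true, true, false, true, false ]
```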

^(?:[0-9]+|other)$
Try this. See demo:
https://regex101.com/r/mS3tQ7/4

Related

How can I access the expression that caused a match in a conditional match group Javascript regex?

I have a conditional match grouped regex like /(sun|\bmoon)/. When I access the matches in a string, I want to be able to see the expression that caused my match.
let regex = /(sun|\bmoon)/
let match = regex.exec('moon')
// return '\bmoon' ??
Is this possible?
JavaScript's RegExp does not currently have a method to show which part of the regex pattern matched. I don't believe this is something that will be implemented any time soon (or even ever), but that's my own opinion. You can, instead, use two separate patterns as I show in the snippet below.
let regexes = [/sun/, /\bmoon/]
let str = 'moon'
regexes.forEach(function(regex) {
let m = regex.exec(str)
if (m == null) return
console.log(`${regex}: ${m}`)
})
The reason why capturing groups exist is to identify the part of the input string that matches a subexpression. In your example, the subexpression that matches is sun|\bmoon (the content of the capturing group).
If you want to know which of the two sub-expressions actually matches the input string, all you have to do is put each of them into its own smaller capturing group:
let regex = /((sun)|(\bmoon))/
let match = regex.exec('moon')
// Array [ "moon", "moon", undefined, "moon" ]
The returned array contains the string that matched the entire regex (at position 0) and the substrings that matched each capturing group at the other positions.
The capturing groups are counted starting from 1, in the order in which they are opened.
In the example above, "moon", undefined and "moon" correspond to the capturing groups (in order) ((sun)|(\bmoon)), (sun) and (\bmoon).
By checking the values of match[2] and match[3] you can find if the input string matched sun (no, it is undefined) or \bmoon (yes).
You can use non-capturing groups, written (?:...), for groups whose content you don't need to capture but that cannot be removed because they are needed for grouping.
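Building on the capturing-group idea, here is a rough sketch that recovers the source text of the alternative that matched. The names alternatives and whichAlternative are hypothetical, and the sketch assumes that none of the alternative strings contains capturing groups of its own (that would shift the group numbering):

```javascript
// Source text of each alternative, kept so it can be reported back later.
const alternatives = ["sun", "\\bmoon"];

// Wrap every alternative in its own capturing group: /(sun)|(\bmoon)/
const regex = new RegExp(alternatives.map((a) => `(${a})`).join("|"));

// Hypothetical helper: returns the source of the alternative that matched,
// or null when nothing matched.
function whichAlternative(str) {
  const m = regex.exec(str);
  if (m === null) return null;
  // Groups that did not take part in the match are undefined.
  const index = m.slice(1).findIndex((group) => group !== undefined);
  return alternatives[index];
}

console.log(whichAlternative("moon"));  // \bmoon
console.log(whichAlternative("sunny")); // sun
console.log(whichAlternative("stars")); // null
```

This does not ask the regex engine which sub-pattern matched. It only maps a group index back to a string that the caller kept around.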
You can't see the regexp expression as written in the pattern, but you can see in the array returned by exec what has been matched.
Do you mean this?
console.log(match[0]);
Or do you want the expression that produced the match, like \bmoon? If so, you can't see it.

Finding a character type at a specific position of a Regex

Is there any way to, based on a regex string, return the type of a character in a specific position?
For example:
the regex [0-9]{2}/[A-Z]{2}/[0-9]{4} matches strings with a total of 10 characters.
position 1 should return 'number'
position 4 should return 'letter'
position 3 or 6 should return 'symbol'
if the character is a symbol, I need to know what symbol it is, too.
Can anyone point me in the right direction?
With the logic of your question, I just want to check if the character is contained in one of these groups: number (0-9), letter (A-Z or a-z) or symbol..
You may consider using a Character Class, adding \W which matches anything that is not a letter, number or underscore, and since _ is considered a word character, you need to include that as well.
^[0-9]{2}[\W_][a-zA-Z]{2}[\W_][0-9]{4}$
I need to know what symbol it is too..
In this case, if you want to see what symbol was matched you can place capturing groups ( ) at the position of where a symbol would be in your regular expression.
Example:
var str = '77$ba!1234';
res = str.match(/^[0-9]{2}([\W_])[a-zA-Z]{2}([\W_])[0-9]{4}$/);
console.log(res[1] + ', ' + res[2]); //=> "$, !"
You can use the typeof operator in Javascript that returns a string indicating the type of the unevaluated operand. You can obviously use your existing Regex code along with your substring matching to pick out the specific position and do the comparison using the typeof.
alert(typeof 1); //returns number
alert(typeof "a"); //returns string
For symbols, you might have to use a series of conditional checks, and I suspect there are more caveats. The general technique would be to use a lookup table of sorts: convert each character to its character code and take it from there. Note that typeof on any single character taken from a string is still "string". I am not sure this is the most efficient technique.
If you just want to check three categories (number, letter and symbol), it's fairly easy: it boils down to a chain of if / else if / else statements.
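A minimal sketch of that three-way check, applied to a character of an input string rather than to the regex itself. The function names classifyChar and describePosition are hypothetical, and positions are assumed to be 1-based as in the question:

```javascript
// Classify a single character into the three categories from the question.
function classifyChar(ch) {
  if (/^[0-9]$/.test(ch)) return "number";
  if (/^[A-Za-z]$/.test(ch)) return "letter";
  return "symbol";
}

// Describe the character at a 1-based position; for symbols, also report
// which symbol it is. Returns null when the position is out of range.
function describePosition(str, position) {
  const ch = str.charAt(position - 1);
  if (ch === "") return null;
  const type = classifyChar(ch);
  return type === "symbol" ? { type, symbol: ch } : { type };
}

const sample = "12/AB/2024";
console.log(describePosition(sample, 1)); // { type: 'number' }
console.log(describePosition(sample, 4)); // { type: 'letter' }
console.log(describePosition(sample, 3)); // { type: 'symbol', symbol: '/' }
```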

RegEx to match only numbers incorrectly matching

I'm trying to use a regular expression in JavaScript to match a number or a number containing a decimal. The regular expression looks like [0-9]+ | [0-9]* \. [0-9]+.
However, for some reason this '1A'.match(/^[0-9]+|[0-9]*\.[0-9]+$/) incorrectly finds a match. I'm not sure which part of the expression is matching the A.
The problem is your alternation. This is what it says:
^[0-9]+ # match an integer at the start
| # OR
[0-9]*\.[0-9]+$ # match a decimal number at the end
So the first alternative matches.
You need to group the alternation:
/^(?:[0-9]+|[0-9]*\.[0-9]+)$/
The ?: is an optimisation and a good habit. It suppresses capturing which is not needed in the given case.
You could get away without the alternation as well, though:
/^[0-9]*\.?[0-9]+$/
Or even shorter:
/^\d*\.?\d+$/
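To see the difference in practice, the following sketch runs the ungrouped pattern from the question and the two fixes above against a few sample strings (the sample values are chosen here purely for illustration):

```javascript
const ungrouped = /^[0-9]+|[0-9]*\.[0-9]+$/;   // each anchor binds to one alternative only
const grouped = /^(?:[0-9]+|[0-9]*\.[0-9]+)$/; // both anchors apply to the whole alternation
const compact = /^\d*\.?\d+$/;                 // same idea without alternation

const samples = ["1", "1.5", ".5", "1A", "A.5", "1."];
const results = samples.map((s) => [
  s,
  ungrouped.test(s),
  grouped.test(s),
  compact.test(s),
]);

for (const [s, u, g, c] of results) {
  console.log(JSON.stringify(s), u, g, c);
}
// "1", "1.5" and ".5" pass all three patterns.
// "1A", "A.5" and "1." pass only the ungrouped pattern.
```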
'1A'.match(/^[0-9]+|[0-9]*\.[0-9]+$/) finds a match because it is a union of:
^[0-9]+
and
[0-9]*\.[0-9]+$
where the first matches.
To avoid this, group them: ^([0-9]+|[0-9]*\.[0-9]+)$
Then try this:
'1A'.match(/^([0-9]+|[0-9]*\.[0-9]+)$/) === null
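Alternatively:
function matchExactly(str, regex) {
  // match() returns the first (leftmost) match, or null if there is none
  var tmp = str.match(regex);
  // accept only when that match spans the whole string
  return tmp ? tmp[0] === str : false;
}
// the decimal alternative goes first so that '1.5' is not cut short at '1'
matchExactly('1A', /[0-9]*\.[0-9]+|[0-9]+/) === false
matchExactly('1', /[0-9]*\.[0-9]+|[0-9]+/) === true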
Maybe I am in the wrong place, but if you use the regex only to validate numeric values, why not use a faster alternative, such as the following:
var isNumber = ( +n === parseFloat(n) );
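A short sketch of how that comparison behaves, wrapped in a function for testing (the wrapper and the sample inputs are assumptions added here). Note that it is not equivalent to the regex: it also accepts exponent notation, surrounding whitespace, and "Infinity".

```javascript
// Wraps the comparison from the answer above in a function.
function isNumber(n) {
  // Unary plus converts the whole string (NaN if any part is invalid);
  // parseFloat parses the longest valid numeric prefix.
  return +n === parseFloat(n);
}

console.log(isNumber("42"));       // true
console.log(isNumber("3.14"));     // true
console.log(isNumber("1A"));       // false: +"1A" is NaN, parseFloat gives 1
console.log(isNumber(""));         // false: +"" is 0, parseFloat gives NaN
console.log(isNumber("1e3"));      // true: both sides evaluate to 1000
console.log(isNumber(" 7 "));      // true: surrounding whitespace is ignored
console.log(isNumber("Infinity")); // true
```

If only plain digits with an optional decimal point should pass, the anchored regex is the stricter check.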

Using regular expression in Javascript

I need to check whether the information entered is 3 characters long: the first should be 0-9, the second A-Z, and the third 0-9 again.
I have written pattern as below:
var pattern = '^[A-Z]+[0-9]+[A-Z]$';
var valid = str.match(pattern);
I got confused with usage of regex for selecting, matching and replacing.
In this case, does [A-Z] check only one character or the whole string?
Does + separate (split?) out characters?
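1) + matches one or more. You want exactly one.
2) Declare your pattern as a regex literal, inside forward slashes.
With these two points in mind, and keeping the letter, digit, letter order of your attempt, your pattern would be
/^[A-Z][0-9][A-Z]$/
For the digit, letter, digit order you describe in the question, swap the classes: /^[0-9][A-Z][0-9]$/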
Note also you can make the pattern slightly shorter by replacing [0-9] with the \d shortcut (matches any numerical character).
3) Optionally, add the case-insensitive i flag after the final trailing slash if you want to allow either case.
4) If you merely want to test whether a string matches a pattern, rather than retrieve a match from it, use test(), not match(). It's more efficient.
var valid = pattern.test(str); //true or false
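A quick sketch of test() with a regex literal, assuming the digit, letter, digit order described in the question (the sample strings are made up for illustration):

```javascript
// One digit, one uppercase letter, one digit; anchored to the whole string.
const pattern = /^[0-9][A-Z][0-9]$/;

const samples = ["1A2", "A1B", "12A3", "1a2", ""];
const outcome = samples.map((s) => pattern.test(s));
console.log(outcome); // [ true, false, false, false, false ]

// With the i flag, a lowercase letter in the middle is accepted as well.
const anyCase = /^[0-9][A-Z][0-9]$/i;
console.log(anyCase.test("1a2")); // true
```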
+ means one or more characters, so strings such as ABCD1234E or A3B would match your pattern, while 3B or A 6B would not.
This is the regex you need:
^[0-9][A-Z][0-9]$
In this case, does [A-Z] check only one character or the whole string?
It checks just one character, but without anchors that character can be found anywhere in the string.
You should add ^ and $ in order to match the whole string, like I did.
Does + separate (split?) out characters?
No.
The + sign just means that the preceding character can repeat one or more times.
"+" means one or more. In your case you should use exact quantity match:
/^\w{1}\d{1}\w{1}$/
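One caveat with that last pattern: \w matches letters, digits, and the underscore, not just A-Z, and {1} is the default quantity anyway. The sketch below compares it with an explicit character class (the name lettersOnly and the sample strings are assumptions for illustration):

```javascript
const loose = /^\w{1}\d{1}\w{1}$/; // \w also matches digits and "_"
const lettersOnly = /^[A-Z]\d[A-Z]$/; // hypothetical stricter comparison

const inputs = ["A1B", "a1b", "_1_", "111"];
const looseResults = inputs.map((s) => loose.test(s));
const strictResults = inputs.map((s) => lettersOnly.test(s));

console.log(looseResults);  // [ true, true, true, true ]
console.log(strictResults); // [ true, false, false, false ]
```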

How to match "two or more words"

In a given string, I'm trying to verify that there are at least two words, where a word is defined as any run of non-numeric characters. For example:
// Should pass
Phil D'Sousa
Billy - the - Kid
// Should Fail
Joe
454545 354434
I thought this should work:
(\b\D*?\b){2,}
But it does not.
You forgot to allow for a space between your "words":
\b\D*?\b(?:\s+\b\D*?\b)+
           ^^^
There are a number of other problems I can see:
I'm also rather suspicious of your definition of "word". Any non-numeric character also includes punctuation and whitespace. That's probably not what you really mean. You might want to try defining word like this instead: [^\d\s]+. This still allows words to contain punctuation, but it disallows both numerals and whitespace.
There is a problem with your usage of word boundaries - if a word can consist of punctuation then words beginning or ending on punctuation won't have a word boundary so your regular expression will miss them.
Are you searching for a string that contains at least two "words", and possibly also some numbers? Or must the string consist only of "words" and no numbers at all anywhere in the string? Currently your regular expression is looking for two consecutive "words" but in general they might not be consecutive.
You can globally search for a "word" and, if a match is found, check the length of the array returned by .match().
If two or more words are found, you're good:
var matches = string.match(/\b[^\d\s]+\b/g);
if ( matches && matches.length >= 2 )
{ /* Two or more words ... */ };
You can define a word as \b[^\d\s]+\b, which is a word boundary \b, one or more characters that are neither digits nor whitespace [^\d\s]+, and another word boundary \b. You have to make sure to use the global option g for the regex to find all the possible matches.
You can tweak the definition of a word in your regex. The trick is to make use of the length property of the .match(), but you should not check this property if there are no matches, since it'll break the script, so you must do if (matches && matches.length ...).
Additionally it's quite simple to modify the above code for X words where X is either a number or a variable.
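As a sketch of that generalisation, with the minimum word count passed in as a parameter (the function name hasAtLeastWords is hypothetical, and it uses the word definition from this answer):

```javascript
// Returns true when str contains at least minWords "words", where a word is
// a run of non-digit, non-whitespace characters between word boundaries.
function hasAtLeastWords(str, minWords) {
  const matches = str.match(/\b[^\d\s]+\b/g);
  // match() with the g flag returns null when nothing matches,
  // so guard before reading .length.
  return matches !== null && matches.length >= minWords;
}

console.log(hasAtLeastWords("Phil D'Sousa", 2));      // true
console.log(hasAtLeastWords("Billy - the - Kid", 2)); // true
console.log(hasAtLeastWords("Joe", 2));               // false
console.log(hasAtLeastWords("454545 354434", 2));     // false
console.log(hasAtLeastWords("foo bar", 3));           // false
```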
jsFiddle example with your 4 examples
This seems to work, for your definition of "word".
/((\W|^)\D+?(\W|$).*){2}/
Here are your four examples, plus some more added after editing and fixing this answer:
>>> r = /((\W|^)\D+?(\W|$).*){2}/
/((\W|^)\D+?(\W|$).*){2}/
>>> !!"Phil D'Sousa".match(r)
true
>>> !!"Billy - the - Kid".match(r)
true
>>> !!"Joe".match(r)
false
>>> !!"54545 354434".match(r)
false
>>> !!"foo bar baz".match(r)
true
>>> !!"123 foo 456".match(r)
false
>>> !!"123 foo 456 bar".match(r)
Looks good, bcherry, except that it will not match "foo bar":
>>> !!"foo bar".match(r)
false
However, "2 or more words" (>= 2) should include "foo bar" as well.
