Filter characters in this RegEx - javascript

I have this regular expression to match a valid name: /^['"\s\-.*0-9\u00BF-\u1FFF\u2C00-\uD7FF\w]+$/.test(name)
I'm having trouble figuring out how to transform this match style regex into one designed to filter out invalid characters using replace.
Ideally I would like to be able to take an invalid name in name, run it through the replace to replace any invalid characters, and then have the original test return true no matter what (as invalid characters will be filtered out).

Just use a negated character class by adding a ^ in front:
name.replace(/[^'"\s\-.*0-9\u00BF-\u1FFF\u2C00-\uD7FF\w]/g, "")
Example:
var name = "'41%!\u2000abc";
var sanitized = name.replace(/[^'"\s\-.*0-9\u00BF-\u1FFF\u2C00-\uD7FF\w]/g, "");
console.log(/^['"\s\-.*0-9\u00BF-\u1FFF\u2C00-\uD7FF\w]+$/.test(name)); // false
console.log(/^['"\s\-.*0-9\u00BF-\u1FFF\u2C00-\uD7FF\w]+$/.test(sanitized)); // true

/^['"\s\-.*0-9\u00BF-\u1FFF\u2C00-\uD7FF\w]+$/
The + at the end means "match one or more characters of the kinds listed inside the brackets". The ^ at the beginning, together with the $ at the end, forces the match to cover the whole input from start to end. So the given regex matches only strings made up entirely of characters from the set.
What you want is this:
/[^'"\s\-.*0-9\u00BF-\u1FFF\u2C00-\uD7FF\w]/g
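[^...] is a negated character class: it matches any single character that is NOT listed inside the brackets, the opposite of [...]. With the g flag, replace removes every such character rather than only the first one.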
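To keep the validator and the sanitizer from drifting apart, both can be built from a single copy of the character class. This is a minimal sketch; the names ALLOWED, isValidName, invalidChars and sanitizeName are hypothetical, and only the character class itself comes from the question.

```javascript
// The character class is copied from the question; String.raw keeps the
// backslashes intact so they reach the RegExp constructor unchanged.
const ALLOWED = String.raw`'"\s\-.*0-9\u00BF-\u1FFF\u2C00-\uD7FF\w`;

// Validator: the whole string must consist of allowed characters.
const isValidName = new RegExp(`^[${ALLOWED}]+$`);

// Sanitizer: the same class, negated, with g so every offender is removed.
const invalidChars = new RegExp(`[^${ALLOWED}]`, "g");

function sanitizeName(name) {
  return name.replace(invalidChars, "");
}

const cleaned = sanitizeName("'41%!\u2000abc");
console.log(isValidName.test(cleaned));

// If every character is invalid, nothing is left and + still needs one character.
const emptied = sanitizeName("%%%");
console.log(isValidName.test(emptied));
```

One caveat: the test cannot return true "no matter what". An input made only of invalid characters sanitizes to an empty string, which the + quantifier rejects, so that case needs separate handling.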

Related

Regex for testing if commas are missing from a string

I'm trying to check that a list of items is entered properly and includes a comma between each entry. In this list there can only be a single word and after every word there must be a comma.
I'm attempting to use a lookbehind to assert that there is a comma before every space, but it seems to only work for the first occurrence of the character. How can I look through the entire string?
const nameStringList = "Fozzie, Gonzo, Kermit Animal "
const isValid = /\s+/.test(nameStringList) && !(/(?<=,)\s.*/.test(nameStringList))
console.log(isValid);
/^(\S+(,\s|$))+$/
Explanation:
Match one or more non-whitespace characters followed by either a comma and a whitespace character or the end of the string. This group must occur at least once and may repeat. Because the pattern is anchored at both the start and the end of the string, any part that doesn't fit makes the whole test fail.
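A quick way to see how the pattern behaves is to run it over a few sample lists. The sample strings and the names listPattern, samples and results below are made up for illustration.

```javascript
const listPattern = /^(\S+(,\s|$))+$/;

const samples = [
  "Fozzie, Gonzo, Kermit",         // every space is preceded by a comma
  "Fozzie, Gonzo, Kermit Animal ", // the string from the question: comma missing
  "Fozzie",                        // a single word
  "Fozzie,Gonzo",                  // no space at all
];

const results = samples.map((s) => listPattern.test(s));
console.log(results); // [true, false, true, true]
```

Note the last case: \S also matches a comma, so "Fozzie,Gonzo" counts as one long "word" and passes. If that should be rejected, replacing \S with [^\s,] would be one option.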

check every occurrence of a special character followed by a whitespace using regex

I'm trying to check for every occurrence that a string has an # at the beginning of a string.
So something like this works for only a single occurrence:
const comment = "#barnowl is cool"
const regex = /#[a-z]/i;
if (comment.charAt(0).includes("#")) {
  if (regex.test(comment)) {
    // do something
    console.log('test passed')
  } else {
    // do something else
  }
} else {
  // do other
}
but...
What if you have a textarea and a user uses # multiple times to reference other users? This test no longer works, because charAt(0) only looks at the first character of the string.
What regex can check for a # mention followed by a space? I know I can ditch charAt(0) and use comment.includes("#"), but I want a regex pattern that checks whether there is a space afterwards.
So if the user types #username followed by a space, the regex should pass.
Adding \s like this doesn't seem to make the test pass:
const regex = /#[a-z]\s/i; // shouldn't this check for white space after a letter ?
demo:
https://jsbin.com/riraluxape/edit?js,console
I think your expression is very close. There are two things that are missing:
The [a-z] match is only looking for one character, so in order to look for multiple characters it needs to be [a-z]+.
The flags section is missing the g modifier, which lets methods such as match and replace find every match in the text instead of stopping at the first one.
I believe the regular expression declaration should be adjusted to the following:
const regex = /#[a-z]+\s/ig;
Is this what you want? Matching all the occurrences of the mention?
const regex = /#\w+/ig
I used \w here, a shorthand character class (not a flag) that matches any word character: letters, digits, and underscore.
To check for multiple matches instead of only the first one, append g to the regex:
const regex = /#[a-z]*\s/ig;
Your regex with \s actually works, see: https://regex101.com/r/gyMyvB/1
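The answers above can be combined into a short sketch. The sample comment and the names mentionPattern, mentions and hasMention are assumptions made for illustration.

```javascript
const comment = "#barnowl is cool and so is #hawk today";

// g: collect every mention, i: ignore case, + : allow names longer than one letter.
const mentionPattern = /#[a-z]+\s/gi;

// match() with a g regex returns all matches, or null when there are none.
const mentions = comment.match(mentionPattern) || [];
console.log(mentions); // ["#barnowl ", "#hawk "]

// For a plain yes/no check, a regex without g is enough and avoids
// the lastIndex state that test() keeps on g-flagged regexes.
const hasMention = /#[a-z]+\s/i.test(comment);
console.log(hasMention); // true
```

A mention at the very end of the text has no trailing whitespace, so it would be missed by this pattern; whether that matters depends on the use case.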

How to extract a string that conforms to a regex?

Say I have a RegEx like the following:
^[a-zA-Z]\w{12}$
And I have the following string:
%7AgTy!5hG^vxWa2#AgW
I would like to "pull" out of that string something that conforms to that regex. In this example we would get the following:
AgTy5hGvxWa2A
Reason: it starts with A because the regex says the first letter must be [a-zA-Z] (so it skips the first 2 characters), and then it pulls successive \ws out until it reaches 12 characters.
Is this sort of thing possible?
Edit: My apologies for being unclear. I'm not looking for a new regular expression that will give the proper output. Rather, I'm looking for a way to use the existing RegEx to extract the proper output. In my program these regular expressions are entered by hand by the user to extract a password from a long base256 hash such that it will conform to these existing password requirement regexes.
Instead of trying to match what you want and reconstructing the string, replace everything you don't want with nothing. This gives the impression that you're extracting what you need but, in fact, it does the opposite: it gets rid of everything you don't want to extract. I also dropped $ from the end of your original pattern, otherwise it would never match the string you present in your question.
See regex in use here
^[^a-z]+|\W+
^ Assert position at the start of the line
[^a-z]+ Matches any character that is not in the range a-z one or more times. Since the i flag is specified, this also matches A-Z
\W+ Match any non-word character one or more times
const regex = /^[^a-z]+|\W+/gi
const a = [
`%7AgTy!5hG^vxWa2#AgW`,
`%7AgTy!5hG^vxWa2#`
]
a.forEach(function(s) {
var clean = s.replace(regex, '')
var match = clean.match(/^[a-z]\w{12}/i)
console.log(match)
})

regex exact match multiple search words using jQuery

I'm using jQuery. I have to check whether any word from a given list appears in a paragraph. I want an exact match of a word or a phrase (a whole-word match), i.e., if I search for 'be' in 'Be a bee', there is only one match. This is what I have done:
var searchText="tool,media,be,team";
var regexExactMatch = new RegExp('\^' + searchText.split(",").join("|") + '\$');
if (regexExactMatch.test(item.Name))
{
//Found
}
It is working for one search term, ie, without any comma (eg: media).
But for comma separated search, it will break.
How to do a exact match search for multiple search terms. I'm very very new to regex. Also I have to do the same search for integers and date (MM/dd/yyyy). Thanks in advance.
For a full input string match, wrap the alternatives in a non-capturing group, (?: ... ), so that the anchors apply to all of them:
new RegExp('^(?:' + searchText.split(",").join("|") + ')$');
For a whole word search, replace ^ and $ with \b:
new RegExp('\\b(?:' + searchText.split(",").join("|") + ')\\b');
Otherwise, the anchors are applied respectively to the first and last alternatives only (i.e. your regex will look like /^tool|media|be|team$/ looking for tool at the beginning only, media and be anywhere in the string and team only at the end of the string).
Note I am using (?:...) non-capturing group since grouping is only necessary here, not capturing (no storing of the submatch). If you need to access the matched text, you can access the 0th group that equals the whole match.
Also, you do not need the backslashes before ^ and $. Inside a string literal, \^ and \$ are not recognized escape sequences, so the backslash is simply dropped and the constructor receives plain ^ and $.
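A minimal sketch of the whole-word variant follows. The helpers escapeRegExp and buildWholeWordRegex are hypothetical names, and the i flag is an assumption based on the question expecting 'be' to match 'Be'. Escaping each term keeps characters such as . or ( in a search term from being read as regex syntax.

```javascript
// Escape characters that have a special meaning in a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Build /\b(?:term1|term2|...)\b/gi from a comma-separated list.
function buildWholeWordRegex(searchText) {
  const terms = searchText
    .split(",")
    .map((term) => escapeRegExp(term.trim()))
    .filter(Boolean); // drop empty entries such as a trailing comma
  return new RegExp("\\b(?:" + terms.join("|") + ")\\b", "gi");
}

const wordMatches = "Be a bee".match(buildWholeWordRegex("tool,media,be,team"));
console.log(wordMatches); // ["Be"]

// Digits count as word characters, so a date term works the same way.
const dateMatches = "Due 02/14/2020.".match(buildWholeWordRegex("02/14/2020"));
console.log(dateMatches); // ["02/14/2020"]
```

With jQuery or without, the resulting regex can be passed to test() or match() on item.Name as in the question.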
Remove the ^ from the beginning and $ from the end of the RegExp. Like this :
var regexExactMatch = new RegExp(searchText.split(",").join("|"));
Reason
^ requires the matched text to be at the beginning of the string, and $ requires it to be at the end of the string; both can hold only if the string contains nothing but that text.

Regular expression to check contains only

EDIT: Thank you all for your input. Everything you answered was right, but I think I didn't explain the problem clearly enough.
I want to check the input value while the user is typing. If the user enters any character that is not in the list, the entered character should be rolled back.
(I am not concerned with checking once the entire input has been entered.)
I want to validate a date input field which should contain only the characters 0-9 (digits), - (hyphen), . (dot), and / (forward slash). The date may look like 22/02/1999 or 22.02.1999 or 22-02-1999. No validation needs to be done on either occurrence or position. A plain check of whether the input contains any character other than those listed above is enough.
[I am not good at regular expressions.]
Here is what I thought should work, but it does not:
var reg = new RegExp('[0-9]./-');
Here is jsfiddle.
Your expression only tests whether anywhere in the string, a digit is followed by any character (. is a meta character) and /-. For example, 5x/- or 42%/-foobar would match.
Instead, you want to put all the characters into the character class and test whether every single character in the string is one of them:
var reg = /^[0-9.\/-]+$/
^ matches the start of the string
[...] matches if the character is contained in the character class (i.e. any digit, ., / or -).
The / has to be escaped because it also denotes the end of a regex literal.
- between two characters describes a range of characters (between them, e.g. 0-9 or a-z). If - is at the beginning or end it has no special meaning though and is literally interpreted as hyphen.
+ is a quantifier and means "one or more of the preceding pattern". This allows us (together with the anchors) to test whether every character of the string is in the character class.
$ matches the end of the string
Alternatively, you can check whether there is any character that is not one of the allowed ones:
var reg = /[^0-9.\/-]/;
The ^ at the beginning of the character class negates it. Here we don't have to test every character of the string, because the existence of a single disallowed character already invalidates the string.
You can use it like so:
if (reg.test(str)) { // !reg.test(str) for the first expression
  // str contains an invalid character
}
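Since the edited question asks for invalid characters to be rolled back while typing, the negated class can also be used with replace. This is only a sketch: stripDisallowed is a hypothetical helper, and the event wiring is left as a comment because the input element is not shown in the question.

```javascript
// Same negated class as above, with g so every disallowed character is removed.
const disallowed = /[^0-9.\/-]/g;

function stripDisallowed(value) {
  return value.replace(disallowed, "");
}

const filtered = stripDisallowed("22/02/19a99");
console.log(filtered); // "22/02/1999"

// In a browser this could be hooked up roughly as follows,
// assuming `input` refers to the date field:
// input.addEventListener("input", () => {
//   input.value = stripDisallowed(input.value);
// });
```

Rewriting the value on every input event may move the caret to the end of the field, so this approach trades some editing comfort for simplicity.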
Try this:
([0-9]{2}[/\-.]){2}[0-9]{4}
If you are not concerned about the validity of the date, you can easily use the regex:
^[0-9]{1,2}[./-][0-9]{1,2}[./-][0-9]{4}$
The character class [./-] allows any one of the characters within the square brackets and the quantifiers allow for either 1 or 2 digit months and dates, while only 4 digit years.
You can also group the first few groups like so:
^([0-9]{1,2}[./-]){2}[0-9]{4}$
Updated your fiddle with the first regex.
