Regex for testing if commas are missing from a string


I'm trying to check that a list of items is entered properly and includes a comma between each entry. In this list there can only be a single word and after every word there must be a comma.
I'm attempting to use a lookbehind to assert that there is a comma before every space, but it seems to only work for the first occurrence of the character. How can I look through the entire string?
const nameStringList = "Fozzie, Gonzo, Kermit Animal "
const isValid = /\s+/.test(nameStringList) && !(/(?<=,)\s.*/.test(nameStringList))
console.log(isValid);

/^(\S+(,\s|$))+$/
Explanation:
Match one or more non-whitespace characters followed by either a comma and a whitespace character or the end of the string. This group must occur at least once and may repeat. The pattern is anchored at both the start and the end of the string, so if any part of the string doesn't fit, the whole test fails.
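As a quick check, the pattern can be wrapped in a small helper and run against a few inputs. The name isCommaSeparated is hypothetical, and this sketch assumes that a final word without a trailing comma is acceptable, which is what the $ alternative permits.

```javascript
// Hypothetical helper wrapping the pattern from the answer above.
const listPattern = /^(\S+(,\s|$))+$/;

function isCommaSeparated(text) {
  // true only if every word is followed by ", " or by the end of the string
  return listPattern.test(text);
}

console.log(isCommaSeparated("Fozzie, Gonzo, Kermit"));         // true
console.log(isCommaSeparated("Fozzie, Gonzo, Kermit Animal ")); // false
console.log(isCommaSeparated("Fozzie Gonzo"));                  // false
```

Because \S also matches a comma, an input such as "Fozzie,Gonzo" (no space after the comma) is accepted as a single item. If that matters, replacing \S+ with [^\s,]+ is one possible tightening.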

Related

Regex match first character once, followed by repetitive matching until end

I'm trying to match characters that shouldn't be allowed in a username string to then be replaced.
Anything outside this range should match first character [a-zA-Z] <-- restricting the first character is causing problems and I don't know how to fix it
And then match everything else outside this range [0-9a-zA-Z_.] <---- repeat until the end of the string
Matches:
/////hey/// <-- first match /////, second match ///
[][123Bc_.// <-- first match [][, second match //
(/abc <-- should match (/
a2__./) <-- should match /)
Non Matches:
a_____
b__...
Current regex
/^([^a-zA-Z])([^\w.])*/
const regex = /^([^a-zA-Z])([^0-9a-zA-Z_.])*/;
'(/abc'.replace(regex, '') // => return expected abc
'/////hey///'.replace(regex, '') // => return expected "hey"

/^([^a-zA-Z])([^\w.])*/
You cannot do it this way, with negated character classes and the pattern anchored at the start. For example, your value a2__./) won't match: the first character is not in the disallowed range, so the whole expression fails to match.
Your allowed characters for the first position are a subset of what you want to allow for “the rest”, so do that second part first: replace everything that does not match [0-9a-zA-Z_.] with an empty string, without anchoring the pattern at the beginning or end.
And then, in the result of that operation, replace any characters not matching [a-zA-Z] from the start. (So that second pattern does get anchored at the beginning, and you’ll want to use + as quantifier - because when you remove the first invalid character, the next one becomes the new first, and that one might still be invalid.)
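The two passes described above can be sketched as follows. The function name cleanUsername is hypothetical; the character ranges are the ones given in the question.

```javascript
// Hypothetical helper: two replace passes, in the order the answer suggests.
function cleanUsername(input) {
  return input
    // Pass 1: drop every character outside the "rest" range, anywhere.
    .replace(/[^0-9a-zA-Z_.]/g, '')
    // Pass 2: drop leading characters until the first one is a letter.
    .replace(/^[^a-zA-Z]+/, '');
}

console.log(cleanUsername('(/abc'));         // "abc"
console.log(cleanUsername('/////hey///'));   // "hey"
console.log(cleanUsername('[][123Bc_.//'));  // "Bc_."
console.log(cleanUsername('a2__./)'));       // "a2__."
console.log(cleanUsername('a_____'));        // "a_____"
```

Note that the second pass also strips leading digits, underscores and dots that survived the first pass, which is why '[][123Bc_.//' ends up as "Bc_.".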

Unable to match regex for any character except ' and "

I've written a regex to match against the string
{{AB.group.one}}:"eighth",{{AB.group.TWO}}:"third",{{attr1111}}:"fourth","fifth":{{attr_22_2qq2}},"sixth":{{AB.group.three}},{{ab.group.fourth}}:"seventh","ninth":{{attr1111}}}
Regex:
/[^'"]({{2}[a-zA-Z0-9$_].*?}{2})[^'"]/gi
Breaking the regex above:
[^'"]: Start with a character which is neither ' nor ".
({{2}[a-zA-Z0-9$_].*?}{2}): Have exactly 2 {{, then any character in the range a-zA-Z0-9$_ . After that, exactly 2 }}
[^'"]: Any character except for ' and ".
Below matches are not the exact matches but the captured groups. I'll perform my operations on the captured groups so for simplicity, we can consider them as our matches.
Expected matches:
{{AB.group.one}}
{{AB.group.TWO}}
{{attr1111}}
{{attr_22_2qq2}}
{{AB.group.three}}
{{ab.group.fourth}}
{{attr1111}}}
Resultant matches:
{{AB.group.TWO}}
{{attr1111}}
{{attr_22_2qq2}}
{{AB.group.three}}
{{attr1111}}}
As you can see, {{AB.group.one}} and {{ab.group.fourth}} do not match. I want them to match as well.
I know the reasons why they aren't matching.
The reason why {{AB.group.one}} doesn't match is because [^'"] expects one character except for ' and " and I'm not providing one. If I replace [^'"] with ["'"]*, it'll work but in that case "{{AB.group.one}}" will match as well.
So, the problem statement is match any character(if there's any) before {{ and after }} but the character can't be ' or ".
The reason why {{ab.group.fourth}} doesn't match is because the character preceding this match i.e. , is part of another match. This is just my speculation, the reason could be something else. But if I include any character between {{AB.group.three}}, and {{ab.group.fourth}} (e.g. {{AB.group.three}}, {{ab.group.fourth}}), then the pattern matches. I have no idea how can I fix this.
Please help me in solving these two problems. Thank you.

Here is a regex-based approach which seems to be working. First, we can strip off all double-quoted terms, then replace islands of commas and colons with a single comma separator. Finally, split on the comma to generate an array of terms.
var input = "{{AB.group.one}}:\"eighth\",{{AB.group.TWO}}:\"third\",{{attr1111}}:\"fourth\",\"fifth\":{{attr_22_2qq2}},\"sixth\":{{AB.group.three}},{{ab.group.fourth}}:\"seventh\",\"ninth\":{{attr1111}}},\"blah\":\"stuff\",{{one}}:{{two}}";
var terms = input.replace(/\".*?\"/g, "").replace(/[,:]+/g, ",").split(",");
console.log(terms);

You were actually really close with what you had.
let input = '{{AB.group.one}}:"eighth",{{AB.group.TWO}}:"third",{{attr1111}}:"fourth","fifth":{{attr_22_2qq2}},"sixth":{{AB.group.three}},{{ab.group.fourth}}:"seventh","ninth":{{attr1111}}}'
let regex = /(?<=[^'"]?)({{2}[a-zA-Z0-9$_].*?}{2})(?=[^'"]?)/gi;
console.log(input.match(regex))
(?<=[^'"]?) is a positive lookbehind. It uses a negated character set to look for a preceding character that is not ' or ". The question mark makes that character optional (zero or one of the previous token), which lets the expression match at the very start of the string. Be aware that an optional lookbehind also succeeds on zero characters, so it does not reject a preceding quote; a negative lookbehind, (?<!['"]), does.
(?=[^'"]?) is the corresponding positive lookahead for the token immediately after the expression (or for no token at all). The same caveat applies; the strict form is the negative lookahead (?!['"]).
Another option, since lookbehinds aren't supported in every browser:
let input = '{{AB.group.one}}:"eighth",{{AB.group.TWO}}:"third",{{attr1111}}:"fourth","fifth":{{attr_22_2qq2}},"sixth":{{AB.group.three}},{{ab.group.fourth}}:"seventh","ninth":{{attr1111}}}'
let regex = /(?:[^{'"])?({{2}[a-zA-Z0-9$_].*?}{2})(?:[^}'"])?/gi
console.log([...input.matchAll(regex)].map(reg => reg[1]))
String.match() does not return capture groups when the global flag is set; it returns only the full matches. Since you're already creating a capture group with ({{2}[a-zA-Z0-9$_].*?}{2}), you can wrap the optional surrounding characters in non-capturing groups and read the capture group from each match instead. Note that because these surrounding groups are optional, they do not by themselves reject an expression wrapped in quotation marks.
(?:[^{'"])? is a non-capturing group, as is (?:[^}'"])?
Using String.matchAll, the first element of the arrays created for each match is the entire match, the second element is the first capturing group, etc. So the logic for mapping over [...input.matchAll(regex)] is just to collect the capturing group from each match.
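If quoted expressions such as "{{AB.group.one}}" must be rejected outright, a variant with negative lookarounds is one option. This is a sketch rather than the approach from the answers above: the name findPlaceholders is hypothetical, [^{}]* is swapped in for .*? so that a match cannot run past its own closing braces, and it assumes an engine with lookbehind support.

```javascript
// Sketch: reject placeholders that touch a quote on either side.
// [^{}]* (instead of .*?) keeps each match inside one pair of braces.
const placeholder = /(?<!['"])\{\{[a-zA-Z0-9$_][^{}]*\}\}(?!['"])/g;

function findPlaceholders(text) {
  // String.match returns null when nothing matches, so fall back to [].
  return text.match(placeholder) || [];
}

const sample =
  '{{AB.group.one}}:"eighth","fifth":{{attr_22_2qq2}},{{ab.group.fourth}}:"seventh"';

console.log(findPlaceholders(sample));
// [ '{{AB.group.one}}', '{{attr_22_2qq2}}', '{{ab.group.fourth}}' ]

console.log(findPlaceholders('"{{quoted}}",{{plain}}'));
// [ '{{plain}}' ]
```

Since the lookarounds consume no characters, adjacent expressions separated by a single comma are all found, and no capture group is needed to read the results.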

Filter characters in this RegEx

I have this regular expression to match a valid name: /^['"\s\-.*0-9\u00BF-\u1FFF\u2C00-\uD7FF\w]+$/.test(name)
I'm having trouble figuring out how to transform this match style regex into one designed to filter out invalid characters using replace.
Ideally I would like to be able to take an invalid name in name, run it through the replace to replace any invalid characters, and then have the original test return true no matter what (as invalid characters will be filtered out).

Just use a negated character class by adding a ^ in front:
name.replace(/[^'"\s\-.*0-9\u00BF-\u1FFF\u2C00-\uD7FF\w]/g, "")
Example:
var name = "'41%!\u2000abc";
var sanitized = name.replace(/[^'"\s\-.*0-9\u00BF-\u1FFF\u2C00-\uD7FF\w]/g, "");
console.log(/^['"\s\-.*0-9\u00BF-\u1FFF\u2C00-\uD7FF\w]+$/.test(name)); // false
console.log(/^['"\s\-.*0-9\u00BF-\u1FFF\u2C00-\uD7FF\w]+$/.test(sanitized)); // true

/^['"\s\-.*0-9\u00BF-\u1FFF\u2C00-\uD7FF\w]+$/
The + at the end tells the engine to match one or more characters of the types inside the brackets. The ^ at the beginning, in combination with the $ at the end, tells it to match the whole input from start to end. So the given regex matches a string consisting only of characters from the set.
What you want is this:
/[^'"\s\-.*0-9\u00BF-\u1FFF\u2C00-\uD7FF\w]/g
[^...] means to NOT match whatever is inside the brackets; it is the opposite of [...].
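Since the validating and the filtering pattern share the same character class, one way to keep them in sync is to build both from a single string. This is only a sketch; the names allowedChars, validName, invalidChar and sanitizeName are hypothetical.

```javascript
// One source of truth for the allowed characters (the body of the class).
// String.raw keeps the backslashes so RegExp sees \s, \-, \u00BF and so on.
const allowedChars = String.raw`'"\s\-.*0-9\u00BF-\u1FFF\u2C00-\uD7FF\w`;

const validName = new RegExp(`^[${allowedChars}]+$`);      // whole-string test
const invalidChar = new RegExp(`[^${allowedChars}]`, 'g'); // negated, global

function sanitizeName(name) {
  // Remove every character that is not in the allowed set.
  return name.replace(invalidChar, '');
}

const cleaned = sanitizeName("'41%!\u2000abc");
console.log(validName.test("'41%!\u2000abc")); // false
console.log(validName.test(cleaned));          // true
```

One caveat: if every character is invalid, the sanitized result is the empty string, and the original test still returns false because + requires at least one character.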

Regular expression to check contains only

EDIT: Thank you all for your input. Your answers were correct, but I don't think I explained the problem clearly enough.
I want to check the input value as it is being typed. If the user enters a character that is not in the list, that character should be rolled back.
(I am not concerned with checking the input once it has been entered in full.)
I want to validate a date input field which should contain only the characters 0-9 (digits), - (hyphen), . (dot), and / (forward slash). A date may look like 22/02/1999, 22.02.1999, or 22-02-1999. No validation needs to be done on either occurrence or position. A plain check for any character other than those listed above is enough.
[I am not good at regular expressions.]
Here is what I thought should work, but it does not.
var reg = new RegExp('[0-9]./-');
Here is jsfiddle.

Your expression only tests whether anywhere in the string, a digit is followed by any character (. is a meta character) and /-. For example, 5x/- or 42%/-foobar would match.
Instead, you want to put all the characters into the character class and test whether every single character in the string is one of them:
var reg = /^[0-9.\/-]+$/
^ matches the start of the string
[...] matches if the character is contained in the group (i.e. any digit, ., / or -).
The / has to be escaped because it also denotes the end of a regex literal.
- between two characters describes a range of characters (between them, e.g. 0-9 or a-z). If - is at the beginning or end it has no special meaning though and is literally interpreted as hyphen.
+ is a quantifier and means "one or more of the preceding pattern". This allows us (together with the anchors) to test whether every character of the string is in the character class.
$ matches the end of the string
Alternatively, you can check whether there is any character that is not one of the allowed ones:
var reg = /[^0-9.\/-]/;
The ^ at the beginning of the character class negates it. Here we don't have to test every character of the string, because the existence of a single disallowed character already invalidates the string.
You can use it like so:
if (reg.test(str)) { // !reg.test(str) for the first expression
// str contains an invalid character
}
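Both expressions can be combined into a small sketch that either checks a value or strips the offending characters, which is roughly what "rolling back" a typed character amounts to. The function names are hypothetical.

```javascript
const allowedOnly = /^[0-9.\/-]+$/;    // every character must be allowed
const disallowedChar = /[^0-9.\/-]/g;  // any single character that is not

// Hypothetical helpers for illustration.
function hasOnlyDateChars(value) {
  return allowedOnly.test(value);
}

function stripInvalidDateChars(value) {
  return value.replace(disallowedChar, '');
}

console.log(hasOnlyDateChars('22/02/1999'));       // true
console.log(hasOnlyDateChars('22/02/19x9'));       // false
console.log(stripInvalidDateChars('22/0a2/1999')); // "22/02/1999"
```

In a browser, a function like stripInvalidDateChars could be called from the field's input event handler to write the cleaned value back to the field; that wiring is left out here.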

Try this:
([0-9]{2}[/\-.]){2}[0-9]{4}

If you are not concerned about the validity of the date, you can easily use the regex:
^[0-9]{1,2}[./-][0-9]{1,2}[./-][0-9]{4}$
The character class [./-] allows any one of the characters within the square brackets, and the quantifiers allow 1- or 2-digit days and months but only 4-digit years.
You can also group the first few groups like so:
^([0-9]{1,2}[./-]){2}[0-9]{4}$
Updated your fiddle with the first regex.
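A short sketch of how the grouped pattern behaves on a few inputs. The helper name looksLikeDate is hypothetical; as noted above, the pattern only checks the shape of the string, not whether the date exists.

```javascript
const dateShape = /^([0-9]{1,2}[./-]){2}[0-9]{4}$/;

// Hypothetical helper: checks the shape only, not calendar validity.
function looksLikeDate(value) {
  return dateShape.test(value);
}

console.log(looksLikeDate('22/02/1999')); // true
console.log(looksLikeDate('2.2.1999'));   // true
console.log(looksLikeDate('22-02-99'));   // false (year must have 4 digits)
console.log(looksLikeDate('22/02-1999')); // true  (mixed separators pass)
console.log(looksLikeDate('99/99/9999')); // true  (not a real date)
```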

Remove entire word from string if it contains numeric value

What I'm trying to accomplish is to auto-generate tags/keywords for a file upload, basing these keywords from the filename.
I have accomplished auto-generating titles for each upload, as shown here:
But I have now moved on to trying to auto-generate keywords. Similar to titles, but with more formatting. First, I run the string through this to remove commonly used words from the filename (such as this,that,there... etc)
I am happy with it, but I need to not include words that have numbers in it. I have not found a solution on how to remove a word entirely if it contains a number. The solutions I have found like here only works for a certain match, while this one removes numbers alone. I would like to remove the entire word if it contains ANY numeric digit.

To remove all words which contain a number, use:
string = string.replace(/[a-z]*\d+[a-z]*/gi, '');

Try this expression:
var regex = /\b[^\s]*\d[^\s]*\b/g;
Example:
var str = "normal 5digit dig555it digit5 555";
console.log( str.replace(regex,'') ); //Result-> normal

Apply a simple regular expression to your current filename strings, replacing all occurrences with the empty string. The regular expression matches "words" containing any digits.
Javascript example:
'asdf 8bit jawesome234 mayhem 234'.replace(/\s*\b\w*\d\w*\b/g, '')
Evaluates to:
"asdf mayhem"
Here the regular expression is /\s*\b\w*\d\w*\b/g, which matches zero or more whitespace characters (\s*), followed by a word boundary (\b), zero or more word characters (\w*, that is letters, digits, or underscore), a digit (\d), zero or more word characters, and a final word boundary (\b). \b matches the empty position between a word character and a non-word character, or between a word character and the start or end of the string. The g flag after the final / of the regular expression means replace all occurrences, not just the first.
Once the digit-words are removed, you can split the string into keywords however you want (by whitespace, for example).
"asdf mayhem".split(/\s+/);
Evaluates to:
["asdf", "mayhem"]
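The two steps can be combined into one helper. This is a sketch under the assumption that keywords are separated by whitespace; the name keywordsFromFilename is hypothetical.

```javascript
// Hypothetical helper: drop digit-containing words, then split on whitespace.
function keywordsFromFilename(name) {
  return name
    .replace(/\s*\b\w*\d\w*\b/g, '') // remove words that contain a digit
    .trim()                          // a removed first word leaves a leading space
    .split(/\s+/)
    .filter(Boolean);                // drop the empty string left by empty input
}

console.log(keywordsFromFilename('asdf 8bit jawesome234 mayhem 234'));
// [ 'asdf', 'mayhem' ]
console.log(keywordsFromFilename('8bit intro'));
// [ 'intro' ]
console.log(keywordsFromFilename('123'));
// []
```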

('Apple Cover Photo 23s423 of your 543634 moms').match(/\b([^\d]+)\b/g)
returns
Apple Cover Photo , of your , moms
http://jsfiddle.net/awBPX/2/

Use this to remove words containing digits:
string = string.replace(/\S*\d\S*/g, "");
hope this helps.
Edited :
check this :
var str = 'one 2two three3 fo4ur 5 six';
var result = str.match(/(^[\D]+\s|\s[\D]+\s|\s[\D]+$|^[\D]+$)+/g).join('');
