Regex for Accepting any number of Characters and - - javascript

Can anyone give me the correct regular expression to accept a name that starts and ends with letters, where I can allow - in the middle?
Ex: ABC or ABC-DEF or ABC-DEF-GHI
But it should not start or end with -
Ex: -ABC or -ABC- or ABC-
Here is my Regular Expression:
var regex = /^[A-Za-z]-?[A-Za-z]*-?([A-Za-z]+)+$/
This works perfectly fine for me, but if I want to give a name such as AB-CD-EF-GH, then it doesn't work.
Note: Remember that the name should start and end with letters, and in between I can have - but never two in a row (--). Each hyphen has to sit between letters, like a-b-c-d

^[A-Za-z]+(?:-[A-Za-z]+)*$
This simple regex will do it for you. See demo.
https://regex101.com/r/sJ9gM7/55
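var re = /^[A-Za-z]+(?:-[A-Za-z]+)*$/gim;
var str = 'ABC\nABC-DEF\n-ABC\nABC-\nAB-CD-EF-GH\n';
var m;
// Loop so that every matching line is visited, not just the first one.
while ((m = re.exec(str)) !== null) {
    // Guard against zero-length matches looping forever.
    if (m.index === re.lastIndex) {
        re.lastIndex++;
    }
    // View your result using the m-variable, e.g. m[0].
    console.log(m[0]);
}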
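To see why the pattern from the question breaks on AB-CD-EF-GH, a quick comparison may help. The sketch below is only an illustration: the names original, repeated and compare are made up here, and the sample inputs are taken from the question or chosen for this example. The original pattern spells out at most two optional hyphens, while the repeated group (?:-[A-Za-z]+)* accepts any number of hyphen-plus-letters segments.

```javascript
// Pattern from the question: only two optional hyphens are spelled out.
var original = /^[A-Za-z]-?[A-Za-z]*-?([A-Za-z]+)+$/;
// Pattern from this answer: a hyphen must always be followed by letters.
var repeated = /^[A-Za-z]+(?:-[A-Za-z]+)*$/;

// Hypothetical helper: run both patterns against one input.
function compare(name) {
    return {
        name: name,
        original: original.test(name),
        repeated: repeated.test(name)
    };
}

["A-B-C", "AB-CD-EF-GH", "A--B", "-ABC", "ABC-"].forEach(function (name) {
    console.log(compare(name));
});
```

With these inputs the original pattern rejects AB-CD-EF-GH and accepts A--B, which are the two cases the question wants the other way round.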

I believe this is what you want:
/^[A-Z]+(-[A-Z]+)*$/i
Analysis:
/^ Start of string
[A-Z]+ One or more alphabetic characters
( Group
- A hyphen character
[A-Z]+ One or more alphabetic characters
)* 0 or more repetitions of group
$/i End of string, allow upper or lower case alpha

You can use .* inside to allow any number of any characters except for a newline:
var regex = /^(?!.*--.*$)[A-Za-z].*[A-Za-z]$/
(?!.*--.*$) will make sure double hyphens are not allowed.
Please check the regex demo here.
function ValIt(str)
{
    // test() is enough for a yes/no check, so neither the g flag nor a match variable is needed.
    var re = /^(?!.*?--.*?$)[A-Za-z].*[A-Za-z]$/;
    return re.test(str);
}
document.getElementById("res").innerHTML = 'RTT--rrtr: ' + ValIt('RTT--rrtr') + "<br>ER#$-wrwr: "+ ValIt('ER#$-wrwr') + "<br>Rfff-ff-d: " + ValIt('Rfff-ff-d');
<div id=res />
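Because .* matches anything, the pattern above also lets characters such as # or $ through in the middle of the name. If only letters and single hyphens should be accepted, one possible variant keeps the (?!.*--) lookahead but narrows the middle part to [A-Za-z-]. This is a sketch, not part of the original answer, and the function name isHyphenatedName is hypothetical.

```javascript
// Hypothetical variant: same "no double hyphen" lookahead, but the middle
// part may only contain letters and hyphens.
function isHyphenatedName(str) {
    // (?!.*--)                 no two hyphens in a row anywhere
    // [A-Za-z]                 first character is a letter
    // (?:[A-Za-z-]*[A-Za-z])?  optional rest that must end with a letter
    return /^(?!.*--)[A-Za-z](?:[A-Za-z-]*[A-Za-z])?$/.test(str);
}

console.log(isHyphenatedName("Rfff-ff-d")); // true
console.log(isHyphenatedName("RTT--rrtr")); // false
console.log(isHyphenatedName("ER#$-wrwr")); // false
```

Unlike the .* version, this variant also accepts a single-letter name, because the part after the first letter is optional.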

var rgx = /^[A-Za-z]+(-[A-Za-z]+)*$/;
rgx.test("ABC"); // true
rgx.test("ABC-DEF"); // true
rgx.test("AB-CD-EF-GH"); // true
rgx.test("-AB-CD-EF-GH"); // false
rgx.test("AB-CD-"); // false
rgx.test("AB--CD"); // false

Related

Regex replace all character except last 5 character and whitespace with plus sign

I want to replace every character with +, except for the last 5 characters and the whitespace.
var str = "HFGR56 GGKDJ JGGHG JGJGIR"
var returnstr = str.replace(/\d+(?=\d{4})/, '+');
The result should be "++++++ +++++ +++++ JGJGIR", but in the above code I don't know how to exclude whitespace.
You need to match each character individually, and you need to allow a match only if at least six characters of that type follow.
I'm assuming that you want to replace alphanumeric characters. Those can be matched by \w. All other characters will be matched by \W.
This gives us:
returnstr = str.replace(/\w(?=(?:\W*\w){6})/g, "+");
Test it live on regex101.com.
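The number in the quantifier controls how many trailing characters survive, so it can be turned into a parameter. The following is a minimal sketch under that assumption; the helper name maskAllButLast and its keep argument are invented for this example, and keep is expected to be a non-negative integer.

```javascript
// Hypothetical helper: replace every word character with "+", except the
// last `keep` word characters of the string. Whitespace is left untouched.
function maskAllButLast(str, keep) {
    // \w(?=(?:\W*\w){keep}) matches a word character only when at least
    // `keep` more word characters follow it, skipping any non-word gaps.
    var re = new RegExp("\\w(?=(?:\\W*\\w){" + keep + "})", "g");
    return str.replace(re, "+");
}

console.log(maskAllButLast("HFGR56 GGKDJ JGGHG JGJGIR", 6));
// ++++++ +++++ +++++ JGJGIR
```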
The pattern \d+(?=\d{4}) does not match in the example string, as it matches 1+ digits only when 4 more digits follow on the right.
Another option is to match the space and 5+ word characters till the end of the string or match a single word character in group 1 using an alternation.
In the callback of replace, return a + if you have matched group 1, else return the match.
/ \w{5,}$|(\w)/
Regex demo
let pattern = / \w{5,}$|(\w)/g;
let str = "HFGR56 GGKDJ JGGHG JGJGIR"
.replace(pattern, (m, g1) => g1 ? '+' : m);
console.log(str);
Another way is to replace one group at a time, where the number of + characters inserted is based on the length of the characters matched:
var target = "HFGR56 GGKDJ JGGHG JGJGIR";
target = target.replace(
    /(\S+)(?!$|\S)/g,
    function (m, g1) {
        // Array(n + 1).join("+") yields n plus signs.
        var len = g1.length + 1;
        //return "+".repeat(g1.length); // Non-IE (quick)
        return Array(len).join("+"); // IE (slow)
    });
console.log(target);
You can use negative lookahead with string end anchor.
\w(?!\w{0,5}$)
Match any word character which is not followed by 0 to 5 word characters and the end of the string.
var str = "HFGR56 GGKDJ JGGHG JGJGIR"
var returnstr = str.replace(/\w(?!\w{0,5}$)/g, '+');
console.log(returnstr)
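The lookahead \w(?=(?:\W*\w){6}) and the negative lookahead \w(?!\w{0,5}$) give the same output for the sample string, but they are not equivalent in general: the first counts word characters across spaces, while the second only looks at the run of word characters directly before the end of the string. A small comparison follows; the function names and the second input string are made up for this example.

```javascript
// Counts word characters across gaps: keeps the last six word characters.
function maskByCount(str) {
    return str.replace(/\w(?=(?:\W*\w){6})/g, "+");
}

// Looks only at the end of the string: keeps up to six word characters
// that directly precede the end.
function maskByTail(str) {
    return str.replace(/\w(?!\w{0,5}$)/g, "+");
}

var sample = "HFGR56 GGKDJ JGGHG JGJGIR";
console.log(maskByCount(sample)); // ++++++ +++++ +++++ JGJGIR
console.log(maskByTail(sample));  // ++++++ +++++ +++++ JGJGIR

// With a short last word the two disagree.
var shortTail = "ABCDEF GH";
console.log(maskByCount(shortTail)); // ++CDEF GH
console.log(maskByTail(shortTail));  // ++++++ GH
```

Which behaviour is wanted depends on whether "the last characters" means the last word or the last N characters overall, something the question leaves open.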

Regex - ignoring text between quotes / HTML(5) attribute filtering

So I have this regular expression, which basically has to filter the given string into an HTML(5)-style list of attributes. It currently isn't doing what I need, but that's about to change! (I hope so)
What I'm trying to achieve is that whenever an attribute name is found, the text up to the next occurrence OR the end of the string is selected as the second match. Take a look at the current regular expression:
/([a-zA-Z]+|[a-zA-Z]+-[a-zA-Z0-9]+)=["']/g
A string like this: hey="hey world" hey-heyhhhhh3123="Hello world" data-goed="hey"
Would be filtered / matched out like this:
MATCH 1. [0-3] `hey`
MATCH 2. [16-32] `hey-heyhhhhh3123`
MATCH 3. [47-56] `data-goed`
This has to be seen as the attribute-name(s), and now.. we just have to fetch the attribute's value(s). So the mentioned string has to have an outcome like this:
MATCH 1.
1 [0-3] `hey`
2 [6-14] `hey world`
MATCH 2.
1 [16-32] `hey-heyhhhhh3123`
2 [35-45] `Hello world`
MATCH 3.
1 [47-56] `data-goed`
2 [59-61] `hey`
Could anyone help me get this working? It would be appreciated a lot!
You can use
/([^\s=]+)=(?:"([^"\\]*(?:\\.[^"\\]*)*)"|(\S+))/g
See regex demo
Pattern details:
([^\s=]+) - Group 1 capturing 1 or more characters other than whitespace and = symbol
= - an equal sign
(?:"([^"\\]*(?:\\.[^"\\]*)*)"|(\S+)) - a non-capturing group of 2 alternatives (one more '([^'\\]*(?:\\.[^'\\]*)*)' alternative can be added to account for single quoted string literals)
"([^"\\]*(?:\\.[^"\\]*)*)" - a double quoted string literal pattern:
" - a double quote
([^"\\]*(?:\\.[^"\\]*)*) - Group 2 capturing 0+ characters other than \ and ", followed by 0+ sequences of any escaped symbol followed by 0+ characters other than \ and "
" - a closing double quote
| - or
(\S+) - Group 3 capturing one or more non-whitespace characters
JS demo (no single quoted support):
var re = /([^\s=]+)=(?:"([^"\\]*(?:\\.[^"\\]*)*)"|(\S+))/g;
var str = 'hey="hey world" hey-heyhhhhh3123="Hello \\"world\\"" data-goed="hey" more=here';
var res = [], m; // declare m so it does not leak into the global scope
while ((m = re.exec(str)) !== null) {
if (m[3]) {
res.push([m[1], m[3]]);
} else {
res.push([m[1], m[2]]);
}
}
console.log(res);
JS demo (with single quoted literal support)
var re = /([^\s=]+)=(?:"([^"\\]*(?:\\.[^"\\]*)*)"|'([^'\\]*(?:\\.[^'\\]*)*)'|(\S+))/g;
var str = 'pseudoprefix-before=\'hey1"\' data-hey="hey\'hey" more=data and="more \\"here\\""';
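var res = [], m;
while ((m = re.exec(str)) !== null) {
    // Compare against undefined so that an empty value such as a="" is kept.
    if (m[2] !== undefined) {
        res.push([m[1], m[2]]);
    } else if (m[3] !== undefined) {
        res.push([m[1], m[3]]);
    } else if (m[4] !== undefined) {
        res.push([m[1], m[4]]);
    }
}
console.log(res);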
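If a name-to-value lookup is more convenient than an array of pairs, the same pattern can feed an object. This is a sketch only: parseAttributes is a hypothetical helper, the extra empty="" attribute in the sample is added here to show the empty-value case, and escape sequences inside quoted values are returned as written (the backslashes are not removed).

```javascript
// Hypothetical helper: turn an attribute string into a name -> value map.
function parseAttributes(str) {
    // Group 1: name, group 2: double-quoted value, group 3: single-quoted
    // value, group 4: unquoted value.
    var re = /([^\s=]+)=(?:"([^"\\]*(?:\\.[^"\\]*)*)"|'([^'\\]*(?:\\.[^'\\]*)*)'|(\S+))/g;
    var attrs = {};
    var m;
    while ((m = re.exec(str)) !== null) {
        // Unmatched groups are undefined, so an empty value "" is still kept.
        var value = m[2] !== undefined ? m[2]
                  : m[3] !== undefined ? m[3]
                  : m[4];
        attrs[m[1]] = value;
    }
    return attrs;
}

var attrs = parseAttributes(
    'hey="hey world" hey-heyhhhhh3123="Hello world" data-goed="hey" empty=""'
);
console.log(attrs["hey"]);       // hey world
console.log(attrs["data-goed"]); // hey
```

If the same attribute name appears twice, the later value overwrites the earlier one in this sketch.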

Javascript regexp capture matches delimited by character

I have a string like
classifier1:11:some text1##classifier2:fdglfgfg##classifier3:fgdfgfdg##classifier4
I am trying to capture terms like classifier1:11, classifier2:, classifier3 and classifier4
So these classifiers may or may not be followed by a single colon.
So far I came up with
/([^#]*)(?::(?!:))/g
But that does not seem to capture classifier4. I'm not sure what I am missing here.
It seems that a classifier in your case consists of any word chars that may have single : in between and ends with a digit.
Thus, you may use
/(\w+(?::+\w+)*\d)[^#]*/g
See the regex demo
Explanation:
(\w+(?::+\w+)*\d) - Group 1 capturing
\w+ - 1 or more [a-zA-Z0-9_] (word) chars
(?::+\w+)* - zero or more sequences of 1+ :s and then 1+ word chars
\d - a digit should be at the end of this group
[^#]* - zero or more characters other than the delimiter #.
JS:
var re = /(\w+(?::+\w+)*\d)[^#\n]*/g;
var str = 'classifier4##classifier1:11:some text1##classifier2:fdglfgfg##classifier3:fgdfgfdg\nclassifier1:11:some text1##classifier4##classifier2:fdglfgfg##classifier3:fgdfgfdg##classifier4';
var res = [], m; // declare m so it does not leak into the global scope
while ((m = re.exec(str)) !== null) {
res.push(m[1]);
}
document.body.innerHTML = "<pre>" + JSON.stringify(res, 0, 4) + "</pre>";
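An alternative sketch, not taken from the answer above: since ## is a fixed delimiter, the string can be split first and each segment matched with an anchored pattern. This assumes that a classifier is a run of word characters optionally followed by one colon and digits (as in classifier1:11). The function name extractClassifiers is hypothetical.

```javascript
// Hypothetical helper: split on the "##" delimiter, then read the
// classifier from the start of each segment.
function extractClassifiers(str) {
    return str.split("##").map(function (segment) {
        // ^\w+       the classifier name at the start of the segment
        // (?::\d+)?  optionally a colon followed by digits, e.g. ":11"
        var m = segment.match(/^\w+(?::\d+)?/);
        return m ? m[0] : null;
    }).filter(function (c) {
        return c !== null;
    });
}

var found = extractClassifiers(
    "classifier1:11:some text1##classifier2:fdglfgfg##classifier3:fgdfgfdg##classifier4"
);
console.log(found.join(", "));
// classifier1:11, classifier2, classifier3, classifier4
```

Splitting first keeps the per-segment pattern short, at the cost of an extra pass over the string.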
Based on your pattern, you can use a regex like this:
([^#]*)(?::|$)
Working demo

regex to match all words but AND, OR and NOT

In my javascript app I have this random string:
büert AND NOT 3454jhadf üasdfsdf OR technüology AND (bar OR bas)
and I would like to match all words, special characters and numbers, except for the words AND, OR and NOT.
What I tried is this:
/(?!AND|OR|NOT)\b[\u00C0-\u017F\w\d]+/gi
which results in
["büert", "3454jhadf", "asdfsdf", "technüology", "bar", "bas"]
but this one does not match the ü or any other letter outside the a-z alphabet at the beginning or at the end of a word because of the \b word boundary.
Removing the \b oddly ends up matching parts of the words I would like to exclude:
/(?!AND|OR|NOT)[\u00C0-\u017F\w\d]+/gi
result is
["büert", "ND", "OT", "3454jhadf", "üasdfsdf", "R", "technüology", "ND", "bar", "R", "bas"]
What is the correct way to match all words, no matter what type of characters they contain, besides the ones I want to exclude?
The issue here has its roots in the fact that \b (and \w, and other shorthand classes) are not Unicode-aware in JavaScript.
Now, there are 2 ways to achieve what you want.
1. SPLIT WITH PATTERN(S) YOU WANT TO DISCARD
var re = /\s*\b(?:AND|OR|NOT)\b\s*|[()]/;
var s = "büert AND NOT 3454jhadf üasdfsdf OR technüology AND (bar OR bas)";
var res = s.split(re).filter(Boolean);
document.body.innerHTML += JSON.stringify(res, 0, 4);
// => [ "büert", "3454jhadf üasdfsdf", "technüology", "bar", "bas" ]
Note the use of a non-capturing group (?:...) so as not to include the unwanted words into the resulting array. Also, you need to add all punctuation and other unwanted characters to the character class.
2. MATCH USING CUSTOM BOUNDARIES
You can use groups with anchors and negated character classes in a regex like this:
(^|[^\u00C0-\u017F\w])(?!(?:AND|OR|NOT)(?=[^\u00C0-\u017F\w]|$))([\u00C0-\u017F\w]+)(?=[^\u00C0-\u017F\w]|$)
Capture group 2 will hold the values you need.
See regex demo
JS code demo:
var re = /(^|[^\u00C0-\u017F\w])(?!(?:AND|OR|NOT)(?=[^\u00C0-\u017F\w]|$))([\u00C0-\u017F\w]+)(?=[^\u00C0-\u017F\w]|$)/gi;
var str = 'büert AND NOT 3454jhadf üasdfsdf OR technüology AND (bar OR bas)';
var m;
var arr = [];
while ((m = re.exec(str)) !== null) {
arr.push(m[2]);
}
document.body.innerHTML += JSON.stringify(arr);
or with a block to build the regex dynamically:
var bndry = "[^\\u00C0-\\u017F\\w]";
var re = RegExp("(^|" + bndry + ")" + // starting boundary
"(?!(?:AND|OR|NOT)(?=" + bndry + "|$))" + // restriction
"([\\u00C0-\\u017F\\w]+)" + // match and capture our string
"(?=" + bndry + "|$)" // set trailing boundary
, "g");
var str = 'büert AND NOT 3454jhadf üasdfsdf OR technüology AND (bar OR bas)';
var m, arr = [];
while ((m = re.exec(str)) !== null) {
arr.push(m[2]);
}
document.body.innerHTML += JSON.stringify(arr);
Explanation:
(^|[^\u00C0-\u017F\w]) - our custom boundary (match a string start with ^ or any character outside the [\u00C0-\u017F\w] range)
(?!(?:AND|OR|NOT)(?=[^\u00C0-\u017F\w]|$)) - a restriction on the match: the match fails if AND, OR or NOT is followed by the string end or by a character that is neither a word character nor in the \u00C0-\u017F range
([\u00C0-\u017F\w]+) - match word characters ([a-zA-Z0-9_]) or those from the \u00C0-\u017F range
(?=[^\u00C0-\u017F\w]|$) - the trailing boundary: either the string end ($) or a character that is neither a word character nor in the \u00C0-\u017F range.
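In engines that support ES2018 Unicode property escapes, the hard-coded \u00C0-\u017F range can be swapped for \p{L} and \p{N} together with the u flag. The sketch below is an alternative to the custom-boundary regex, not a restatement of it: it first matches every run of letters, digits and underscores, then filters out the three keywords. The function name extractTerms is hypothetical, and the case-insensitive keyword check is an assumption carried over from the i flag used in the question.

```javascript
// Hypothetical helper: all words of the query except AND, OR and NOT.
function extractTerms(query) {
    // [\p{L}\p{N}_]+ : one or more letters, digits or underscores from any
    // script (requires the u flag).
    var words = query.match(/[\p{L}\p{N}_]+/gu) || [];
    // Drop the boolean operators; the whole word must be AND, OR or NOT.
    return words.filter(function (w) {
        return !/^(?:AND|OR|NOT)$/i.test(w);
    });
}

console.log(extractTerms("büert AND NOT 3454jhadf üasdfsdf OR technüology AND (bar OR bas)"));
```

Filtering after matching avoids the need for custom boundaries, because each keyword is compared against a complete word rather than a position inside one.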

String that doesn't contain character group

I wrote regex for finding urls in text:
/(http[^\s]+)/g
But now I need the same thing, except that the match must not contain a certain substring. For instance, I want all URLs that don't contain the word google.
How can I do that?
Here is a way to achieve that:
http:\/\/(?!\S*google)\S+
See demo
JS:
var re = /http:\/\/(?!\S*google)\S+/g;
var str = 'http://ya.ru http://yahoo.com http://google.com';
var m;
while ((m = re.exec(str)) !== null) {
document.getElementById("r").innerHTML += m[0] + "<br/>";
}
<div id="r"/>
Regex breakdown:
http:\/\/ - a literal sequence of http://
(?!\S*google) - a negative look-ahead that performs a forward check from the current position (i.e. right after http://); if it finds 0 or more non-whitespace characters followed by google, the match is cancelled.
\S+ - 1 or more non-whitespace symbols (this is necessary since the lookahead above does not really consume the characters it matches).
Note that if you have any punctuation after the URL, you may add \b right at the end of the pattern:
var re1 = /http:\/\/(?!\S*google)\S+/g;
var re2 = /http:\/\/(?!\S*google)\S+\b/g;
document.write(
JSON.stringify(
'http://ya.ru, http://yahoo.com, http://google.com'.match(re1)
) + "<br/>"
);
document.write(
JSON.stringify(
'http://ya.ru, http://yahoo.com, http://google.com'.match(re2)
)
);
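When the excluded word is not known in advance, the pattern can be built at runtime. The following is a minimal sketch under that assumption; findUrlsExcluding and escapeRegExp are hypothetical helpers written for this example, and the word is escaped so that characters such as . are matched literally.

```javascript
// Escape regex metacharacters so the word is matched literally.
function escapeRegExp(text) {
    return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: all http:// URLs in `text` that do not contain `word`.
function findUrlsExcluding(text, word) {
    var re = new RegExp(
        "http:\\/\\/(?!\\S*" + escapeRegExp(word) + ")\\S+",
        "g"
    );
    // match() returns null when nothing is found; normalise to an array.
    return text.match(re) || [];
}

console.log(findUrlsExcluding("http://ya.ru http://yahoo.com http://google.com", "google"));
```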
