I'm trying to match the following examples (JavaScript):
1.- "dog dogs"
R- match dog = true
2.- "dogsdogs"
R- match dog= false
3.- "cat dog dogs dogdogs dog"
R - match dog(twice) = true
4.- "cat dog$dog"
R- match dog= false
5.- "cat dog\ndog" OR "cat dog\sdog"
R- match dog(twice) = true
6.- "catdog dog $dog$dog dog"
R- match dog(twice) = true
So far I've got /\b(dog)\b/g, but if I use /^(dog)$/g it matches just one word.
Thanks in advance
Try this:
/(^|\s)(dog)(?=\s|$)/gm
Tested via regexr - http://regexr.com?38gla
This matches the start of the string or a whitespace character, then the word dog, then whitespace or the end of the string. The trailing whitespace/end-of-string check is a positive lookahead, so it's not consumed, which leaves that space available for another match, e.g. "cat dog dog".
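To see how the pattern behaves on the six examples from the question, a small check like the following can help. It is only a sketch: `countDogs` and `samples` are hypothetical names introduced here, and the count is taken from the number of global matches.

```javascript
// Hypothetical helper: counts "dog" tokens that are delimited by
// whitespace or by the start/end of the string (or of a line, due to /m).
function countDogs(text) {
  const pattern = /(^|\s)(dog)(?=\s|$)/gm;
  // match() with the g flag returns all full matches, or null if there are none
  return (text.match(pattern) || []).length;
}

// The six inputs from the question, in order
const samples = [
  "dog dogs",
  "dogsdogs",
  "cat dog dogs dogdogs dog",
  "cat dog$dog",
  "cat dog\ndog",
  "catdog dog $dog$dog dog",
];

const counts = samples.map((s) => countDogs(s));
samples.forEach((s, i) => console.log(JSON.stringify(s), "->", counts[i]));
```

A count of 0 corresponds to "false" in the question, and 2 to "twice".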
I have this string: title: one description: two
and want to split it into groups like [title: one, description: two]
options.match(/(title|description):.+?/gi)
this was my attempt, but it only captures up to the : and 1 space after, it does not include the text after it, which I want to include all of, up until the second match.
Split on a lookahead for title or description:
const str = 'title: one description: two';
console.log(
str.split(/ (?=title|description)/)
);
You could also get the matches with a capture group and match the whitespace in between
(\b(?:title|description):.+?)\s*(?=\b(?:title|description):|$)
The pattern matches:
( Capture group 1
\b(?:title|description): Match either title or description, followed by :
.+? Match any character 1+ times, non-greedy (lazy)
) Close group 1
\s* Match optional whitespace chars
(?= Positive lookahead, assert what is at the right is
\b(?:title|description):|$ Match either title: or description: or assert the end of the string for the last item
) Close lookahead
Regex demo
const regex = /(\b(?:title|description):.+?)\s*(?=\b(?:title|description):|$)/gi;
let s = "title: one description: two";
console.log(Array.from(s.matchAll(regex), m => m[1]));
var str = "title: one description: two";
/* split with a positive lookbehind: split on a whitespace char preceded by anything but ":" */
var res=str.split(/(?<=[^:])\s/);
console.log(res);
/* match with a general rule: key, colon, one whitespace char, value */
var res=str.match(/([^:\s]+:\s[^:\s]+)/gi);
console.log(res);
/* match with specific words */
var res=str.match(/((?:title|description):\s[^:\s]+)/gi);
console.log(res);
Take the following text:
This is a sentence. This is a sentence... This is a sentence! This is a sentence? This is a sentence.This is a sentence. This is a sentence
I'd like to match this so I have an array like the following:
[
"This is a sentence.",
" ",
"This is a sentence...",
" ",
"This is a sentence!",
" ",
"This is a sentence?",
" ",
"This is a sentence.",
"",
"This is a sentence.",
" ",
"This is a sentence",
]
With my current regex, however:
str.match(/[^.!?]+[.!?]*(\s*)/g);
I get the following:
[
"This is a sentence. ",
"This is a sentence... ",
"This is a sentence! ",
"This is a sentence? ",
"This is a sentence.",
"This is a sentence. ",
"This is a sentence"
]
How can I achieve this with JS RegExp?
Thanks in advance!
Just add [^\s] at the beginning and change (\s*) to |\s+.
The final regex will be like:
str.match(/[^\s][^.!?]+[.!?]*|\s+/g)
[^\s] requires each sentence match to start with a non-whitespace character, so leading whitespace is not included in it
|\s+ matches each run of whitespace as a separate item
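As a quick check, here is a sketch that runs this regex on the text from the question (`text` and `parts` are names chosen for this example). Note that it does not produce the empty string "" between the two sentences that have no space between them, because the regex never matches an empty span.

```javascript
const text =
  "This is a sentence. This is a sentence... This is a sentence! " +
  "This is a sentence? This is a sentence.This is a sentence. This is a sentence";

// First alternative: a sentence starting with a non-whitespace character,
// followed by any trailing ., ! or ? characters.
// Second alternative: a run of whitespace kept as its own array item.
const parts = text.match(/[^\s][^.!?]+[.!?]*|\s+/g);

console.log(parts);
```

One limitation of this approach: because of `[^\s][^.!?]+`, a sentence needs at least two characters before its punctuation to be matched as a unit.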
Here is a solution that keeps the regex from your question and post-processes the result so that the whitespace ends up as separate array items. Each match is split on its trailing whitespace (whitespace followed by the end of the string, via a positive lookahead for $), empty strings are filtered out, and the nested arrays are flattened again to get the output format you want.
const baseStr = "This is a sentence. This is a sentence... This is a sentence! This is a sentence? This is a sentence.This is a sentence. This is a sentence";
var result = baseStr.match(/[^.!?]+[.!?]*(\s*)/g).map( str => str.split(/(\s*)(?=$)/).filter(_=>_)).flat();
console.log(result);
I'm looking for a JS regex to match full words, but not to match at all if there is any different word (any failure).
Eg: Match for \b(dog|cat)\b
cat dog cat --> everything is matched. OK.
dog --> dog is matched even if cat does not exist here. OK.
dog cata --> dog is matched, cata not. I don't want any match at all.
Is that ^(?:(?=.*\bdog\b)(?=.*\bcat\b).*|cat|dog)$ what you want?
Explanation:
^ : beginning of the string
(?: : start non capture group
(?=.*\bdog\b) : positive lookahead, zero-length assertion, make sure we have dog somewhere in the string
(?=.*\bcat\b) : positive lookahead, zero-length assertion, make sure we have cat somewhere in the string
.* : 0 or more any character
| : OR
cat : cat alone
| : OR
dog : dog alone
) : end group
$ : end of string
var test = [
'dog cat',
'cat dog',
'dog',
'cat',
'dog cata',
'cat fish',
];
console.log(test.map(function (a) {
return a + ' ==> ' + a.match(/^(?:(?=.*\bdog\b)(?=.*\bcat\b).*|cat|dog)$/);
}));
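One caveat with the pattern above: since the first branch ends in `.*`, a string such as "dog cat fish" still matches as long as both words appear somewhere. If the goal is to reject any string containing a word outside the list, a stricter alternative could look like the sketch below. This is a different pattern, not the one from the answer; `onlyAllowed` and `lookaheadVersion` are names made up for the comparison.

```javascript
// The pattern from the answer above, kept for comparison
const lookaheadVersion = /^(?:(?=.*\bdog\b)(?=.*\bcat\b).*|cat|dog)$/;

// Hypothetical stricter pattern: the whole string must be one allowed word,
// optionally followed by more allowed words separated by whitespace.
const onlyAllowed = /^(?:dog|cat)(?:\s+(?:dog|cat))*$/;

const cases = ["cat dog cat", "dog", "dog cata", "cat fish", "dog cat fish"];

const strictResults = cases.map((s) => onlyAllowed.test(s));
const lookaheadResults = cases.map((s) => lookaheadVersion.test(s));

cases.forEach((s, i) => {
  console.log(s, "| strict:", strictResults[i], "| lookahead:", lookaheadResults[i]);
});
```

The two patterns also differ on "cat dog cat": the strict one accepts it and, like the lookahead one, says nothing about which words were found, only whether the whole string is acceptable.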
So, basically, you want to check that every word in your string matches the regex, i.e. that the whole string is made up of words from a list, right? Let's split the string into words and check whether each of them belongs to your list.
// anchored, so a word has to match in full ("cata" does not pass)
var reg = /^(?:dog|cat|rat)$/,
input1 = "dog cat rat",
input2 = "dog cata rat",
input3 = "abcd efgh",
// split on whitespace and require every word to pass the test
isMatched = s => (s.match(/\S+/g) || []).every(e => reg.test(e));
console.log(isMatched(input1)); // true
console.log(isMatched(input2)); // false
console.log(isMatched(input3)); // false
I have a very specific requirement. Consider the sentence "I am a robot X-rrt, I am 35 and my creator is 5-MAF. Everything here is 5 times than my world5 - hurray"
I am interested in a regexp which recognizes "I", "am", "a" , "robot", "X-rrt", ",", "I", "am", "35", "and", "my", "creator", "is", "5-MAF", ".", "Everything", "here", "is", "5", "times", "than", "my", "world5", "-", "hurray"
i.e. 1) it should recognize all punctuation, except "-" when it is part of a word
2) numbers that are part of a word containing letters should not be recognized separately
I am extremely confused by this one. I would appreciate some advice!
Try splitting at each group of whitespaces, and before dots and commas:
str.split(/\s+|(?=[.,])/);
This is not too easy. I suggest some preprocessing of the text before the split, for example:
var text = "I am a robot X-rrt, I am 35 and my creator is 5-MAF. Everything here is 5 times than my world5 - hurray";
var preprocessedText = text.replace(/(\w|^)(\W)( |$)/g, "$1 $2$3");
var tokens = preprocessedText.split(" ");
alert(tokens.join("\n"));
I tested this in Perl. It shouldn't be too hard to translate to JavaScript.
use feature 'say';
my $sentence = 'I am a robot X-rrt, I am 35 and my creator is 5-MAF. Everything here is 5 times than my world5 - hurray';
my @words = split(/\s|(?<!-)\b(?!-)/, $sentence);
say "'" . join ("', '", @words) . "'";
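A possible JavaScript translation of the Perl split could look like this. It is a sketch that assumes an engine with lookbehind support (for example a recent Node.js or a current browser); the names `sentence` and `words` are chosen for this example, and the `filter(Boolean)` call is a defensive addition that is not part of the Perl version.

```javascript
const sentence =
  "I am a robot X-rrt, I am 35 and my creator is 5-MAF. " +
  "Everything here is 5 times than my world5 - hurray";

// Split on a whitespace character, or at a word boundary that has
// no hyphen directly before it and no hyphen directly after it.
const words = sentence
  .split(/\s|(?<!-)\b(?!-)/)
  .filter(Boolean); // drop empty strings, should any appear

console.log(words);
```

Because word boundaries next to a hyphen are excluded, "X-rrt" and "5-MAF" stay in one piece, while the standalone "-" is separated by the surrounding whitespace.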
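Try this match regexp (the dot and comma go in a character class, since an unescaped . would match any character, including spaces):
str.match(/[\w-]+|[.,]/g);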
Here is a solution that meets both your requirements:
/(?:\w|\b-\b)+|[^\w\s]+/g
See the regex demo.
Details:
(?:\w|\b-\b)+ - 1 or more
\w - word char
| - or
\b-\b - a hyphen in between word characters
| - or
[^\w\s]+ - 1 or more characters other than word and whitespace symbols.
See the JS demo below:
var s = "I am a robot X-rrt, I am 35 and my creator is 5-MAF. Everything here is 5 times than my world5 - hurray";
console.log(s.match(/(?:\w|\b-\b)+|[^\w\s]+/g));
Say I have this string:
cat hates dog
When I do a replace:
str = str.replace('cat', 'fish');
I will only get "cat" replaced by "fish". How can I make it work like this:
"cat" replaced by "fish"
"other string"(else) replaced by "goat"
so I will get new string:
fish goat goat
You can use this regexp \b\w+?\b:
"cat hates dog".replace(/\b\w+?\b/g, function(a) {
return a === 'cat' ? 'fish' : 'goat';
});
It matches every word (a sequence of word characters \w surrounded by word boundaries \b) and passes each match to the replace callback.
Output:
fish goat goat
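If more than one word needs its own replacement, the same callback idea can be driven by a lookup table. This is a sketch under that assumption; `replaceWords`, `table` and `fallback` are hypothetical names, not part of the answer above.

```javascript
// Hypothetical helper: replaces each word with its entry in `table`,
// or with `fallback` when the word has no entry.
function replaceWords(text, table, fallback) {
  return text.replace(/\b\w+\b/g, (word) =>
    table.has(word) ? table.get(word) : fallback
  );
}

const table = new Map([["cat", "fish"]]);
const replaced = replaceWords("cat hates dog", table, "goat");

console.log(replaced); // fish goat goat
```

A Map is used rather than a plain object so that words like "constructor" are not accidentally found on the prototype chain.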