Match a word with space beside

I'm trying to match the following examples (javascript):
1.- "dog dogs"
R- match dog = true
2.- "dogsdogs"
R- match dog= false
3.- "cat dog dogs dogdogs dog"
R - match dog(twice) = true
4.- "cat dog$dog"
R- match dog= false
5.- "cat dog\ndog" OR "cat dog\sdog"
R- match dog(twice) = true
6.- "catdog dog $dog$dog dog"
R- match dog(twice) = true
So far I have only got /\b(dog)\b/g, and if I use /^(dog)$/g instead it matches just a single word.
Thanks in advance

Try this:
/(^|\s)(dog)(?=\s|$)/gm
Tested via regexr - http://regexr.com?38gla
This matches the start of the string (or of a line, because of the m flag) or a whitespace character, then the word dog, then whitespace or the end of the string. The trailing whitespace/end of string is a positive lookahead, so it's not consumed, which leaves that space available to start another match, e.g. "cat dog dog".
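A quick way to check the pattern against the six examples from the question is sketched below. The helper name countDog is an assumption introduced only for illustration; it simply counts how many times the pattern matches.

```javascript
// Hypothetical helper (not part of the original answer):
// counts standalone "dog" matches in the given text.
function countDog(text) {
  // A fresh regex literal per call avoids shared lastIndex state from the g flag.
  const pattern = /(^|\s)(dog)(?=\s|$)/gm;
  return (text.match(pattern) || []).length;
}

// The six sample strings from the question.
const samples = [
  "dog dogs",
  "dogsdogs",
  "cat dog dogs dogdogs dog",
  "cat dog$dog",
  "cat dog\ndog",
  "catdog dog $dog$dog dog",
];

for (const s of samples) {
  console.log(JSON.stringify(s), "=>", countDog(s));
}
```

A count of 0 corresponds to "false" in the question, and a count of 2 to "twice".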

Related

Regex match any characters in string, up until next match

I have this string: title: one description: two
and want to split it into groups like [title: one, description: two]
options.match(/(title|description):.+?/gi)
This was my attempt, but it only captures up to the : and the one space after it. It does not include the text that follows, which I want to capture in full, up until the next match.

Split on a lookahead for title or description:
const str = 'title: one description: two';
console.log(
str.split(/ (?=title|description)/)
);

You could also get the matches with a capture group and match the whitespace in between
(\b(?:title|description):.+?)\s*(?=\b(?:title|description):|$)
The pattern matches:
( Capture group 1
\b(?:title|description): Match either title or description, followed by :
.+? Match any character 1 or more times, non-greedy (lazy)
) Close group 1
\s* Match optional whitespace chars
(?= Positive lookahead, assert what is at the right is
\b(?:title|description):|$ Match either title: or description: or assert the end of the string for the last item
) Close lookahead
const regex = /(\b(?:title|description):.+?)\s*(?=\b(?:title|description):|$)/gi;
let s = "title: one description: two";
console.log(Array.from(s.matchAll(regex), m => m[1]));

var str = "title: one description: two";
/* Split on whitespace that is preceded by any character other than ":" (positive lookbehind). */
var res = str.split(/(?<=[^:])\s/);
console.log(res);
/* Match with a general rule: key, colon, whitespace, value. */
res = str.match(/[^:\s]+:\s[^:\s]+/g);
console.log(res);
/* Match with specific words. */
res = str.match(/(?:title|description):\s[^:\s]+/gi);
console.log(res);

Match sentences and whitespace separately

Take the following text:
This is a sentence. This is a sentence... This is a sentence! This is a sentence? This is a sentence.This is a sentence. This is a sentence
I'd like to match this so I have an array like the following:
[
"This is a sentence.",
" ",
"This is a sentence...",
" ",
"This is a sentence!",
" ",
"This is a sentence?",
" ",
"This is a sentence.",
"",
"This is a sentence.",
" ",
"This is a sentence",
]
With my current regex, however:
str.match(/[^.!?]+[.!?]*(\s*)/g);
I get the following:
[
"This is a sentence. ",
"This is a sentence... ",
"This is a sentence! ",
"This is a sentence? ",
"This is a sentence.",
"This is a sentence. ",
"This is a sentence"
]
How can I achieve this with a JS RegExp?
Thanks in advance!

Just add [^\s] at the beginning and change (\s*) to |\s+.
The final regex will be like:
str.match(/[^\s][^.!?]+[.!?]*|\s+/g)
[^\s] makes each sentence match start with a non-whitespace character, so leading whitespace is not absorbed into the sentence
|\s+ matches a run of whitespace as a separate element
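As a rough check, the sketch below runs the pattern on the sample text from the question. Two caveats: it does not produce the empty string between the two sentences that are joined without a space, and because [^\s] is followed by [^.!?]+, a sentence needs at least two characters to match. The function name splitSentences is hypothetical.

```javascript
// Hypothetical wrapper around the pattern from the answer above.
function splitSentences(text) {
  // Either a sentence that starts with a non-whitespace character,
  // or a run of whitespace as its own element.
  return text.match(/[^\s][^.!?]+[.!?]*|\s+/g) || [];
}

const text = "This is a sentence. This is a sentence... This is a sentence! " +
  "This is a sentence? This is a sentence.This is a sentence. This is a sentence";

console.log(splitSentences(text));
```

For this input the result has 12 elements: sentences and single spaces alternate, except where two sentences touch directly.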

Here is a solution that keeps the regex from your question and does some array splitting afterwards to keep the whitespace in the array. Each match is split on the whitespace at its end (whitespace followed by a positive lookahead for $), empty strings are filtered out, and the result is flattened again into a single array.
const baseStr = "This is a sentence. This is a sentence... This is a sentence! This is a sentence? This is a sentence.This is a sentence. This is a sentence";
var result = baseStr.match(/[^.!?]+[.!?]*(\s*)/g).map( str => str.split(/(\s*)(?=$)/).filter(_=>_)).flat();
console.log(result);

Regex to match full words, but no match at all on first failure

I'm looking for a JS regex to match full words, but not to match at all if there is any different word (any failure).
Eg: Match for \b(dog|cat)\b
cat dog cat --> everything is matched. OK.
dog --> dog is matched even if cat does not exist here. OK.
dog cata --> dog is matched, cata not. I don't want any match at all.

Is ^(?:(?=.*\bdog\b)(?=.*\bcat\b).*|cat|dog)$ what you want?
Explanation:
^ : beginning of the string
(?: : start non capture group
(?=.*\bdog\b) : positive lookahead, zero-length assertion, make sure we have dog somewhere in the string
(?=.*\bcat\b) : positive lookahead, zero-length assertion, make sure we have cat somewhere in the string
.* : 0 or more any character
| : OR
cat : cat alone
| : OR
dog : dog alone
) : end group
$ : end of string
var test = [
'dog cat',
'cat dog',
'dog',
'cat',
'dog cata',
'cat fish',
];
console.log(test.map(function (a) {
return a + ' ==> ' + a.match(/^(?:(?=.*\bdog\b)(?=.*\bcat\b).*|cat|dog)$/);
}));
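Note that the pattern above requires both dog and cat to be present whenever the string is more than a single word, so a string such as dog cat fish would still match. If the goal is that every word must come from the list, one alternative (a sketch, not the answerer's pattern) is to anchor a repeated alternation to both ends of the string. The names allWords and matchAllOrNothing are assumptions for illustration.

```javascript
// Alternative sketch: the whole string must consist only of allowed words
// separated by whitespace; otherwise nothing matches at all.
const allWords = /^(?:dog|cat)(?:\s+(?:dog|cat))*$/;

// Hypothetical helper: returns the list of words, or null on any failure.
function matchAllOrNothing(text) {
  return allWords.test(text) ? text.split(/\s+/) : null;
}

console.log(matchAllOrNothing("cat dog cat"));  // every word is allowed
console.log(matchAllOrNothing("dog"));          // a single allowed word
console.log(matchAllOrNothing("dog cata"));     // "cata" is not allowed
console.log(matchAllOrNothing("dog cat fish")); // "fish" is not allowed
```

Because the regex has no g flag, calling test repeatedly carries no lastIndex state between calls.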

So, basically you want to check that every word in your string matches the regex, i.e. that the whole string is made up of words from a given list, right? Let's split the string into words and check whether each of them belongs to your list.
var reg = /^(?:dog|cat|rat)$/,
input1 = "dog cat rat",
input2 = "dog cata rat",
input3 = "abcd efgh",
// Every whitespace-separated word must match the anchored pattern in full.
isMatched = s => (s.match(/\S+/g) || []).every(w => reg.test(w));
console.log(isMatched(input1)); // true
console.log(isMatched(input2)); // false
console.log(isMatched(input3)); // false

javascript regexp to identify different components of a sentence

I have a very specific requirement. Consider the sentence "I am a robot X-rrt, I am 35 and my creator is 5-MAF. Everything here is 5 times than my world5 - hurray"
I am interested in a regexp which recognizes "I", "am", "a" , "robot", "X-rrt", ",", "I", "am", "35", "and", "my", "creator", "is", "5-MAF", ".", "Everything", "here", "is", "5", "times", "than", "my", "world5", "-", "hurray"
i.e. 1) it should recognize all punctuation, except "-" when it is part of a word
2) numbers that are part of a word containing letters should not be recognized separately
I am extremely confused by this one. Would appreciate some advice!

Try splitting at each run of whitespace, and before dots and commas:
str.split(/\s+|(?=[.,])/);
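For reference, here is roughly what that split yields on the sentence from the question. It only separates . and , so other punctuation would stay attached to its word. The helper name tokenize is an assumption for illustration.

```javascript
// Hypothetical helper wrapping the split from the answer above.
function tokenize(sentence) {
  // Split on runs of whitespace, or at the zero-width position
  // just before a "." or ",".
  return sentence.split(/\s+|(?=[.,])/);
}

const sentence = "I am a robot X-rrt, I am 35 and my creator is 5-MAF. " +
  "Everything here is 5 times than my world5 - hurray";

console.log(tokenize(sentence));
```

Hyphenated words such as X-rrt and 5-MAF stay whole because the hyphen is never a split point, while the standalone - becomes its own token thanks to the surrounding spaces.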

This is not that easy. I suggest preprocessing the text before splitting it, for example:
var text = "I am a robot X-rrt, I am 35 and my creator is 5-MAF. Everything here is 5 times than my world5 - hurray";
var preprocessedText = text.replace(/(\w|^)(\W)( |$)/g, "$1 $2$3");
var tokens = preprocessedText.split(" ");
alert(tokens.join("\n"));

I tested this in perl. Shouldn't be too hard to translate to javascript.
my $sentence = 'I am a robot X-rrt, I am 35 and my creator is 5-MAF. Everything here is 5 times than my world5 - hurray';
my @words = split(/\s|(?<!-)\b(?!-)/, $sentence);
say "'" . join ("', '", @words) . "'";

Try this match regexp (note the escaped dot; an unescaped . would also match spaces):
str.match(/[\w\d-]+|\.|,/g);

Here is a solution that meets both your requirements:
/(?:\w|\b-\b)+|[^\w\s]+/g
Details:
(?:\w|\b-\b)+ - 1 or more
\w - word char
| - or
\b-\b - a hyphen in between word characters
| - or
[^\w\s]+ - 1 or more characters other than word and whitespace symbols.
See the JS demo below:
var s = "I am a robot X-rrt, I am 35 and my creator is 5-MAF. Everything here is 5 times than my world5 - hurray";
console.log(s.match(/(?:\w|\b-\b)+|[^\w\s]+/g));

Javascript replace string which doesn't match?

Say I have this string:
cat hates dog
When I do a replace:
str = str.replace('cat', 'fish');
only "cat" gets replaced by "fish". How can I make it work like this:
"cat" is replaced by "fish"
any other word is replaced by "goat"
so that I get this new string:
fish goat goat

You can use this regexp \b\w+?\b:
"cat hates dog".replace(/\b\w+?\b/g, function(a) {
return a === 'cat' ? 'fish' : 'goat';
});
It will match every word (a sequence of word characters \w surrounded by word boundaries \b) and pass each match to the replace callback.
Output:
fish goat goat
