JavaScript - Regular Expression matching a word in a string

I am trying to find a match in a string with JavaScript. I want to work with the RegEx function.
My example (what I have tried):
var str = "hello.you";
var patt1 = '\\b' + str + '\\b';
var result = str.match(patt1);
But this does not give me the result I expect. I just want to print "you".
Thanks all in advance.

So you jumped right into a pretty advanced regex topic. You sort of want to do a lookbehind (match the word that comes AFTER a given boundary character). The following will get you there:
let str = "hello.you",
myRegex = /(?<=\.)\w+/;
let theWord = str.match(myRegex);
console.log(theWord[0]);
... And what that does is use (?<=\.) to indicate "something that comes after a period", followed by \w+ to indicate a word.
I'd recommend using a regex tester, and build from that. I use https://www.regextester.com/
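Lookbehind only arrived in JavaScript with ES2018, so if the code has to run on older engines, a capture group gives the same result. A minimal sketch, where wordAfterDot is a hypothetical helper name and not part of the answer above:

```javascript
// Hypothetical helper: returns the word that follows the first period, or null.
function wordAfterDot(input) {
  // Match a literal period, then capture the word characters after it.
  const match = input.match(/\.(\w+)/);
  return match ? match[1] : null;
}

console.log(wordAfterDot("hello.you")); // "you"
console.log(wordAfterDot("hello"));     // null
```

The period is consumed by the match here, which is why the word is read from group 1 rather than from the whole match.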

Related

How to remove characters from string using regex

I want to remove << any words #_ from the following string.
stringVal = "<<Start words#_ I <<love#_ kind <<man>>, <<john#_ <<kind man>> is really <<great>> <<end words#_ ";
The result must be:
Start words I love kind <<man>>, john <<kind man>> is really <<great>> end words
I tried like this:
stringVal = stringVal.replace(/^.*<<.+\#_.*$/g, "");
But it removes all string.
Note: << any words #_ may exist multiple times in the string: at the start, in the middle, or at the end.
Inferring from your examples, you might be looking for:
stringVal = "<<Start words#_ I <<love#_ kind <<man>>, <<john#_ <<kind man>> is really <<great>> <<end words#_ ";
stringVal = stringVal.replace(/<<([-\w ]+)#_/g, "$1");
console.log(stringVal);
The character class [-\w ]+ allows word characters, spaces, and hyphens between the markers; to allow other characters, add them to the class.
See an additional demo on regex101.com.
Instead of using .+\#_, if you want to match any words, you could match word characters, optionally followed by repeated groups of a space and more word characters.
<<(\w+(?: \w+)*)#_
Regex demo
In the replacement use group 1 $1
Note that you don't have to escape #
const regex = /<<(\w+(?: \w+)*)#_/g;
stringVal = "I <<love#_ kind <<man>>, <<john#_ <<kind man>> is really <<great>>";
const result = stringVal.replace(regex, '$1');
console.log(result);
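If the text between << and #_ can contain more than word characters and spaces, a negated character class is another option. This is only a sketch, and it assumes the marked text never contains <, > or # itself; the names markerRegex, input and cleaned are made up for the illustration:

```javascript
// Assumption: the text between "<<" and "#_" never contains '<', '>' or '#'.
const markerRegex = /<<([^<>#]+)#_/g;

const input = "<<Start words#_ I <<love#_ kind <<man>>, <<john#_ <<kind man>> is really <<great>> <<end words#_ ";

// Keep the captured text, drop the "<<" and "#_" markers around it.
const cleaned = input.replace(markerRegex, "$1").trim();

console.log(cleaned);
// "Start words I love kind <<man>>, john <<kind man>> is really <<great>> end words"
```

Parts such as <<man>> are left alone because the class stops at > before any #_ is found.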

javascript regex to find only numbers with hyphen from a string content

In Javascript, from a string like this, I am trying to extract only the number with a hyphen. i.e. 67-64-1 and 35554-44-04. Sometimes there could be more hyphens.
The solvent 67-64-1 is not compatible with 35554-44-04
I tried different regex but not able to get it correctly. For example, this regex gets only the first value.
var msg = 'The solvent 67-64-1 is not compatible with 35554-44-04';
//var regex = /\d+\-?/;
var regex = /(?:\d*-\d*-\d*)/;
var res = msg.match(regex);
console.log(res);
You just need to add the g (global) flag to your regex to match more than once in the string. Note that you should use \d+, not \d*, so that you don't match something like '3--4'. To allow for processing numbers with more hyphens, we use a repeating -\d+ group after the first \d+:
var msg = 'The solvent 67-64-1 is not compatible with 23-35554-44-04 but is compatible with 1-23';
var regex = /\d+(?:-\d+)+/g;
var res = msg.match(regex);
console.log(res);
It returns only the first match because, without the g flag, the regex stops after the first match it finds.
// the g (global) flag finds all matches
var regex = /(?:\d*-\d*-\d*)/g;
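To see the effect of the g flag side by side, here is a small sketch using the stricter \d+(?:-\d+)+ pattern from the other answer (the variable names firstOnly and allMatches are just for illustration):

```javascript
const message = 'The solvent 67-64-1 is not compatible with 35554-44-04';

// Without g: match() returns details about the first match only.
const firstOnly = message.match(/\d+(?:-\d+)+/);

// With g: match() returns an array of every matching substring.
const allMatches = message.match(/\d+(?:-\d+)+/g);

console.log(firstOnly[0]); // "67-64-1"
console.log(allMatches);   // ["67-64-1", "35554-44-04"]
```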

remove last part of string following '&&&' with JavaScript Regex

I'm trying to use a regex in JS to remove the last part of a string. This substring starts with &&&, is followed by something not &&&, and ends with .pdf.
So, for example, the final regex should take a string like:
parent&&&child&&&grandchild.pdf
and match
parent&&&child
I'm not that great with regex's, so my best effort has been something like:
.*?(?:&&&.*\.pdf)
Which matches the whole string. Can anyone help me out?
You may use this greedy regex either in replace or in match:
var s = 'parent&&&child&&&grandchild.pdf';
// using replace
var r = s.replace(/(.*)&&&.*\.pdf$/, '$1');
console.log(r);
//=> parent&&&child
// using match
var m = s.match(/(.*)&&&.*\.pdf$/)
if (m) {
console.log(m[1]);
//=> parent&&&child
}
By using the greedy pattern .* before &&&, we make sure to match the last instance of &&& in the input.
You want to remove the last portion, so replace it
var str = "parent&&&child&&&grandchild.pdf"
var result = str.replace(/&&&[^&]+\.pdf$/, '')
console.log(result)
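The same cut can also be made without a regex. A rough sketch, assuming the separator is always exactly &&& and the string should only be trimmed when it ends in .pdf; stripLastSegment is a hypothetical helper name:

```javascript
// Hypothetical helper: drops the final "&&&....pdf" part, if there is one.
function stripLastSegment(path) {
  const cut = path.lastIndexOf("&&&");
  // Leave the string untouched when there is no separator or no .pdf suffix.
  if (cut === -1 || !path.endsWith(".pdf")) {
    return path;
  }
  return path.slice(0, cut);
}

console.log(stripLastSegment("parent&&&child&&&grandchild.pdf")); // "parent&&&child"
console.log(stripLastSegment("nothing.pdf"));                     // "nothing.pdf"
```

Unlike the [^&]+ pattern, this version does not care whether the last segment contains a single & character.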

using a lookahead to get the last occurrence of a pattern in javascript

I was able to build a regex to extract a part of a pattern:
var regex = /\w+\[(\w+)_attributes\]\[\d+\]\[own_property\]/g;
var match = regex.exec( "client_profile[foreclosure_defenses_attributes][0][own_property]" );
match[1] // "foreclosure_defenses"
However, I also have a situation where there will be a repetitive pattern like so:
"client_profile[lead_profile_attributes][foreclosure_defenses_attributes][0][own_property]"
In that case, I want to ignore [lead_profile_attributes] and just extract the portion of the last occurrence as I did in the first example. In other words, I still want to match "foreclosure_defenses" in this case.
Since all patterns will be like [(\w+)_attributes], I tried to do a lookahead, but it is not working:
var regex = /\w+\[(\w+)_attributes\](?!\[(\w+)_attributes\])\[\d+\]\[own_property\]/g;
var match = regex.exec("client_profile[lead_profile_attributes][foreclosure_defenses_attributes][0][own_property]");
match // null
match returns null meaning that my regex isn't working as expected. I added the following:
\[(\w+)_attributes\](?!\[(\w+)_attributes\])
Because I want to match only the last occurrence of the following pattern:
[lead_profile_attributes][foreclosure_defenses_attributes]
I just want to grab the foreclosure_defenses, not the lead_profile.
What might I be doing wrong?
I think I got it working without positive lookahead:
regex = /(\[(\w+)_attributes\])+/
/(\[(\w+)_attributes\])+/
match = regex.exec(str);
["[a_attributes][b_attributes][c_attributes]", "[c_attributes]", "c"]
I was able to also achieve it through noncapturing groups. Output from chrome console:
var regex = /(?:\w+(\[\w+\]\[\d+\])+)(\[\w+\])/;
undefined
regex
/(?:\w+(\[\w+\]\[\d+\])+)(\[\w+\])/
str = "profile[foreclosure_defenses_attributes][0][properties_attributes][0][other_stuff]";
"profile[foreclosure_defenses_attributes][0][properties_attributes][0][other_stuff]"
match = regex.exec(str);
["profile[foreclosure_defenses_attributes][0][properties_attributes][0][other_stuff]", "[properties_attributes][0]", "[other_stuff]"]
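One likely reason the attempt in the question returns null is the leading \w+: it has to sit directly before the captured bracket group, and in the nested string the only word followed by [ is client_profile. A sketch that simply drops that prefix, so the capture is tied to the [\d+][own_property] tail instead (lastAttributes, flat and nested are illustrative names):

```javascript
// Capture the "<name>_attributes" group that sits directly before [digits][own_property].
const lastAttributes = /\[(\w+)_attributes\]\[\d+\]\[own_property\]/;

const flat = "client_profile[foreclosure_defenses_attributes][0][own_property]";
const nested = "client_profile[lead_profile_attributes][foreclosure_defenses_attributes][0][own_property]";

console.log(lastAttributes.exec(flat)[1]);   // "foreclosure_defenses"
console.log(lastAttributes.exec(nested)[1]); // "foreclosure_defenses"
```

The regex has no g flag, so exec carries no lastIndex state between the two calls.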

Need a regex that finds "string" but not "[string]"

I'm trying to build a regular expression that parses a string and skips things in brackets.
Something like
string = "A bc defg hi [hi] jkl mnop.";
The .match() should return "hi" but not [hi]. I've spent 5 hours running through RE's but I'm throwing in the towel.
Also this is for javascript or jquery if that matters.
Any help is appreciated. Also I'm working on getting my questions formatted correctly : )
EDIT:
Ok I just had a eureka moment and figured out that the original RegExp I was using actually did work. But when I was replacing the matches with the [matches], it simply replaced the first match in the string... over and over. I thought this was my regex refusing to skip the brackets, but after much time trying almost all of the solutions below, I realized that I was derping Hardcore.
When .replace was working its magic it was on the first match, so I quite simply added a space to the end of the result word as follows:
var result = string.match(regex);
var modifiedResult = '[' + result[0].toString() + ']';
string.replace(result[0].toString() + ' ', modifiedResult + ' ');
This got it to stop targeting the original word in the string and stop adding a new set of brackets to it with every match. Thank you all for your help. I am going to give answer credit to the post that prodded me in the right direction.
preprocess the target string by removing everything between brackets before trying to match your RE
string = "A bc defg hi [hi] jkl mnop."
tmpstring = string.replace(/\[.*\]/, "")
then apply your RE to tmpstring
correction: made the match for brackets lazy (non-greedy) per nhahtd comment below, and also made the RE global
string = "A bc defg hi [hi] jkl mnop."
tmpstring = string.replace(/\[.*?\]/g, "")
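Put together, the two steps might look like the sketch below. The word being searched for (hi) and the variable names are assumptions taken from the question's example:

```javascript
const text = "A bc defg hi [hi] jkl mnop.";

// Step 1: remove every bracketed section (lazy match, global flag).
const withoutBrackets = text.replace(/\[.*?\]/g, "");

// Step 2: run the real pattern against the cleaned-up copy.
const hits = withoutBrackets.match(/\bhi\b/g);

console.log(withoutBrackets); // "A bc defg hi  jkl mnop."
console.log(hits);            // ["hi"]
```

Note that match positions refer to the cleaned copy, not to the original string.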
You don't necessarily need regex for this. Simply use string manipulation:
var arr = string.split("[");
var final = arr[0] + arr[1].split("]")[1];
If there are multiple bracketed expressions, use a loop:
while (string.indexOf("[") != -1){
var arr = string.split("[");
string = arr[0] + arr.slice(1).join("[").split("]").slice(1).join("]");
}
Using only Regular Expressions, you can use:
hi(?!])
as an example.
Look here about negative lookahead: http://www.regular-expressions.info/lookaround.html
Unfortunately, older JavaScript engines do not support negative lookbehind; it was only added to the language in ES2018.
I used http://regexpal.com/ to test, abcd[hi]jkhilmnop as test data, hi(?!]) as the regex to find. It matched 'hi' without matching '[hi]'. Basically it matched the 'hi' so long as there was not a following ']' character.
This of course, can be expanded if needed. This has a benefit of not requiring any pre-processing for the string.
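On engines that implement ES2018 lookbehind, the check can be made on both sides of the word. This is a sketch under that assumption, using the question's sample string; outsideBrackets and sample are illustrative names:

```javascript
// Requires an engine with lookbehind support (ES2018 or later).
// Matches the whole word "hi" only when it is not wrapped as "[hi]".
const outsideBrackets = /(?<!\[)\bhi\b(?!\])/g;

const sample = "A bc defg hi [hi] jkl mnop.";

console.log(sample.match(outsideBrackets)); // ["hi"]
```

The \b anchors are an addition here so that words merely containing "hi" are not matched.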
r"\[(.*)\]"
Just play around with this if you want to use regular expressions.
What do you want to do with it? If you want to selectively replace parts like "hi" except when it's "[hi]", then I often use a system where I match what I want to avoid first and then what I want to match; if it matches what I want to avoid then I return the match unchanged, otherwise I return the processed match.
Like this:
return string.replace(/(\[\w+\])|(\w+)/g, function(all, m1, m2) {return m1 || m2.toUpperCase()});
which, with the given string, returns:
"A BC DEFG HI [hi] JKL MNOP."
Thus: it replaces every word with uppercase (m1 is empty), except if the word is between square brackets (m1 is not empty).
This builds an array of all the strings contained in [ ]:
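var regex = /\[([^\]]*)\]/g; // the g flag is required, otherwise exec() keeps returning the first match and the loop never ends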
var string = "A bc defg hi [hi] [jkl] mnop.";
var results=[], result;
while(result = regex.exec(string))
results.push(result[1]);
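On newer engines the same list can be built with String.prototype.matchAll (ES2020), which avoids the manual exec loop. A short sketch with illustrative variable names:

```javascript
const source = "A bc defg hi [hi] [jkl] mnop.";

// matchAll requires the g flag; each result exposes the capture group at index 1.
const inside = Array.from(source.matchAll(/\[([^\]]*)\]/g), (m) => m[1]);

console.log(inside); // ["hi", "jkl"]
```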
edit
To answer the question, this regex returns the string with everything inside [ ] removed, and trims the surrounding whitespace:
"A bc defg [hi] mnop [jkl].".replace(/(\s{0,1})\[[^\]]*\](\s{0,1})/g,'$1')
Instead of skipping the match you can probably try something different - match everything but do not capture the string within square brackets (inclusive) with something like this:
var s = "A [bc] [defg] hi [hi] j[kl]u m[no]p."; // sample input
var r = /(?:\[.*?[^\[\]]\])|(.)/g;
var result;
var str = [];
while((result = r.exec(s)) !== null){
if(result[1] !== undefined){ // true when a character outside [string] was captured
str.push(result[1]);
}
}
console.log(str.join(''));
The last line will print parts of the string which do not match the [string] pattern. For example, when called with the input "A [bc] [defg] hi [hi] j[kl]u m[no]p." the code prints "A hi ju mp." with whitespaces intact.
You can try different things with this code e.g. replacing etc.
