remove last part of string following '&&&' with JavaScript Regex - javascript

I'm trying to use a regex in JS to remove the last part of a string. This substring starts with &&&, is followed by something not &&&, and ends with .pdf.
So, for example, the final regex should take a string like:
parent&&&child&&&grandchild.pdf
and match
parent&&&child
I'm not that great with regexes, so my best effort has been something like:
.*?(?:&&&.*\.pdf)
Which matches the whole string. Can anyone help me out?

You may use this greedy regex either in replace or in match:
var s = 'parent&&&child&&&grandchild.pdf';
// using replace
var r = s.replace(/(.*)&&&.*\.pdf$/, '$1');
console.log(r);
//=> parent&&&child
// using match
var m = s.match(/(.*)&&&.*\.pdf$/)
if (m) {
console.log(m[1]);
//=> parent&&&child
}
By using the greedy pattern .* before &&&, we make sure to match the last instance of &&& in the input.
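To see the greedy behaviour on a few inputs, here is a minimal sketch that wraps the same pattern in a helper. The function name stripLastSegment and the extra sample strings are illustrative assumptions, not part of the answer above.

```javascript
// Hypothetical helper (the name is an assumption made for this example).
function stripLastSegment(s) {
  // The greedy (.*) first grabs the whole string, then backtracks only as far
  // as the LAST "&&&" that is still followed by text ending in ".pdf".
  return s.replace(/(.*)&&&.*\.pdf$/, '$1');
}

console.log(stripLastSegment('parent&&&child&&&grandchild.pdf')); // parent&&&child
console.log(stripLastSegment('parent&&&child.pdf'));              // parent
console.log(stripLastSegment('report.pdf'));                      // report.pdf (no "&&&", so unchanged)
```

When the pattern does not match, replace returns the input unchanged, which is why the last call leaves the string as it was.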

You want to remove the last portion, so replace it with an empty string:
var str = "parent&&&child&&&grandchild.pdf"
var result = str.replace(/&&&[^&]+\.pdf$/, '')
console.log(result)
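One caveat worth checking: [^&]+ rejects any ampersand in the final segment, so a file name containing a single & is left untouched. The sketch below compares that pattern with a lookahead-based variant that only forbids the full &&& delimiter. The variant, the helper name stripWith, and the sample "R&D.pdf" are assumptions added for illustration.

```javascript
// Pattern from the answer above: the last segment may not contain any "&".
const negatedClass = /&&&[^&]+\.pdf$/;
// Assumed variant: each character of the last segment may be anything,
// as long as a full "&&&" delimiter does not start at that position.
const tempered = /&&&(?:(?!&&&).)*\.pdf$/;

// Hypothetical helper that removes whatever the given pattern matches.
function stripWith(re, s) {
  return s.replace(re, '');
}

console.log(stripWith(negatedClass, 'parent&&&child&&&grandchild.pdf')); // parent&&&child
console.log(stripWith(tempered, 'parent&&&child&&&grandchild.pdf'));     // parent&&&child
console.log(stripWith(negatedClass, 'parent&&&R&D.pdf'));                // parent&&&R&D.pdf (no match)
console.log(stripWith(tempered, 'parent&&&R&D.pdf'));                    // parent
```

If file names never contain an ampersand, the shorter negated-class pattern is enough.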

Related

find all letters without repeating regex

Please help me find all the letters in a string, without repeats, using a regex in JS.
Examples:
let str = "abczacg";
str = str.match(/ pattern /); // return has to be: abczg
str = "aabbccdd" //return:abcd.
str = "hello world"//return: helo wrd
Is it possible?
Thank you!
Here is one approach. We can first reverse the input string. Then, do a global regex replacement on the following pattern:
(\w)(?=.*\1)
This will strip off any character for which we can find the same character later in the string. But, as we will be running this replacement on the reversed string, this has the actual effect of removing all duplicate letters other than their first occurrence. Finally, we reverse the remaining string again to arrive at the expected output.
var input = "abczacg";
var output = input.split("").reverse().join("").replace(/(\w)(?=.*\1)/g, "");
output = output.split("").reverse().join("");
console.log(output);
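The reverse, strip, reverse idea can be packaged as a function. Note that \w only covers letters, digits, and underscore, so repeated punctuation would survive; the sketch below swaps in [\s\S] so that any character is deduplicated. That swap and the name dedupeWithRegex are assumptions for this example, not part of the answer.

```javascript
// Hypothetical helper: keep only the first occurrence of every character.
function dedupeWithRegex(input) {
  const reverse = (s) => s.split('').reverse().join('');
  // On the reversed string, drop each character that appears again later.
  // [\s\S] matches any character, including newlines (an assumed change from \w).
  const stripped = reverse(input).replace(/([\s\S])(?=[\s\S]*\1)/g, '');
  // Reverse back to restore the original order.
  return reverse(stripped);
}

console.log(dedupeWithRegex('abczacg'));     // abczg
console.log(dedupeWithRegex('aabbccdd'));    // abcd
console.log(dedupeWithRegex('hello world')); // helo wrd
console.log(dedupeWithRegex('a--b--'));      // a-b
```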
An alternative that uses a Set instead of a regex:
[
"abczacg",
"aabbccdd",
"hello world"
].forEach(s => {
console.log([...new Set(s.split(''))].join(''))
})
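A slightly shorter sketch of the same Set idea: a string is iterable, so it can be passed to the Set constructor directly, and a Set keeps its elements in insertion order. The helper name uniqueChars is an assumption for this example.

```javascript
// Hypothetical helper: a Set remembers insertion order, so spreading it back
// out yields each character once, in order of first appearance.
const uniqueChars = (s) => [...new Set(s)].join('');

console.log(uniqueChars('abczacg'));     // abczg
console.log(uniqueChars('aabbccdd'));    // abcd
console.log(uniqueChars('hello world')); // helo wrd
```

String iteration walks code points rather than UTF-16 code units, so characters outside the Basic Multilingual Plane (such as most emoji) stay intact, which split('') does not guarantee.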

JavaScript regex to get string between two characters except escaped without lookbehind

I am looking for a specific javascript regex without the new lookahead/lookbehind features of Javascript 2018 that allows me to select text between two asterisk signs but ignores escaped characters.
In the following example only the text "test" and the included escaped characters are supposed to be selected according to the rules above:
\*jdjdjdfdf*test*dfsdf\*adfasdasdasd*test**test\**sd* (Selected: "test", "test", "test\*")
During my research I found this solution: Regex, everything between two characters except escaped characters /(?<!\\)(%.*?(?<!\\)%)/. It uses negative lookbehinds, which are supported in JavaScript 2018, but I need to support IE11 as well, so this solution doesn't work for me.
Then I found another approach that almost gets me there: Javascript: negative lookbehind equivalent?. I altered the answer of Kamil Szot to fit my needs: ((?!([\\])).|^)(\*.*?((?!([\\])).|^)\*) Unfortunately, it doesn't work when two asterisks ** are in a row.
I have already invested a lot of hours and can't seem to get it right, any help is appreciated!
An example of what I have so far is here: https://www.regexpal.com/?fam=117350
I need to use the regexp in a string.replace call (str.replace(regexp|substr, newSubStr|function)) so that I can wrap the found strings with a span element of a specific class.
You can use this regular expression:
(?:\\.|[^*])*\*((?:\\.|[^*])*)\*
Your code should then only take the (only) capture group of each match.
Like this:
var str = "\\*jdjdjdfdf*test*dfsdf\\*adfasdasdasd*test**test\\**sd*";
var regex = /(?:\\.|[^*])*\*((?:\\.|[^*])*)\*/g
var match;
while (match = regex.exec(str)) {
console.log(match[1]);
}
If you need to replace the matches, for instance to wrap the matches in a span tag while also dropping the asterisks, then use two capture groups:
var str = "\\*jdjdjdfdf*test*dfsdf\\*adfasdasdasd*test**test\\**sd*";
var regex = /((?:\\.|[^*])*)\*((?:\\.|[^*])*)\*/g
var result = str.replace(regex, "$1<span>$2</span>");
console.log(result);
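Since the question asks for a span with a specific class, the same two-group pattern can also be used with a replacer function. The following is only a sketch: the function name wrapStarred and the class name "highlight" are assumptions, and the class name is assumed to be a trusted value that needs no HTML escaping.

```javascript
// Hypothetical helper: wrap each *...* section in a span with the given class.
function wrapStarred(text, className) {
  // Same two-group pattern as in the answer above.
  const regex = /((?:\\.|[^*])*)\*((?:\\.|[^*])*)\*/g;
  // "before" is the text ahead of the opening asterisk, "inner" the text
  // between the two unescaped asterisks.
  return text.replace(regex, (match, before, inner) =>
    before + '<span class="' + className + '">' + inner + '</span>');
}

// String.raw keeps the backslashes exactly as typed.
const sample = String.raw`\*jdjdjdfdf*test*dfsdf\*adfasdasdasd*test**test\**sd*`;
console.log(wrapStarred(sample, 'highlight'));
// \*jdjdjdfdf<span class="highlight">test</span>dfsdf\*adfasdasdasd<span class="highlight">test</span><span class="highlight">test\*</span>sd*
```

The trailing "sd*" is left as is because its asterisk has no closing partner.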
One thing to be careful with: when you write string literals in JavaScript tests, escape each backslash with another backslash. If you don't, the in-memory string will not contain a backslash at all.
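The escaping point can be checked directly by comparing string lengths. This is a small sketch with made-up sample strings; String.raw is shown as one alternative way to keep backslashes as typed.

```javascript
// "\*" is not a recognised escape sequence, so the backslash is silently dropped.
const swallowed = '\*test*';
// Doubling the backslash keeps one real backslash in the string.
const escaped = '\\*test*';
// String.raw leaves backslashes untouched in a template literal.
const raw = String.raw`\*test*`;

console.log(swallowed.length);        // 6
console.log(escaped.length);          // 7
console.log(swallowed === '*test*');  // true
console.log(escaped === raw);         // true
```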
const testStr = `\\*jdjdjdfdf*test*dfsdf\\*adfasdasdasd*test**test\\**sd*`;
const m = testStr.match(/\*(\\.)*t(\\.)*e(\\.)*s(\\.)*t(\\.)*\*/g).map(m => m.substr(1, m.length-2));
console.log(m);
More generic code:
const prepareRegExp = (word, delimiter = '\\*') => {
const escaped = '(\\\\.)*';
return new RegExp([
delimiter,
escaped,
[...word].join(escaped),
escaped,
delimiter
].join``, 'g');
};
const testStr = `\\*jdjdjdfdf*test*dfsdf\\*adfasdasdasd*test**test\\**sd*`;
const m = testStr
.match(prepareRegExp('test'))
.map(m => m.substr(1, m.length-2));
console.log(m);
https://instacode.dev/#Y29uc3QgcHJlcGFyZVJlZ0V4cCA9ICh3b3JkLCBkZWxpbWl0ZXIgPSAnXFwqJykgPT4gewogIGNvbnN0IGVzY2FwZWQgPSAnKFxcXFwuKSonOwogIHJldHVybiBuZXcgUmVnRXhwKFsKICAgIGRlbGltaXRlciwKICAgIGVzY2FwZWQsCiAgICBbLi4ud29yZF0uam9pbihlc2NhcGVkKSwKICAgIGVzY2FwZWQsCiAgICBkZWxpbWl0ZXIKICBdLmpvaW5gYCwgJ2cnKTsKfTsKCmNvbnN0IHRlc3RTdHIgPSBgXFwqamRqZGpkZmRmKnRlc3QqZGZzZGZcXCphZGZhc2Rhc2Rhc2QqdGVzdCoqdGVzdFxcKipzZCpgOwpjb25zdCBtID0gdGVzdFN0cgoJLm1hdGNoKHByZXBhcmVSZWdFeHAoJ3Rlc3QnKSkKCS5tYXAobSA9PiBtLnN1YnN0cigxLCBtLmxlbmd0aC0yKSk7Cgpjb25zb2xlLmxvZyhtKTs=

javascript regex to find only numbers with hyphen from a string content

In JavaScript, from a string like this, I am trying to extract only the numbers that contain hyphens, i.e. 67-64-1 and 35554-44-04. Sometimes there could be more hyphens.
The solvent 67-64-1 is not compatible with 35554-44-04
I tried different regexes but was not able to get it right. For example, this regex gets only the first value.
var msg = 'The solvent 67-64-1 is not compatible with 35554-44-04';
//var regex = /\d+\-?/;
var regex = /(?:\d*-\d*-\d*)/;
var res = msg.match(regex);
console.log(res);
You just need to add the g (global) flag to your regex to match more than once in the string. Note that you should use \d+, not \d*, so that you don't match something like '3--4'. To allow for processing numbers with more hyphens, we use a repeating -\d+ group after the first \d+:
var msg = 'The solvent 67-64-1 is not compatible with 23-35554-44-04 but is compatible with 1-23';
var regex = /\d+(?:-\d+)+/g;
var res = msg.match(regex);
console.log(res);
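One detail to keep in mind: String.prototype.match returns null, not an empty array, when a global regex finds nothing. A minimal sketch that normalises this follows; the helper name extractHyphenated is an assumption for this example.

```javascript
// Hypothetical helper: return every hyphen-separated number, or [] if none.
function extractHyphenated(msg) {
  // \d+ followed by one or more "-digits" groups, matched globally.
  // match() yields null when nothing matches, so fall back to [].
  return msg.match(/\d+(?:-\d+)+/g) || [];
}

console.log(extractHyphenated('The solvent 67-64-1 is not compatible with 35554-44-04'));
// [ '67-64-1', '35554-44-04' ]
console.log(extractHyphenated('3--4 and 2021 contain no hyphenated numbers'));
// []
```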
It returns only the first match because, without the g flag, the regex stops after the first match.
// the g (global) flag finds all matches
var regex = /(?:\d*-\d*-\d*)/g;

RegExp to filter characters after the last dot

For example, I have a string "esolri.gbn43sh.earbnf", and I want to remove every character after the last dot(i.e. "esolri.gbn43sh"). How can I do so with regular expression?
I could of course do it without a RegExp, for example:
"esolri.gbn43sh.earbnf".slice(0, "esolri.gbn43sh.earbnf".lastIndexOf("."));
But I want a regular expression.
I tried /\..*?/, but that removes the first dot instead.
I am using Javascript. Any help is much appreciated.
I would use standard JS rather than a regex for this one, as it will be easier for others to understand your code:
var str = 'esolri.gbn43sh.earbnf'
console.log(
str.slice(0, str.lastIndexOf('.')) // "esolri.gbn43sh"
)
Pattern Matching
Match a dot followed by non-dots until the end of string
let re = /\.[^.]*$/;
Use this with String.prototype.replace to achieve the desired output
'foo.bar.baz'.replace(re, ''); // 'foo.bar'
Other choices
You may find it is more efficient to do a simple substring search for the last . and then use a string slicing method on this index.
let str = 'foo.bar.baz',
i = str.lastIndexOf('.');
if (i !== -1) // i = -1 means no match
str = str.slice(0, i); // "foo.bar"
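For comparison, here is a sketch of both approaches as small functions, including the case where the string has no dot at all. The function names are assumptions for this example.

```javascript
// Regex version: remove the last dot and everything after it.
function removeAfterLastDotRegex(s) {
  return s.replace(/\.[^.]*$/, '');
}

// Index version: same result, guarding against strings without a dot.
function removeAfterLastDotIndex(s) {
  const i = s.lastIndexOf('.');
  return i === -1 ? s : s.slice(0, i);
}

console.log(removeAfterLastDotRegex('esolri.gbn43sh.earbnf')); // esolri.gbn43sh
console.log(removeAfterLastDotIndex('esolri.gbn43sh.earbnf')); // esolri.gbn43sh
console.log(removeAfterLastDotRegex('nodots'));                // nodots
console.log(removeAfterLastDotIndex('nodots'));                // nodots
```

The guard matters for the index version: without it, slice(0, -1) would drop the last character of a string that has no dot.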

using a lookahead to get the last occurrence of a pattern in javascript

I was able to build a regex to extract a part of a pattern:
var regex = /\w+\[(\w+)_attributes\]\[\d+\]\[own_property\]/g;
var match = regex.exec( "client_profile[foreclosure_defenses_attributes][0][own_property]" );
match[1] // "foreclosure_defenses"
However, I also have a situation where there will be a repetitive pattern like so:
"client_profile[lead_profile_attributes][foreclosure_defenses_attributes][0][own_property]"
In that case, I want to ignore [lead_profile_attributes] and just extract the portion of the last occurrence, as I did in the first example. In other words, I still want to match "foreclosure_defenses" in this case.
Since all patterns will be like [(\w+)_attributes], I tried to do a lookahead, but it is not working:
var regex = /\w+\[(\w+)_attributes\](?!\[(\w+)_attributes\])\[\d+\]\[own_property\]/g;
var match = regex.exec("client_profile[lead_profile_attributes][foreclosure_defenses_attributes][0][own_property]");
match // null
match returns null meaning that my regex isn't working as expected. I added the following:
\[(\w+)_attributes\](?!\[(\w+)_attributes\])
Because I want to match only the last occurrence of the following pattern:
[lead_profile_attributes][foreclosure_defenses_attributes]
I just want to grab the foreclosure_defenses, not the lead_profile.
What might I be doing wrong?
I think I got it working without positive lookahead:
regex = /(\[(\w+)_attributes\])+/
match = regex.exec(str);
// => ["[a_attributes][b_attributes][c_attributes]", "[c_attributes]", "c"]
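This works because a capture group inside a repeated group keeps only the value from its last iteration. Below is a minimal sketch applying that idea to the two strings from the question; the helper name lastAttributesName and the use of a non-capturing outer group are assumptions for this example.

```javascript
// Hypothetical helper: return the name inside the LAST "[..._attributes]"
// of the first consecutive run of such blocks, or null if there is none.
function lastAttributesName(str) {
  // The outer group repeats; the inner (\w+) is overwritten on every
  // iteration, so after matching it holds the value from the final block.
  const match = /(?:\[(\w+)_attributes\])+/.exec(str);
  return match ? match[1] : null;
}

console.log(lastAttributesName(
  'client_profile[foreclosure_defenses_attributes][0][own_property]'
)); // foreclosure_defenses
console.log(lastAttributesName(
  'client_profile[lead_profile_attributes][foreclosure_defenses_attributes][0][own_property]'
)); // foreclosure_defenses
console.log(lastAttributesName('client_profile[0][own_property]')); // null
```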
I was also able to achieve it with non-capturing groups. From a Chrome console session:
var regex = /(?:\w+(\[\w+\]\[\d+\])+)(\[\w+\])/;
str = "profile[foreclosure_defenses_attributes][0][properties_attributes][0][other_stuff]";
match = regex.exec(str);
// => ["profile[foreclosure_defenses_attributes][0][properties_attributes][0][other_stuff]", "[properties_attributes][0]", "[other_stuff]"]
