regular expression to remove comment javascript - javascript

I am using the regular expression below to remove comments from a string:
<\!{1}\-{2}(.*?)\-{2}\s*>
This works fine except for multi-line strings:
var search = '<\!{1}\-{2}(.*?)\-{2}\s*>';
var re = new RegExp(search, "gm");
var subject = <multi-line string>;
result = subject.replace(re, '');
What should I do to get it working with multi-line strings?

The dot (.) does not match line breaks, and the m flag does not change that (it only affects ^ and $).
This one should work:
^(<\!\-{2})((.|\s)*?)\-{2}>$
Fix:
<!--[\S\s]*?-->
I removed the \s at the beginning and the end of the expression and added it in the middle so multiline-comments are allowed.
But you should have a look at BartK's comment ;)
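A minimal sketch of the fix above, assuming the comments are plain HTML-style <!-- ... --> blocks. The helper name stripComments and the sample string are made up for illustration. The key point is that [\s\S] matches any character, including line breaks, which . does not.

```javascript
// Hypothetical helper: removes HTML-style comments, including multi-line ones.
function stripComments(text) {
  // [\s\S] matches any character, including \n;
  // *? keeps each match as short as possible so separate comments stay separate.
  return text.replace(/<!--[\s\S]*?-->/g, '');
}

const subject = 'before<!-- one\ntwo\nthree -->after<!-- inline -->end';
console.log(stripComments(subject)); // beforeafterend
```

In environments that support ES2018, the s (dotAll) flag should give the same result with /<!--.*?-->/gs.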

Related

JavaScript regex to get string between two characters except escaped without lookbehind

I am looking for a JavaScript regex that does not rely on the lookbehind feature introduced in ECMAScript 2018 and that lets me select text between two asterisks while ignoring escaped characters.
In the following example only the text "test" and the included escaped characters are supposed to be selected according to the rules above:
\*jdjdjdfdf*test*dfsdf\*adfasdasdasd*test**test\**sd* (Selected: "test", "test", "test\*")
During my research I found this solution, "Regex, everything between two characters except escaped characters": /(?<!\\)(%.*?(?<!\\)%)/. It uses negative lookbehinds, which are supported in ECMAScript 2018, but I need to support IE11 as well, so this solution doesn't work for me.
Then I found another approach that almost gets there: "Javascript: negative lookbehind equivalent?". I altered Kamil Szot's answer to fit my needs: ((?!([\\])).|^)(\*.*?((?!([\\])).|^)\*) Unfortunately, it doesn't work when two asterisks (**) are in a row.
I have already invested a lot of hours and can't seem to get it right, any help is appreciated!
An example with what I have so far is here: https://www.regexpal.com/?fam=117350
I need to use the regexp in a string.replace call (str.replace(regexp|substr, newSubStr|function);) so that I can wrap the found strings in a span element with a specific class.
You can use this regular expression:
(?:\\.|[^*])*\*((?:\\.|[^*])*)\*
Your code should then only take the (only) capture group of each match.
Like this:
var str = "\\*jdjdjdfdf*test*dfsdf\\*adfasdasdasd*test**test\\**sd*";
var regex = /(?:\\.|[^*])*\*((?:\\.|[^*])*)\*/g
var match;
while (match = regex.exec(str)) {
  console.log(match[1]);
}
If you need to replace the matches, for instance to wrap the matches in a span tag while also dropping the asterisks, then use two capture groups:
var str = "\\*jdjdjdfdf*test*dfsdf\\*adfasdasdasd*test**test\\**sd*";
var regex = /((?:\\.|[^*])*)\*((?:\\.|[^*])*)\*/g
var result = str.replace(regex, "$1<span>$2</span>");
console.log(result);
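Since the question asks for a span with a specific class, the same two-group regex can also be combined with a replacer function. This is only a sketch: the helper name wrapStarred and the class name "highlight" are assumptions, not part of the original answer.

```javascript
// Hypothetical helper: wraps text between unescaped asterisks in a span.
function wrapStarred(text, className) {
  // Group 1: text before the opening asterisk (escaped characters allowed).
  // Group 2: text between the two unescaped asterisks.
  const regex = /((?:\\.|[^*])*)\*((?:\\.|[^*])*)\*/g;
  return text.replace(regex, (whole, before, inner) =>
    before + '<span class="' + className + '">' + inner + '</span>');
}

console.log(wrapStarred('a*b*c', 'highlight'));
// a<span class="highlight">b</span>c
```

A function replacement avoids surprises when the matched text itself contains $ sequences, which a replacement string would interpret specially.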
One thing to be careful with: when you write test strings as JavaScript string literals, escape each backslash with another backslash. Otherwise the backslash is consumed as an escape character and the in-memory string will not contain it.
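The effect of escaping can be checked directly. The variable names below are illustrative only; String.raw is shown as one alternative way to keep backslashes without doubling them.

```javascript
// "\*" in a normal string literal is just "*": the backslash is dropped.
const unescaped = "\*test\*";
// "\\*" keeps one real backslash in the in-memory string.
const escaped = "\\*test\\*";
// String.raw leaves backslashes untouched.
const raw = String.raw`\*test\*`;

console.log(unescaped.length); // 6
console.log(escaped.length);   // 8
console.log(raw === escaped);  // true
```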
const testStr = `\\*jdjdjdfdf*test*dfsdf\\*adfasdasdasd*test**test\\**sd*`;
const m = testStr.match(/\*(\\.)*t(\\.)*e(\\.)*s(\\.)*t(\\.)*\*/g).map(m => m.substr(1, m.length-2));
console.log(m);
More generic code:
const prepareRegExp = (word, delimiter = '\\*') => {
  const escaped = '(\\\\.)*';
  return new RegExp([
    delimiter,
    escaped,
    [...word].join(escaped),
    escaped,
    delimiter
  ].join``, 'g');
};
const testStr = `\\*jdjdjdfdf*test*dfsdf\\*adfasdasdasd*test**test\\**sd*`;
const m = testStr
  .match(prepareRegExp('test'))
  .map(m => m.substr(1, m.length-2));
console.log(m);
https://instacode.dev/#Y29uc3QgcHJlcGFyZVJlZ0V4cCA9ICh3b3JkLCBkZWxpbWl0ZXIgPSAnXFwqJykgPT4gewogIGNvbnN0IGVzY2FwZWQgPSAnKFxcXFwuKSonOwogIHJldHVybiBuZXcgUmVnRXhwKFsKICAgIGRlbGltaXRlciwKICAgIGVzY2FwZWQsCiAgICBbLi4ud29yZF0uam9pbihlc2NhcGVkKSwKICAgIGVzY2FwZWQsCiAgICBkZWxpbWl0ZXIKICBdLmpvaW5gYCwgJ2cnKTsKfTsKCmNvbnN0IHRlc3RTdHIgPSBgXFwqamRqZGpkZmRmKnRlc3QqZGZzZGZcXCphZGZhc2Rhc2Rhc2QqdGVzdCoqdGVzdFxcKipzZCpgOwpjb25zdCBtID0gdGVzdFN0cgoJLm1hdGNoKHByZXBhcmVSZWdFeHAoJ3Rlc3QnKSkKCS5tYXAobSA9PiBtLnN1YnN0cigxLCBtLmxlbmd0aC0yKSk7Cgpjb25zb2xlLmxvZyhtKTs=

Using search and replace with regex in javascript

I have a regular expression that I have been using in notepad++ for search&replace to manipulate some text, and I want to incorporate it into my javascript code. This is the regular expression:
Search: (?-s)(.{150,250}\.(\[\d+\])*)\h+
Replace with: \1\r\n\x20\x20\x20
In essence, this creates a new paragraph after every 150-250 characters (ending at a period) and indents it.
This is what I have tried in JavaScript. For a text area <textarea name="textarea1" id="textarea1"></textarea> in the HTML, I have the following JavaScript:
function rep1() {
  var re1 = new RegExp('(?-s)(.{150,250}\.(\[\d+\])*)\h+');
  var re2 = new RegExp('\1\r\n\x20\x20\x20');
  var s = document.getElementById("textarea1").value;
  s = string.replace(re1, re2);
  document.getElementById("textarea1").value = s;
}
I have also tried placing the regular expressions directly as arguments for string.replace() but that doesn't work either. Any ideas what I'm doing wrong?
Several issues:
JavaScript does not support (?-s). You would need to specify modifiers separately, as flags. However, . not matching line breaks is already the default in JavaScript, so you can just leave it out. If your intention was to let . also match line breaks, use [^] instead of . in JavaScript regexes (or the s flag in ES2018+ environments).
JavaScript does not support \h, the horizontal white space class. Instead you could use [^\S\r\n].
When passing a string literal to new RegExp, be aware that backslashes are escape characters in string literal notation, so they will not end up in the regex. Either double them or use JavaScript's regex literal notation.
In JavaScript, replace will only replace the first occurrence unless you provide the g modifier on the regular expression.
The replacement (second argument to replace) should not be a regex. It should be a string, so don't apply new RegExp to it.
The backreferences in the replacement string should be of the $1 format. JavaScript does not support \1 there.
You reference string where you really want to reference s.
This should work:
function rep1() {
  var re1 = /(.{150,250}\.(\[\d+\])*)[^\S\r\n]+/g;
  var re2 = '$1\r\n\x20\x20\x20';
  var s = document.getElementById("textarea1").value;
  s = s.replace(re1, re2);
  document.getElementById("textarea1").value = s;
}
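Two of the points above (lost backslashes in new RegExp strings, and $1 with the g flag) can be illustrated in isolation. This is a small sketch with made-up variable names and sample input, not part of the original answer.

```javascript
// The string parser drops the backslash before RegExp ever sees it.
const fromString = new RegExp('\d+');
// Doubling the backslash, or using a regex literal, keeps \d intact.
const fromDoubled = new RegExp('\\d+');
const fromLiteral = /\d+/;

console.log(fromString.source);  // d+
console.log(fromDoubled.source); // \d+
console.log(fromLiteral.source); // \d+

// Backreferences in the replacement string use $1, $2, ...;
// the g flag makes replace act on every match, not just the first.
const swapped = 'a1 b2'.replace(/([a-z])(\d)/g, '$2$1');
console.log(swapped); // 1a 2b
```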

remove last part of string following '&&&' with JavaScript Regex

I'm trying to use a regex in JS to remove the last part of a string. This substring starts with &&&, is followed by something not &&&, and ends with .pdf.
So, for example, the final regex should take a string like:
parent&&&child&&&grandchild.pdf
and match
parent&&&child
I'm not that great with regexes, so my best effort has been something like:
.*?(?:&&&.*\.pdf)
Which matches the whole string. Can anyone help me out?
You may use this greedy regex either in replace or in match:
var s = 'parent&&&child&&&grandchild.pdf';
// using replace
var r = s.replace(/(.*)&&&.*\.pdf$/, '$1');
console.log(r);
//=> parent&&&child
// using match
var m = s.match(/(.*)&&&.*\.pdf$/);
if (m) {
  console.log(m[1]);
  //=> parent&&&child
}
By using the greedy pattern .* before &&&, we make sure to match the last instance of &&& in the input.
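To see why greediness matters here, the greedy pattern can be compared with a lazy variant. The lazy pattern is added only for contrast and is not part of the original answer.

```javascript
const s = 'parent&&&child&&&grandchild.pdf';

// Greedy .* grabs as much as possible, so the group ends at the LAST &&&.
const greedy = s.match(/(.*)&&&.*\.pdf$/);
// Lazy .*? grabs as little as possible, so the group ends at the FIRST &&&.
const lazy = s.match(/(.*?)&&&.*\.pdf$/);

console.log(greedy[1]); // parent&&&child
console.log(lazy[1]);   // parent
```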
You want to remove the last portion, so replace it with an empty string:
var str = "parent&&&child&&&grandchild.pdf"
var result = str.replace(/&&&[^&]+\.pdf$/, '')
console.log(result)

regular expression to extract two items from a long string

There are some strings having the following type of format,
{abc=1234457, cde=3, label=3352-4e9a-9022-1067ca63} <chve> abc? 123.456.789, http=appl.com
I would like to extract 1234457 and 3352-4e9a-9022-1067ca63, which correspond to abc and label respectively.
This is the javascript I have been trying to use, but it does not work. I think the regular expression part is wrong.
var headerPattern = new RegExp("\{abc=([\d]*),,label=(.*)(.*)");
if (headerPattern.test(row)) {
abc = headerPattern.exec(row)[0];
label = headerPattern.exec(row)[1];
}
Try: abc=(\d*).*?label=([^}]*)
Explanation
abc= literal match
(\d*) captures a run of digits
.*? lazy match of anything up to the next part
label= literal match
([^}]*) captures everything up to the closing brace
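A short sketch of how this pattern could be used from JavaScript. The helper name extractFields is hypothetical; note that capture groups start at index 1, since index 0 holds the whole match.

```javascript
// Hypothetical helper: returns the abc and label values, or null if not found.
function extractFields(row) {
  // A regex literal avoids the backslash-escaping issues of new RegExp("...").
  const match = /abc=(\d*).*?label=([^}]*)/.exec(row);
  return match ? { abc: match[1], label: match[2] } : null;
}

const row = '{abc=1234457, cde=3, label=3352-4e9a-9022-1067ca63} <chve> abc? 123.456.789, http=appl.com';
const fields = extractFields(row);
console.log(fields.abc);   // 1234457
console.log(fields.label); // 3352-4e9a-9022-1067ca63
```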
Here is what I came up with:
\{abc=(\d+).*label=(.+)\}.*
You have two problems in \{abc=([\d]*),,label=(.*)(.*):
With abc=([\d]*),,label, you are looking for abc=([\d]*) followed by the literal ,,. Use .* instead to skip the text in between. Note that .* is greedy by default: it first runs to the end of the string and then backtracks to the last label=, which is fine here because label= occurs only once.
With label=(.*)(.*), the first .* captures all the remaining text. You only want the text up to the closing brace, so use (.+)\}.* instead.
Disclaimer: Made with a Java-based regex tester. If anything in JavaScript regexes would invalidate this, feel free to comment.
You can do it the following way:
var row = '{abc=1234457, cde=3, label=3352-4e9a-9022-1067ca63} <chve> abc? 123.456.789, http=appl.com';
var headerPatternResult = /{abc=([0-9]+),.*?label=([a-z0-9\-]+)}/.exec(row);
if (headerPatternResult !== null) {
  var abc = headerPatternResult[1];
  var label = headerPatternResult[2];
  console.log('abc: ' + abc);
  console.log('label: ' + label);
}

Grab the end of a URL after the last slash with regex in javascript

I need to be able to grab the number at the end of the URL and set it as the value of a textbox. I have the following, but it's not correctly stripping out the beginning of the URL before the last slash. Instead, it's doing the opposite.
<input id="imageid"></input>
var referrerURL = "http://subdomain.xx-xxxx-x.xxx.url.com/content/assets/750";
var assetID = referrerURL.match("^(.*[\\\/])");
$("#imageid").val(assetID);
The result of the regex match should set the value of the text box to 750 in this case.
The simplest method is to use a negated character class:
/[^\/]*$/
Example
var referrerURL = "http://subdomain.xx-xxxx-x.xxx.url.com/content/assets/750";
alert(referrerURL.match(/[^\/]*$/));
// Output
// => 750
You can use a simple split() and then pop() the last element of the resulting array:
var assetID = referrerURL.split('/').pop();
This is easier to read than a regex, so it is very clear what it is doing.
var referrerURL = "http://subdomain.xx-xxxx-x.xxx.url.com/content/assets/750";
var myregexp = /.*\/(.*?)$/;
var match = myregexp.exec(referrerURL);
$("#imageid").val(match[1]);
<script src="https://ajax.googleapis.com/ajax/libs/jquery/1.11.1/jquery.min.js"></script>
<input id="imageid"></input>
You could avoid regular expressions for this task by using JavaScript's native string functions.
Splitting the text:
var lastSlashToken = referrerURL.split("/").pop();
Taking everything after the last "/":
var lastSlashToken = referrerURL.substr(referrerURL.lastIndexOf("/") + 1);
However, if you still want to use a regular expression for this task, you could try the following pattern:
.*\/([^$]+)
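The approaches suggested in the answers can be compared side by side. This is a sketch using the URL from the question; the variable names are illustrative, and slice is used here in place of substr.

```javascript
const referrerURL = "http://subdomain.xx-xxxx-x.xxx.url.com/content/assets/750";

// Regex: everything after the last slash (match[0] is the whole match).
const viaRegex = referrerURL.match(/[^\/]*$/)[0];
// split/pop: last element after splitting on "/".
const viaSplit = referrerURL.split('/').pop();
// lastIndexOf/slice: substring starting just after the last "/".
const viaIndex = referrerURL.slice(referrerURL.lastIndexOf('/') + 1);

console.log(viaRegex, viaSplit, viaIndex); // 750 750 750
```

Note that match() returns an array, so take element 0 (or a capture group) before passing the value to something like $("#imageid").val(). If the URL ends with a trailing slash, all three variants yield an empty string, so that case may need separate handling.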
