I cannot replace characters using Regex and variable - javascript

I've been trying to use a regex in my JavaScript code to collapse the gaps (whitespace) in front of certain tags, as shown below, but I couldn't get it to work.
This works:
outputFile = outputFile.replace(/\s*<div/g, '<div');
These don't work:
htmlElements = [
"div",
"form",
"label",
"input"
];
var exp = new RegExp("\s*<"+htmlElements[0], 'g')
var str = "<"+htmlElements[0];
outputFile = outputFile.replace(exp, str);
The expressions are exactly the same, except that the second one is built from a variable. I also checked my expression at https://regex101.com/r/eJ5kJ2/2 and at http://regexper.com/#%2F%5Cs*%3Cdiv%2F, and I tried it in both Chrome and Firefox.
Is there any way to overcome this issue?

You have to escape every \ as \\ because you define your regex as a string, not as a regex literal:
new RegExp("\\s*<"+htmlElements[0], 'g')
Besides that, you may want to use an HTML parser instead of a regex to accomplish your task.
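As a minimal sketch of that fix applied to the whole list (collapseBeforeTags is a hypothetical helper name and the sample markup is made up for illustration), the loop below builds one regex per tag with the doubled backslash:

```javascript
// Hypothetical helper: remove the whitespace in front of each listed opening tag.
function collapseBeforeTags(html, tagNames) {
  return tagNames.reduce(function (out, tag) {
    // "\\s" in a string literal becomes \s in the compiled pattern.
    var exp = new RegExp("\\s*<" + tag, "g");
    return out.replace(exp, "<" + tag);
  }, html);
}

var htmlElements = ["div", "form", "label", "input"];
// Made-up sample input, not taken from the question.
var sample = "<form>   <label>a</label>\n  <input>  <div></div></form>";

console.log(collapseBeforeTags(sample, htmlElements));
// <form><label>a</label><input><div></div></form>
```

Note that a pattern such as <div also matches the start of longer tag names, which is one more reason to consider a parser for real markup.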

Related

How can you add e.g. 'gm' to a regex to avoid repeating the full regex again? [duplicate]

I am trying to create something similar to this:
var regexp_loc = /e/i;
except I want the regexp to depend on a string, so I tried to use new RegExp, but I couldn't get what I wanted.
Basically I want the e in the above regexp to be a string variable, but I can't get the syntax right.
I tried something like this:
var keyword = "something";
var test_regexp = new RegExp("/" + keyword + "/i");
Basically I want to search for a sub string in a larger string then replace the string with some other string, case insensitive.
regards,
alexander
You need to pass the second parameter:
var r = new RegExp(keyword, "i");
You will also need to escape any special characters in the string to prevent regex injection attacks.
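A common way to do that escaping is sketched below. The helper name escapeRegExp and the sample keyword are assumptions for illustration; the character class covers the characters that have special meaning in a JavaScript pattern.

```javascript
// Prefix every regex metacharacter with a backslash so it is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Sample keyword containing metacharacters (made up for this example).
var keyword = "a.b (c)";
var re = new RegExp(escapeRegExp(keyword), "i");

console.log(re.test("see A.B (C) here")); // true
console.log(re.test("see axb c here"));   // false: "." no longer matches any character
```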
You should also remember to watch out for escape characters within a string...
For example, if you wanted to match a single digit with \d{1} and you did this...
var pattern = "\d{1}";
var re = new RegExp(pattern);
re.exec("1"); // fail! :(
That would fail because the initial \ is an escape character in the string literal, so you need to "escape the escape", like so...
var pattern = "\\d{1}" // <-- spot the extra '\'
var re = new RegExp(pattern);
re.exec("1"); // success! :D
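One way to see what actually reached the regex engine is to inspect the source property of the resulting object. This is only a small diagnostic sketch; the variable names are made up.

```javascript
// Build the same pattern with and without the doubled backslash.
var unescaped = new RegExp("\d{1}");  // the string literal drops the backslash
var escaped = new RegExp("\\d{1}");   // the backslash survives

console.log(unescaped.source); // d{1}
console.log(escaped.source);   // \d{1}

console.log(unescaped.test("1")); // false: the pattern looks for the letter d
console.log(escaped.test("1"));   // true
```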
When using the RegExp constructor, you don't need the slashes like you do when using a regexp literal. So:
new RegExp(keyword, "i");
Note that you pass in the flags in the second parameter. See here for more info.
I want to share an example here:
I want to replace a string like hi[var1][var2] with hi[newVar][var2],
where var1 is generated dynamically in the page,
so I had to use:
var regex = new RegExp("\\[" + var1 + "\\]", 'ig');
mystring = mystring.replace(regex, '[newVar]');
This works pretty well for me, in case anyone else needs it.
The reason I match the [] as well is that var1 might be a very simple pattern on its own; including the [] makes the match much more accurate.
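A self-contained sketch of this bracket replacement follows, assuming the brackets are meant literally. In a string literal, "\\[" produces the pattern \[, which matches a literal opening bracket. The function name replaceBracketed is hypothetical, and the name is escaped first in case it contains regex metacharacters.

```javascript
// Replace every "[name]" in text with "[replacement]", case-insensitively.
function replaceBracketed(text, name, replacement) {
  // Escape metacharacters in the dynamic name so it is matched literally.
  var escapedName = name.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  var regex = new RegExp("\\[" + escapedName + "\\]", "ig");
  // A function replacement avoids special handling of "$" in the new text.
  return text.replace(regex, function () {
    return "[" + replacement + "]";
  });
}

console.log(replaceBracketed("hi[var1][var2]", "var1", "newVar"));
// hi[newVar][var2]
```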
var keyword = "something";
var test_regexp = new RegExp(keyword, "i");
You need to convert the string to a RegExp. You can create a simple function to do it for you:
function toReg(str) {
if(!str || typeof str !== "string") {
return;
}
return new RegExp(str, "i");
}
and call it like:
toReg("something")

JavaScript regex to get string between two characters except escaped without lookbehind

I am looking for a JavaScript regex that does not use the lookbehind feature introduced in ECMAScript 2018 and that selects the text between two asterisks while ignoring escaped characters.
In the following example, only the text "test" and any escaped characters inside it should be selected, according to the rules above:
\*jdjdjdfdf*test*dfsdf\*adfasdasdasd*test**test\**sd* (Selected: "test", "test", "test\*")
During my research I found this solution, "Regex, everything between two characters except escaped characters": /(?<!\\)(%.*?(?<!\\)%)/. It uses negative lookbehinds, which are supported in ECMAScript 2018, but I need to support IE11 as well, so this solution doesn't work for me.
Then I found another approach that almost gets me there: "Javascript: negative lookbehind equivalent?". I altered Kamil Szot's answer to fit my needs: ((?!([\\])).|^)(\*.*?((?!([\\])).|^)\*) Unfortunately, it doesn't work when two asterisks ** appear in a row.
I have already invested many hours and can't seem to get it right; any help is appreciated!
An example of what I have so far is here: https://www.regexpal.com/?fam=117350
I need to use the regexp in a string.replace call (str.replace(regexp|substr, newSubStr|function)) so that I can wrap the found strings in a span element with a specific class.
You can use this regular expression:
(?:\\.|[^*])*\*((?:\\.|[^*])*)\*
Your code should then take only the single capture group of each match.
Like this:
var str = "\\*jdjdjdfdf*test*dfsdf\\*adfasdasdasd*test**test\\**sd*";
var regex = /(?:\\.|[^*])*\*((?:\\.|[^*])*)\*/g
var match;
while (match = regex.exec(str)) {
console.log(match[1]);
}
If you need to replace the matches, for instance to wrap the matches in a span tag while also dropping the asterisks, then use two capture groups:
var str = "\\*jdjdjdfdf*test*dfsdf\\*adfasdasdasd*test**test\\**sd*";
var regex = /((?:\\.|[^*])*)\*((?:\\.|[^*])*)\*/g
var result = str.replace(regex, "$1<span>$2</span>");
console.log(result);
One thing to be careful with: when you write test strings as JavaScript string literals, escape each backslash with another backslash. If you don't, the in-memory string will not contain a backslash at all.
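Since the question asks for a span with a specific class, the two-group regex can also be combined with a replacement function. The sketch below assumes a hypothetical helper name (wrapStarred) and class name (highlight); neither comes from the question.

```javascript
// Wrap each unescaped *...* section in a span with the given class,
// dropping the delimiting asterisks and keeping the text before them.
function wrapStarred(text, className) {
  var regex = /((?:\\.|[^*])*)\*((?:\\.|[^*])*)\*/g;
  return text.replace(regex, function (match, before, inner) {
    return before + '<span class="' + className + '">' + inner + "</span>";
  });
}

// Backslashes are doubled so the in-memory string really contains them.
var str = "\\*jdjdjdfdf*test*dfsdf\\*adfasdasdasd*test**test\\**sd*";
console.log(wrapStarred(str, "highlight"));
```

Using a function instead of a "$1...$2" replacement string keeps the class name out of the special replacement syntax.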
const testStr = `\\*jdjdjdfdf*test*dfsdf\\*adfasdasdasd*test**test\\**sd*`;
const m = testStr.match(/\*(\\.)*t(\\.)*e(\\.)*s(\\.)*t(\\.)*\*/g).map(m => m.substr(1, m.length-2));
console.log(m);
More generic code:
const prepareRegExp = (word, delimiter = '\\*') => {
const escaped = '(\\\\.)*';
return new RegExp([
delimiter,
escaped,
[...word].join(escaped),
escaped,
delimiter
].join``, 'g');
};
const testStr = `\\*jdjdjdfdf*test*dfsdf\\*adfasdasdasd*test**test\\**sd*`;
const m = testStr
.match(prepareRegExp('test'))
.map(m => m.substr(1, m.length-2));
console.log(m);
https://instacode.dev/#Y29uc3QgcHJlcGFyZVJlZ0V4cCA9ICh3b3JkLCBkZWxpbWl0ZXIgPSAnXFwqJykgPT4gewogIGNvbnN0IGVzY2FwZWQgPSAnKFxcXFwuKSonOwogIHJldHVybiBuZXcgUmVnRXhwKFsKICAgIGRlbGltaXRlciwKICAgIGVzY2FwZWQsCiAgICBbLi4ud29yZF0uam9pbihlc2NhcGVkKSwKICAgIGVzY2FwZWQsCiAgICBkZWxpbWl0ZXIKICBdLmpvaW5gYCwgJ2cnKTsKfTsKCmNvbnN0IHRlc3RTdHIgPSBgXFwqamRqZGpkZmRmKnRlc3QqZGZzZGZcXCphZGZhc2Rhc2Rhc2QqdGVzdCoqdGVzdFxcKipzZCpgOwpjb25zdCBtID0gdGVzdFN0cgoJLm1hdGNoKHByZXBhcmVSZWdFeHAoJ3Rlc3QnKSkKCS5tYXAobSA9PiBtLnN1YnN0cigxLCBtLmxlbmd0aC0yKSk7Cgpjb25zb2xlLmxvZyhtKTs=

Using search and replace with regex in javascript

I have a regular expression that I have been using in notepad++ for search&replace to manipulate some text, and I want to incorporate it into my javascript code. This is the regular expression:
Search
(?-s)(.{150,250}\.(\[\d+\])*)\h+ and replace with \1\r\n\x20\x20\x20
In essence, this creates a new paragraph after every 150-250 characters (ending at a period) and indents it.
This is what I have tried in JavaScript, for a text area <textarea name="textarea1" id="textarea1"></textarea> in the HTML. I have the following JavaScript:
function rep1() {
var re1 = new RegExp('(?-s)(.{150,250}\.(\[\d+\])*)\h+');
var re2 = new RegExp('\1\r\n\x20\x20\x20');
var s = document.getElementById("textarea1").value;
s = string.replace(re1, re2);
document.getElementById("textarea1").value = s;
}
I have also tried placing the regular expressions directly as arguments for string.replace() but that doesn't work either. Any ideas what I'm doing wrong?
Several issues:
JavaScript does not support (?-s). You would need to add modifiers separately. However, this is the default setting in JavaScript, so you can just leave it out. If it was your intention to let . also match line breaks, then use [^] instead of . in JavaScript regexes.
JavaScript does not support \h -- the horizontal white space. Instead you could use [^\S\r\n].
When passing a string literal to new RegExp, be aware that backslashes are escape characters in string literal notation, so they will not end up in the regex. Either double them or use JavaScript's regex literal notation.
In JavaScript, replace will only replace the first occurrence unless you provide the g modifier to the regular expression.
The replacement (second argument to replace) should not be a regex. It should be a string, so don't apply new RegExp to it.
The backreferences in the replacement string should be of the $1 format. JavaScript does not support \1 there.
You reference string where you really want to reference s.
This should work:
function rep1() {
var re1 = /(.{150,250}\.(\[\d+\])*)[^\S\r\n]+/g;
var re2 = '$1\r\n\x20\x20\x20';
var s = document.getElementById("textarea1").value;
s = s.replace(re1, re2);
document.getElementById("textarea1").value = s;
}
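If the length limits need to be configurable, the same pattern can be built with the RegExp constructor, in which case every backslash has to be doubled as described above. This is only a sketch: breakIntoParagraphs and its min and max parameters are hypothetical names, and the tiny limits in the usage line are chosen so the effect is visible on a short string.

```javascript
// Insert a line break plus three spaces after each run of min..max
// characters that ends in a period (optionally followed by [n] references).
function breakIntoParagraphs(text, min, max) {
  var re = new RegExp(
    "(.{" + min + "," + max + "}\\.(\\[\\d+\\])*)[^\\S\\r\\n]+",
    "g"
  );
  return text.replace(re, "$1\r\n\x20\x20\x20");
}

// JSON.stringify makes the inserted \r\n visible in the console.
console.log(JSON.stringify(
  breakIntoParagraphs("One two. Three four. Five six.", 5, 10)
));
// "One two.\r\n   Three four.\r\n   Five six."
```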

javascript, declare associative array of regex expressions

How to declare associative array of regex?
This is not working
var Validators = {
url : /^http(s?)://((\w+\.)?\w+\.\w+|((2[0-5]{2}|1[0-9]{2}|[0-9]{1,2})\.){3}(2[0-5]{2}|1[0-9]{2}|[0-9]{1,2}))(/)?$/gm
};
EDITED: Now working!
This will be valid in JS (like the @ verbatim-string prefix in C#):
url : `/^http(s?)://((\w+\.)?\w+\.\w+|((2[0-5]{2}|1[0-9]{2}|[0-9]{1,2})\.){3}(2[0-5]{2}|1[0-9]{2}|[0-9]{1,2}))(/)?$/gm`
However, it will still not work because of double escaping: once for JS and once for the regex. If the expression is small, perhaps you can escape it for both JS and the regex by eye. My brain just can't :)
In order to use patterns exactly as tested on regex101.com, for example, declare all the required strings as 'raw', like this:
var exp = String.raw`^(http(s?):\/\/)?(((www\.)?[a-zA-Z0-9\.\-\_]+(\.[a-zA-Z]{2,3})+)|(\b(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b))(\/[a-zA-Z0-9\_\-\s\.\/\?\%\#\&\=]*)?$`;
var strings = [
String.raw`http://www.goo gle.com`,
String.raw`http://www.google.com`,
];
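To turn such a raw string into a usable regex, it can be passed to new RegExp. The sketch below uses a deliberately simplified URL pattern, chosen only for illustration and not equivalent to the full pattern above, and compares it with the doubled-backslash form an ordinary string literal would need.

```javascript
// Simplified URL pattern (an assumption for this example).
var source = String.raw`^https?:\/\/\w+(\.\w+)+\/?$`;
var urlPattern = new RegExp(source);

// The same pattern as an ordinary string literal: every backslash doubled.
var doubled = "^https?:\\/\\/\\w+(\\.\\w+)+\\/?$";

console.log(source === doubled);                         // true
console.log(urlPattern.test("http://www.google.com"));   // true
console.log(urlPattern.test("http://www.goo gle.com"));  // false
```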
Wrap it with new RegExp() and escape the slashes:
var Validators = {
url : new RegExp( /^http(s?):\/\/((\w+\.)?\w+\.\w+|((2[0-5]{2}|1[0-9]{2}|[0-9]{1,2})\.){3}(2[0-5]{2}|1[0-9]{2}|[0-9]{1,2}))(\/)?$/gm )
};
Your regex has forward slashes in it. This symbol needs to be escaped because it is supposed to indicate the start and end of the expression. Try \/.
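Put together, an object of named validators might look like the sketch below. The patterns are simplified stand-ins, and the validate helper and the digits entry are hypothetical additions. The g flag is left out on purpose, because test() on a global regex keeps state in lastIndex between calls.

```javascript
// Regex literals as object values; forward slashes inside are escaped as \/.
var Validators = {
  url: /^https?:\/\/\w+(\.\w+)+\/?$/,
  digits: /^\d+$/
};

// Look up a validator by name and apply it; unknown names fail validation.
function validate(kind, value) {
  if (!Object.prototype.hasOwnProperty.call(Validators, kind)) {
    return false;
  }
  return Validators[kind].test(value);
}

console.log(validate("url", "http://example.com/")); // true
console.log(validate("url", "http:/example.com"));   // false
console.log(validate("digits", "12345"));            // true
```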

javascript regex does not work properly with postal code

'^[AaBbCcEeGgHhJjKkLlMmNnPpRrSsTtVvXxYy]{1}\d{1}[AaBbCcEeFfGgHhJjKkLlMmNnPpRrSsTtVvWwXxYyZz]{1}[ -]*\d{1}[AaBbCcEeFfGgHhJjKkLlMmNnPpRrSsTtVvWwXxYyZz]{1}\d{1}$'
The above regular expression accepts inputs like T3K2H3 or T3K-2H3 from a .NET form, but when I run the validation through JavaScript, it does not work.
var rxPostalCode = new RegExp('^[AaBbCcEeGgHhJjKkLlMmNnPpRrSsTtVvXxYy]{1}\d{1}[AaBbCcEeFfGgHhJjKkLlMmNnPpRrSsTtVvWwXxYyZz]{1}[ -]*\d{1}[AaBbCcEeFfGgHhJjKkLlMmNnPpRrSsTtVvWwXxYyZz]{1}\d{1}$');
var postalCode = 't3k2h3';
var matchesPostalCode = rxPostalCode.exec(postalCode);
if (matchesPostalCode == null || postalCode != matchesPostalCode[0]) {
$scope.AccountInfoForm.PostalCode.$setValidity("pattern", false);
$scope.showLoading = false;
return false;
}
I believe that in JavaScript you have to use // delimiters instead of '',
as follows:
/^[AaBbCcEeGgHhJjKkLlMmNnPpRrSsTtVvXxYy]{1}\d{1}[AaBbCcEeFfGgHhJjKkLlMmNnPpRrSsTtVvWwXxYyZz]{1}[ -]*\d{1}[AaBbCcEeFfGgHhJjKkLlMmNnPpRrSsTtVvWwXxYyZz]{1}\d{1}$/
You might want to check the following link:
Validate email address in JavaScript?
You have two syntaxes to define a regexp object:
var rxPostalCode = /^[abceghj-np-tvxy]\d[abceghj-np-tv-z][ -]?\d[abceghj-np-tv-z]\d$/i;
or
var rxPostalCode = new RegExp('^[abceghj-np-tvxy]\\d[abceghj-np-tv-z][ -]?\\d[abceghj-np-tv-z]\\d$', 'i');
Note that with the second syntax you need to use double backslashes.
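A quick check that the two syntaxes produce the same pattern is sketched below; the helper name isPostalCode and the sample inputs are assumptions for illustration.

```javascript
// Literal syntax: single backslashes.
var literal = /^[abceghj-np-tvxy]\d[abceghj-np-tv-z][ -]?\d[abceghj-np-tv-z]\d$/i;
// Constructor syntax: backslashes doubled inside the string.
var constructed = new RegExp(
  "^[abceghj-np-tvxy]\\d[abceghj-np-tv-z][ -]?\\d[abceghj-np-tv-z]\\d$",
  "i"
);

function isPostalCode(value) {
  return constructed.test(value);
}

console.log(literal.source === constructed.source); // true
console.log(isPostalCode("t3k2h3"));    // true
console.log(isPostalCode("T3K-2H3"));   // true
console.log(isPostalCode("T3K  2H3"));  // false: [ -]? allows one separator at most
```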
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Regular_Expressions
"Do not forget to escape \ itself while using the RegExp("pattern") notation because \ is also an escape character in strings."
var rxPostalCode = new RegExp('^[AaBbCcEeGgHhJjKkLlMmNnPpRrSsTtVvXxYy]{1}\\d{1}[AaBbCcEeFfGgHhJjKkLlMmNnPpRrSsTtVvWwXxYyZz]{1}[ -]*\\d{1}[AaBbCcEeFfGgHhJjKkLlMmNnPpRrSsTtVvWwXxYyZz]{1}\\d{1}$');
That should work, I tested it in Chrome's console.
Try the following pattern:
^[AaBbCcEeGgHhJjKkLlMmNnPpRrSsTtVvXxYy]\d
[AaBbCcEeFfGgHhJjKkLlMmNnPpRrSsTtVvWwXxYyZz][ -]*\d
[AaBbCcEeFfGgHhJjKkLlMmNnPpRrSsTtVvWwXxYyZz]\d
Remove the $ at the end and see if that solves your problem.
I also simplified things a bit: \d{1} is the same as \d.
I would also change the [ -]* to [ -]? unless you want to allow multiple spaces or dashes.
I suspect what is happening is that the $ expects the end of the line or string, and JavaScript may not be storing the variable the way you expect. See if removing the $ solves it, or possibly keep the $ and trim() the string.
