RegEx conversion to use with JavaScript

I'm currently using this RegEx ^(0[1-9]|1[0-2])/(19|2[0-1])\d{2}$ in .NET to validate a field with Month and Year (12/2000).
I'm changing all my RegEx validations to JavaScript and I'm facing an issue with this one because of the / in the middle, which I'm having problems escaping.
So based on other answers in SO I tried:
RegExp.quote = function (str) {
return (str + '').replace(/[.?*+^$[\]\\(){}|-]/g, "\\$&");
};
var reDOB = '^(0[1-9]|1[0-2])/(19|2[0-1])\d{2}$'
var re = new RegExp(RegExp.quote(reDOB));
if (!re.test(args.Value)) {
args.IsValid = false;
return;
}
However, validation fails even with valid data.

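Remove the ^ from the start and the $ from the end of the regex pattern, and add \ before any character you want the pattern to match literally (here the /). Note that without ^ and $ the pattern also matches inside a longer string, so keep them if the whole field has to match. The pattern then looks like this: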
(0[1-9]|1[0-2])\/(19|2[0-1])\d{2}
You can test your regex from here
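A minimal sketch of an alternative that keeps the anchors, assuming the goal is to validate the whole field. Two things appear to break the code in the question: RegExp.quote escapes every metacharacter, so the pattern is turned into literal text, and "\d" inside an ordinary string literal collapses to "d". The helper name isValidMonthYear is hypothetical and only stands in for the validator callback.

```javascript
// Regex literal: the "/" inside the pattern is escaped as "\/",
// and the ^ ... $ anchors are kept so the whole value must match.
var reMonthYear = /^(0[1-9]|1[0-2])\/(19|2[0-1])\d{2}$/;

// Same pattern through the constructor: "/" needs no escape here,
// but every backslash has to be doubled inside the string literal.
var reMonthYearCtor = new RegExp('^(0[1-9]|1[0-2])/(19|2[0-1])\\d{2}$');

// Hypothetical helper standing in for the validator callback in the question.
function isValidMonthYear(value) {
  return reMonthYear.test(value);
}

console.log(isValidMonthYear('12/2000'));     // true
console.log(isValidMonthYear('13/2000'));     // false
console.log(reMonthYearCtor.test('01/1999')); // true
```

An escaping helper such as RegExp.quote is meant for literal text that gets embedded in a pattern, not for the pattern itself.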

Related

JS - Nothing to repeat In match function

The error is simple. In JS I'm trying to do something similar to preg_match in PHP. I found the match function, and I use it to compare a value against string patterns. If it finds something it should return true, otherwise false.
I tried this
var sim_action = $(this);
if(sim_action.data("phone").toString().match("/^(+34|0034|34)+([67]){8})$/")){
But it returns this error:
Invalid regular expression: //^(+34|0034|34)+([67]){8})$//: Nothing to repeat
So the question is: how can I pass this pattern to the JS match function?
You need to escape the + characters with a backslash: /^(\+34|0034|34)\+([67]){8})$/. You also have a closing bracket which doesn't have a matching opening bracket.
+ and () are metacharacters and if you want to refer to the literal, you need to escape them with a \. Here's a regex101 demo which highlights the errors with your regex
As for the regex, from Wikipedia I gather that Spanish phone numbers have the format +34(6|7)xxxxxxxx
You can use this regex: /^(\+34|0034|34)[67]\d{8}$/
If you just want to check whether the regex matches, you can use regex.test(<stringToBeTested>)
const regex = /^(\+34|0034|34)[67]\d{8}$/
const phone = "+34712345673";
if (regex.test(phone))
console.log("Valid phone number")
const phoneNumbers = ["+34712345673", "0034612345673", "+34812345673"]
phoneNumbers.forEach(p => console.log(regex.test(p)))
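The doubled slashes in the error message come from passing the pattern as a string. A small sketch of that behaviour, under the assumption that the goal is a plain true/false check: when match() receives a string, it compiles it with new RegExp(string), so the surrounding slashes become literal characters rather than delimiters.

```javascript
// A string argument is compiled as new RegExp(string), so the slashes
// are literal characters that the input itself would have to contain.
console.log('34'.match('/34/'));            // null
console.log('/34/'.match('/34/') !== null); // true

// Regex literal: here the slashes are only delimiters.
var rePhone = /^(\+34|0034|34)[67]\d{8}$/;
console.log(rePhone.test('+34712345673'));  // true
console.log(rePhone.test('+34812345673'));  // false
```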

Invalid expression term '['

I have written a regular expression for validating email, shown below. It gives me an "invalid expression term" error in the regex string after the '#' part.
/^([\w-\.]+#([\w-]+\.)+[\w-]{2,4})?$/;
Thanks
Your regex is working fine.
Inside a character class, . is already literal, so it doesn't need escaping: use [\w-.], not [\w-\.] (just a simplification).
Additional note
If you're using var re = new RegExp("...") instead of var re = /.../ you need to escape the backslashes or it won't work:
// Works
var re = /^([\w-.]+#([\w-]+\.)+[\w-]{2,4})?$/;
console.log(re.test('marty#bbtf.com'));
// Also works
re = new RegExp("^([\\w-.]+#([\\w-]+\\.)+[\\w-]{2,4})?$");
console.log(re.test('marty#bbtf.com'));
// Does NOT work
re = new RegExp("^([\w-.]+#([\w-]+\.)+[\w-]{2,4})?$");
console.log(re.test('marty#bbtf.com'));
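To see why the third form fails, it can help to print what the constructor actually receives. The sketch below uses a shortened pattern of my own (not the email regex above) and also shows String.raw as one way to avoid doubling backslashes.

```javascript
// In an ordinary string literal, "\w" and "\." are not recognised escapes,
// so the backslash is silently dropped before RegExp ever sees it.
var broken = new RegExp("^[\w]+\.$");
console.log(broken.source);  // ^[w]+.$

// Doubling the backslashes keeps them in the string.
var doubled = new RegExp("^[\\w]+\\.$");
console.log(doubled.source); // ^[\w]+\.$

// String.raw leaves backslashes untouched, so the pattern reads like a literal.
var raw = new RegExp(String.raw`^[\w]+\.$`);
console.log(raw.source);     // ^[\w]+\.$

console.log(broken.test('abc.'));  // false: "[w]" only matches the letter w
console.log(doubled.test('abc.')); // true
console.log(raw.test('abc.'));     // true
```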
The regex works fine. If you build a pattern from literal text, you need to escape the special symbols in it, for example [ becomes \[. You could use a function for that:
function escapeRegExp(str) {
return str.replace(/[\-\[\]\/\{\}\(\)\*\+\?\.\\\^\$\|]/g, "\\$&");
}

Regex not working properly in Javascript code

I have a JavaScript function that fires successfully on the onkeypress/onkeyup event for an asp.net textbox control as follows:
<asp:TextBox ID="txtboxLatestTag" runat="server" onkeypress="validate()" onkeyup="validate()"></asp:TextBox>
function validate() {
var str = $("#txtboxLatestTag").val();
var pattern = /^\d{1,2}[.]\d{1,2}[.]\d{1,2}[.]\d{1,2}/gm;
if (!str.match(pattern))
{
document.getElementById("txtboxLatestTag").style.color = "red";
}
else
{
document.getElementById("txtboxLatestTag").style.color = "white";
}
}
The regex is supposed to match entries in the format of:
10.10.10.10 or
1.1.1.1
or anything allowing 1 to 2 digits between each "." character.
This works, however the problem is that it ALSO matches with
1.1.1.100 i.e. it should not allow 3 numbers at the end of the string, only 2.
This works perfectly in regexr.com but I cannot figure out why it is matching on this.
Thank you
I believe what you want, to exclude extra characters at the end of the string, is to add the end-of-input anchor $ (which acts as an end-of-line anchor here, since you're using multiline mode). This will cause extra characters at the end to invalidate the match. For example:
var oldPattern = /^\d{1,2}[.]\d{1,2}[.]\d{1,2}[.]\d{1,2}/gm;
console.log("Old pattern match:");
console.log("10.10.10.100".match(oldPattern));
var pattern = /^\d{1,2}[.]\d{1,2}[.]\d{1,2}[.]\d{1,2}$/gm;
console.log("New pattern match:");
console.log("10.10.10.100".match(pattern));
console.log("10.10.10.1".match(pattern));
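For a single-line textbox, a variant without the g and m flags may be simpler. This is a sketch under that assumption; isValidTag is a hypothetical helper name. One reason to drop g: a regex with the g flag keeps a lastIndex between test() calls, which can give surprising results when the same object is reused.

```javascript
// Anchored at both ends, no flags: the whole value must match,
// and test() carries no lastIndex state from one call to the next.
var tagPattern = /^\d{1,2}[.]\d{1,2}[.]\d{1,2}[.]\d{1,2}$/;

// Hypothetical helper; returns a boolean instead of a match array.
function isValidTag(str) {
  return tagPattern.test(str);
}

console.log(isValidTag('10.10.10.10')); // true
console.log(isValidTag('1.1.1.1'));     // true
console.log(isValidTag('1.1.1.100'));   // false
```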

Removing a query string using regex in JavaScript

I need to remove a query parameter that comes with a REST API call. Below are the sample URLs that need to be considered. In each of these URLs, we need to remove the 'key' parameter and its value.
/test/v1?key=keyval&param1=value1&param2=value2
/test/v1?key=keyval
/test/v1?param1=value1&key=keyval
/test/v1?param1=value1&key=keyval&param2=value2
After removing the key parameter, the final URLs should be as follows.
/test/v1?param1=value1&param2=value2
/test/v1?
/test/v1?param1=value1
/test/v1?param1=value1&param2=value2
We used the regex below to match and replace this query string in PHP. (https://regex101.com/r/pK0dX3/1)
(?<=[?&;])key=.*?($|[&;])
We couldn't use the same regex in JavaScript; it gives syntax errors there. Can you please help us figure out the issue with this regex? How can we change it to match and remove the query parameter as described above?
Lookbehind wasn't supported in JavaScript when this was asked (it was only added in ES2018), hence your regex won't work there.
In Javascript you can use this:
repl = input.replace(/(\?)key=[^&]*(?:&|$)|&key=[^&]*/gmi, '$1');
RegEx Demo
Regex is working on 2 paths using regex alternation:
If this query parameter is right after ? then we grab till & after parameter and place ? back in replacement.
If this query parameter is after & then &key=value is replaced by an empty string.
The regex works in PHP but not in JavaScript because JavaScript (before ES2018) does not support lookbehind.
The easiest fix here would be to replace the lookbehind (?<=[?&;]) with the equivalent characters in a capturing group ([?&;]) and use a backreference ($1) to insert this bit back into the replacement string.
For example:
var path = '/test/v1?key=keyval&param1=value1&param2=value2';
var regex = /([?&;])key=.*?($|[&;])/;
console.log(path.replace(regex, '$1')); // outputs '/test/v1?param1=value1&param2=value2'
Not convinced regex would be the most reliable way of removing a query parameter, but that's a different story :-)
Just in case you want to do it without a regex, here is a function that will do the trick:
var removeQueryString = function (str) {
var qm = str.lastIndexOf('?');
var path = str.substr(0, qm + 1);
var querystr = str.substr(qm + 1);
var params = querystr.split('&');
var keyIndex = -1;
for (var i = 0; i < params.length; i++) {
if (params[i].indexOf("key=") === 0) {
keyIndex = i;
break;
}
}
if (keyIndex != -1) {
params.splice(keyIndex, 1);
}
var result = path + params.join('&');
return result;
};
Lookbehind isn't available in JavaScript (before ES2018), so to test the character before the key/value, you must match it. To make the pattern work whatever the position in the query part of the URL, you can use an alternation in a non-capturing group and capture the question mark:
url = url.replace(/(?:&|(\?))key=[^&#]*(?:(?!\1).)?/, '$1');
Note: the # is excluded from the character class to prevent the fragment part (if any) of the url to be matched with key value.
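As a non-regex alternative to the answers above, here is a sketch that uses the standard URLSearchParams API. It assumes an environment that provides it (modern browsers, recent Node.js), and removeQueryParam is a hypothetical name. The '?' is kept even when no parameters remain, to match the expected outputs in the question.

```javascript
// Removes every occurrence of `name` from the query part of `url`.
// Works on relative paths because only the text after "?" is parsed.
function removeQueryParam(url, name) {
  var qm = url.indexOf('?');
  if (qm === -1) return url; // no query string at all

  // Keep a fragment (#...) out of the query so it is not parsed as a value.
  var hashAt = url.indexOf('#', qm);
  var fragment = hashAt === -1 ? '' : url.slice(hashAt);
  var query = url.slice(qm + 1, hashAt === -1 ? url.length : hashAt);

  var params = new URLSearchParams(query);
  params.delete(name);

  return url.slice(0, qm + 1) + params.toString() + fragment;
}

console.log(removeQueryParam('/test/v1?key=keyval&param1=value1&param2=value2', 'key'));
console.log(removeQueryParam('/test/v1?key=keyval', 'key'));
console.log(removeQueryParam('/test/v1?param1=value1&key=keyval', 'key'));
console.log(removeQueryParam('/test/v1?param1=value1&key=keyval&param2=value2', 'key'));
```

Two caveats: URLSearchParams re-serializes the remaining parameters, so their percent-encoding may be normalized, and it only splits on &, not on the ; separator that the PHP pattern also accepted.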

What is the memory issue in this RegEx function

I am trying to scrape a web page for email addresses. I almost have it working, but there seems to be some kind of huge memory error that makes the page freeze when my script loads.
This is what I have:
var bodyText = document.body.textContent.replace(/\n/g, " ").split(' '); // Location to pull our text from. In this case it's the whole body
var r = new RegExp("[a-z0-9!#$%&'*+\/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+\/=?^_`{|}~-]+)*#(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])", 'i');
function validateEmail(string) {
return r.test(string);
}
var domains = [];
var domain;
for (var i = 0; i < bodyText.length; i++){
domain = bodyText[i].toString();
if (validateEmail(domain)) {
domains.push(domain);
}
}
The only thing I can think of is that the email validating function I'm using is a 32 step expression and the page I'm running it on returns with over 3,000 parts, but I feel like this should be possible.
Here is a script that reproduces the error:
var str = "help.yahoo.com/us/tutorials/cg/mail/cg_addressguard2.html";
var r = new RegExp("[a-z0-9!#$%&'*+\/=?^_{|}~-]+(?:\.[a-z0-9!#$%&'*+\/=?^_{|}~-]+)*#(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])", 'i');
console.log("before:"+(new Date()));
console.log(r.test(str));
console.log("after:"+(new Date()));
What can I do to overcome the memory issue?
stribizhev has pointed out the solution in the comment: specify the regex in RegExp literal syntax. Another solution, as shown in the comment by sln, is to escape \ in the string literal properly.
I will not address what the correct regex for validating/matching an email address is in this answer, since it has been rehashed many times over.
To demonstrate what causes the problem, let us print the string passed to RegExp constructor to the console. Did you notice that some \ are missing?
[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*#(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])
The string above is what the RegExp constructor sees and compiles: the backslashes before both / characters and before both . characters are gone.
/ only needs to be escaped in a RegExp literal (since RegExp literals are delimited by /), and doesn't need to be escaped in the string passed to the RegExp constructor, so that omission doesn't cause any problem.
Below are equivalent examples showing how to write a regex to match / with RegExp literal and RegExp constructor:
/\//;
new RegExp("/");
However, since the \ in \. is not properly escaped in the string, instead of matching a literal ., the pattern allows any character (except a line terminator) to be matched.
As a result, instead of being a perfectly fine solution, these parts of the regex suffer from catastrophic backtracking:
(?:.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*
(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?.)+
Since . can match any character, the fragments above degenerate to the classic catastrophic backtracking pattern (A*)*. Reducing the regex to a strict subset of what it matches makes the problem easier to see:
(?:a[a]+)*
(?:[a](?:[a]*[a])?a)+
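A rough way to observe the effect is to run the first reduced fragment against a run of "a" characters that can never complete a match. This is only an illustrative sketch: the trailing b, the input sizes and the timeTest helper are my own additions, and the timings depend on the engine and machine, so none are shown here.

```javascript
// Reduced form of the degenerate fragment from the answer: (?:a[a]+)*
// The trailing "b" can never match, so the engine has to try every way of
// splitting the run of "a" characters before it can report failure.
var degenerate = /^(?:a[a]+)*b/;

// Hypothetical helper: tests n copies of "a" and reports elapsed milliseconds.
function timeTest(n) {
  var input = new Array(n + 1).join('a'); // n times "a", no "b"
  var start = Date.now();
  var matched = degenerate.test(input);
  return { n: n, matched: matched, ms: Date.now() - start };
}

[10, 20, 30].forEach(function (n) {
  var r = timeTest(n);
  console.log('n=' + r.n + ' matched=' + r.matched + ' ms=' + r.ms);
});
```

Keep n small when experimenting: the work grows exponentially with the length of the input.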
This is the solution with a RegExp literal, which is the same pattern as in the string literal in the question. You escaped it correctly for a RegExp literal, but then used it in the RegExp constructor:
var r = /[a-z0-9!#$%&'*+\/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+\/=?^_`{|}~-]+)*#(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])/i;
As for the equivalent RegExp constructor solution:
var r = new RegExp("[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*#(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])", "i");
Not exactly an answer to your question, but the first thing you need to do is to reduce the number of text parts you have to test with your "corrected" pattern. In your html example file, you have about 3300 text strings to test with a regex. Keep in mind that using a regex has a cost, so removing useless text parts is a priority:
var textParts = document.body.textContent
.split(/\s+/) // see the note
.filter(function(part) {
return part.length > 4 && part.length < 255 && part.indexOf('#') > 1;
});
alert(textParts.join("\n"));
Now you have only ~50 text parts to test.
note: if you want to take into account email addresses with spaces inside double quotes, you can try to change:
.split(/\s+/)
to
.split(/(?=[\s"])((?:"[^"\n\\]*(?:\\.[^"\n\\]*)*"[^"\s]*)*)(?:\s+|$)/)
(without any warranty)
About your pattern: the mistake in it has already been pointed out by other answers and comments, but note that you can probably obtain the same result (the same matches) faster with this one:
/\b\w[!#-'*+\/-9=?^-~-]*(?:\.[!#-'*+\/-9=?^-~-]+)*#[a-z0-9]+(?:-[a-z0-9]+)*\.[a-z0-9]+(?:[-.][a-z0-9]+)*\b/i
Here's an example with a less strict regex that's fast.
function getEmails(str) {
var r = /\b[A-Z0-9._%+-]+#[A-Z0-9.-]+\.[A-Z]{2,4}\b/ig;
var emails = [];
var e = null;
var n = 0;
while ((e = r.exec(str)) !== null) {
emails[n++] = e[0];
}
return emails;
}
function emailTest() {
var str = document.getElementsByTagName('body')[0].innerHTML;
var emails = getEmails(str);
document.getElementById('found').innerHTML=emails.join("\n");
}
emailTest();
#found {
color:green;
font-weight:bold;
}
<pre id="email_test">
test#test.test
foo#bar.baz.test
foo#bar.baz.longdomain
foo-bar#foo.bar
foo_bar99#foo.bar
foo#foo#foo.bar
foo$bar#33#test.test
foo+bar-baz%99#someplace.top
</pre>
<pre id="found"></pre>
