Regex produces different result in javascript

Why does this regex return an entirely different result in JavaScript compared to the online regex tester at http://www.gskinner.com/RegExr/?
var patt = new RegExp(/\D([0-9]*)/g);
"/144444455".match(patt);
The return in the console is:
["/144444455"]
The regexr tester, however, does return the correct group.
All I'm trying to do is extract the first amount inside a piece of text, regardless of whether that text starts with a "/" or contains a bunch of other useless information.

The regex does exactly what you tell it to:
\D matches a non-digit (in this case /)
[0-9]* matches a string of digits (144444455)
With the g flag, match() returns only the full matches and drops the capturing groups, so you will need exec() to access the content of the first capturing group:
var match = patt.exec(subject);
if (match != null) {
    result = match[1];
}
Or simply drop the \D entirely - I'm not sure why you think you need it in the first place...
Then, you should probably remove the /g modifier if you only want to match the first number, not all numbers in your text. So,
result = subject.match(/\d+/);
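should work just as well. Note that match() returns an array (or null when nothing matches), so the number itself is result[0].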
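As a minimal sketch of both suggestions (the helper name firstNumber and the extra sample strings are made up for illustration), the following compares match() with the g flag, exec() with the capturing group, and the simplified /\d+/ pattern:

```javascript
// Hypothetical helper: returns the first run of digits in `text`, or null.
function firstNumber(text) {
  const m = /\d+/.exec(text); // no g flag: stops at the first match
  return m === null ? null : m[0];
}

// With the g flag, match() lists full matches only; groups are dropped.
const withG = "/144444455".match(/\D([0-9]*)/g);

// exec() on the same pattern exposes the capturing group at index 1.
const viaExec = /\D([0-9]*)/.exec("/144444455");

console.log(withG);                     // full match, slash included
console.log(viaExec[1]);                // digits only
console.log(firstNumber("/144444455")); // digits only
console.log(firstNumber("no digits"));  // null
```

Returning null for the no-match case keeps the caller from indexing into a missing result.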

Related

check every occurrence of a special character followed by a whitespace using regex

I'm trying to check for every occurrence of a # in a string, not just one at the beginning of the string.
Something like this works for only one occurrence:
const comment = "#barnowl is cool"
const regex = /#[a-z]/i;
if (comment.charAt(0).includes("#")) {
    if (regex.test(comment)) {
        // do something
        console.log('test passed')
    } else {
        // do something else
    }
} else {
    // do other
}
But what if you have a textarea and a user uses # multiple times to reference other users? This test no longer works, because charAt(0) only looks at the first character of the string.
What regex test works when you have to check for a # mention followed by a space? I know I can ditch charAt(0) and use comment.includes("#"), but I want a regex pattern that checks whether there is a space afterwards.
So if the user types #username followed by a space, the regex should pass.
Adding \s like this doesn't seem to make the test pass:
const regex = /#[a-z]\s/i; // shouldn't this check for white space after a letter ?
demo:
https://jsbin.com/riraluxape/edit?js,console

I think your expression is very close. There are two things that are missing:
The [a-z] match is only looking for one character, so in order to look for multiple characters it needs to be [a-z]+.
The flags section is missing the g modifier, which lets methods such as match() return every match in the text instead of just the first one.
I believe the regular expression declaration should be adjusted to the following:
const regex = /#[a-z]+\s/ig;
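A small sketch of the first point, reusing the comment from the question: with a single [a-z] the pattern only passes when the name is one letter long, while [a-z]+ accepts a whole username before the whitespace. The variable names are illustrative.

```javascript
const oneLetter = /#[a-z]\s/i;  // "#", exactly one letter, then whitespace
const wholeName = /#[a-z]+\s/i; // "#", one or more letters, then whitespace

const comment = "#barnowl is cool";

console.log(oneLetter.test(comment));    // false: "#b" is followed by "a", not a space
console.log(wholeName.test(comment));    // true: "#barnowl" is followed by a space
console.log(wholeName.test("#barnowl")); // false: nothing follows the name
```

The g flag is left off here on purpose: test() on a global regex remembers lastIndex between calls, which can make repeated tests on the same regex object give surprising results.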

Is this what you want? Matching all the occurrences of the mention?
const regex = /#\w+/ig
I used \w here, a shorthand character class that matches any word character (letters, digits and underscore).

To check for multiple matches instead of only the first one, append g to the regex:
const regex = /#[a-z]*\s/ig;
Your regex with \s actually works, see: https://regex101.com/r/gyMyvB/1
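To illustrate what the g flag changes, here is a sketch with a made-up comment containing two mentions. Without g, match() returns only the first match; with g it returns every full match.

```javascript
const text = "#alice and #bob are here";

const first = text.match(/#[a-z]+\s/i); // result for the first match only
const all = text.match(/#[a-z]+\s/ig);  // array of every full match

console.log(first[0]); // "#alice "
console.log(all);      // [ "#alice ", "#bob " ]
```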

regex match not outputting the adjacent matches javascript

I was experimenting with regex in JavaScript and came across an issue. Consider the string str = "+d+a+". I was trying to output the characters in the string that are surrounded by +, so I used str.match(/\+[a-z]\+/ig). I expected ["+d+","+a+"], but I got just ["+d+"]; "+a+" is missing from the output. Why?

.match(/.../g) returns all non-overlapping matches. Your regex requires a + sign on each side. Given your target string:
+d+a+
^^^
  ^^^
Your matches would have to overlap in the middle in order to return "+a+".
You can use look-ahead and a manual loop to find overlapping matches:
var str = "+d+a+";
var re = /(?=(\+[a-z]\+))/g;
var matches = [], m;
while (m = re.exec(str)) {
    matches.push(m[1]);
    re.lastIndex++;
}
console.log(matches);
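In engines that support String.prototype.matchAll (ES2020), the same lookahead idea can be written without bumping lastIndex by hand, since matchAll steps past empty matches by itself. A sketch, assuming such an engine:

```javascript
const str = "+d+a+";

// The lookahead itself matches an empty string, so no character is consumed;
// the text we want is captured in group 1.
const matches = Array.from(str.matchAll(/(?=(\+[a-z]\+))/g), (m) => m[1]);

console.log(matches); // [ "+d+", "+a+" ]
```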

With regex, once a character has been consumed by a match, it can't be part of the next match.
For example, a regex like /aba/g wouldn't find 2 aba's in a string like "ababa", because the second "a" was already consumed by the first match.
However, that can be overcome by using a positive lookahead (?=...), because lookaheads just check what comes next without actually consuming it.
So a regex like /(ab)(?=(a))/g would return 2 capture groups with 'ab' and 'a' for each 'aba'.
But in this case the match just needs to be followed by 1 fixed character '+', so it can be simplified; you don't really need capture groups for this one.
Example snippet:
var str = "+a+b+c+";
var matches = (str.match(/\+[a-z]+(?=\+)/g) || []).map(function(m){return m + '+'}); // || [] guards against a null result when nothing matches
console.log(matches);
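The "ababa" case can be checked directly. This sketch assumes an engine with String.prototype.matchAll (ES2020); the variable names are illustrative.

```javascript
const input = "ababa";

// Consuming match: the second "aba" would need the "a" at index 2,
// which the first match has already used.
const consuming = input.match(/aba/g);

// Lookahead match: nothing is consumed, so both start positions are found.
const overlapping = Array.from(input.matchAll(/(?=(aba))/g), (m) => m[1]);

console.log(consuming);   // [ "aba" ]
console.log(overlapping); // [ "aba", "aba" ]
```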

How to extract a string that conforms to a regex?

Say I have a RegEx like the following:
^[a-zA-Z]\w{12}$
And I have the following string:
%7AgTy!5hG^vxWa2#AgW
I would like to "pull" out of that string something that conforms to that regex. In this example we would get the following:
AgTy5hGvxWa2A
Reason: it starts with A because the regex says the first letter must be [a-zA-Z] (so it skips the first 2 characters), and then it pulls successive \ws out until it reaches 12 characters.
Is this sort of thing possible?
Edit: My apologies for being unclear. I'm not looking for a new regular expression that will give the proper output. Rather, I'm looking for a way to use the existing RegEx to extract the proper output. In my program these regular expressions are entered by hand by the user to extract a password from a long base256 hash such that it will conform to these existing password requirement regexes.

Instead of trying to match what you want and reconstructing the string, replace everything you don't want with nothing. This gives the impression that you're extracting what you need, but in fact it does the opposite: it gets rid of everything you don't want to extract. I also dropped $ from the end of your original pattern; otherwise it would never match the string you present in your question.
The cleanup pattern:
^[^a-z]+|\W+
^ Assert position at the start of the line
[^a-z]+ Matches any character that is not in the range a-z one or more times. Since the i flag is specified, the range also covers A-Z, so uppercase letters are not matched either
\W+ Match any non-word character one or more times
const regex = /^[^a-z]+|\W+/gi
const a = [
    `%7AgTy!5hG^vxWa2#AgW`,
    `%7AgTy!5hG^vxWa2#`
]
a.forEach(function(s) {
    var clean = s.replace(regex, '')
    var match = clean.match(/^[a-z]\w{12}/i)
    console.log(match)
})
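To see why the trailing $ had to go: after cleaning, the sample string is 15 characters long, so a pattern anchored at both ends and requiring exactly 13 characters cannot match it. A quick check (variable names are illustrative):

```javascript
const cleaned = "%7AgTy!5hG^vxWa2#AgW".replace(/^[^a-z]+|\W+/gi, "");

const anchored = cleaned.match(/^[a-z]\w{12}$/i); // original pattern, with $
const prefix = cleaned.match(/^[a-z]\w{12}/i);    // $ dropped

console.log(cleaned);   // "AgTy5hGvxWa2AgW"
console.log(anchored);  // null
console.log(prefix[0]); // "AgTy5hGvxWa2A"
```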

Escape single backslash inbetween non-backslash characters only

I have some input coming into a web page, which I will redisplay and submit elsewhere. Before submitting it, I want to double up every single backslash that is sandwiched between non-backslash characters.
For the test string "domain\name\\nonSingle\\\WontBe\\\\Returned", I want to get only the first single backslash, between domain and name.
This string should match nothing: "\\get\\\nothing\\\\"
The closest pattern I have so far is [\w][\\](?!\\), but it returns "n\" from the first test string above. I would like to use a lookbehind, but the version of JavaScript I am using does not support it. I have been testing my regexes at http://www.regexpal.com/
Currently I am inefficiently using this regex [\w][\\](?!\\) to extract every single backslash sandwiched between non-backslash characters together with the character before it (which I don't want), and then replacing the match with the same string plus a trailing backslash.
For example, given domain\name\\bl\\\ah, my current regex [\w][\\](?!\\) will return "n\". This forces my code to do some additional processing rather than just using replace.
I don't care about any double, triple or quadruple backslashes present, they can be left alone.

You wrote: For example given domain\name\\bl\\\ah my current regex [\w][\\](?!\\) will return "n\". This results in my code having to do some additional processing rather than just using replace.
A plain replace is enough, since you can insert the matched substring with $& in the replacement string:
console.log(String.raw`domain\name\\bl\\\ah`.replace(/\w\\(?!\\)/g, "$&\\"))

The easiest way to match escapes is to match all escaped characters:
\\(.)
And then in the replacement, decide what to do with it based on what was captured.
var s = "domain\\name\\\\backslashesInDoubleBackslashesWontBeReturned";
console.log('input:', s);
var r = s.replace(/\\(.)/g, function (match, capture1) {
    // Keep escaped backslashes (\\) as they are; mark every other escape with '$'.
    return capture1 === '\\' ? match : '$' + capture1;
});
console.log('result:', r);
The closest you can get to actually matching the unescaped backslashes is
((?:^|[^\\])(?:\\\\)*)\\(?!\\)
It will match an odd number of backslashes, and capture all but the last one into capture group 1.
var re = /((?:^|[^\\])(?:\\\\)*)\\(?!\\)/g;
var s = "domain\\name\\\\escapedBackslashes\\\\\\test";
var parts = s.split(re);
console.dir(parts);
var cleaned = [];
for (var i = 1; i < parts.length; i += 2)
{
    cleaned.push(parts[i-1] + parts[i]);
}
cleaned.push(parts[parts.length - 1]);
console.dir(cleaned);
The even-numbered (counting from zero) items will be unmatched text. The odd-numbered items will be the captured text.
Each captured text should be considered part of the preceding text.
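If the goal is strictly to double backslashes that stand alone and leave longer runs untouched, one lookbehind-free option is to match each whole run of backslashes and decide in a callback. This is a different technique from the ones in the answers, sketched under that assumption about the requirement; the helper name is made up. Note that it also doubles a lone backslash at the very start or end of the string.

```javascript
// Hypothetical helper: doubles every backslash run of length exactly 1.
function doubleLoneBackslashes(input) {
  return input.replace(/\\+/g, (run) => (run.length === 1 ? "\\\\" : run));
}

// The two test strings from the question, written with String.raw so the
// backslashes appear exactly as typed.
const a = String.raw`domain\name\\nonSingle\\\WontBe\\\\Returned`;
const b = String.raw`\\get\\\nothing\\\\`;

console.log(doubleLoneBackslashes(a)); // only the backslash after "domain" is doubled
console.log(doubleLoneBackslashes(b)); // unchanged
```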

match a string not after another string

This
var re = /[^<a]b/;
var str = "<a>b";
console.log(str.match(re)[0]);
matches >b.
However, I don't understand why this pattern /[^<a>]b/ doesn't match anything. I want to capture only the "b".

The reason why /[^<a>]b/ doesn't match anything is that the class excludes <, a, and > as individual characters (so rewriting it as /[^><a]b/ would do the same thing), and in "<a>b" the character before b is >. I doubt this is what you want, though. Try the following:
var re = /<a>(b)/;
var str = "<a>b";
console.log(str.match(re)[1]);
This regex looks for a string that looks like <a>b first, but it captures the b with the parentheses. To access the b, simply use [1] when you call .match instead of [0], which would return the entire string (<a>b).

What you're using here matches a b preceded by any character that is not listed in the character class. In the syntax [^a-z+-], the a-z+- part is a set of characters (in this case, the range of lowercase Latin letters, a plus sign and a minus sign). So, what your regex pattern [^<a]b matches is any b preceded by a character that is NOT < or a. Since > isn't in that set, it matches.
A character class basically works the same as a list of characters that are separated by OR pipes: [abcd] matches the same as (a|b|c|d). Character classes just have the extra ability to express that same set as a range, [a-d], using a dash between the endpoints. Putting a ^ at the start of a class turns it into a negated one, so it will match anything BUT the characters in that class.
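A short sketch of these points, with sample characters chosen for illustration: a class, the equivalent alternation, and the range form accept the same characters, and a leading ^ inverts the class.

```javascript
const asClass = /^[abcd]$/;
const asAlternation = /^(a|b|c|d)$/;
const asRange = /^[a-d]$/;
const negated = /^[^a-d]$/;

for (const ch of ["a", "c", "e", ">"]) {
  console.log(ch, asClass.test(ch), asAlternation.test(ch), asRange.test(ch), negated.test(ch));
}

// Applied to the question: ">" is excluded by [^<a>] but not by [^<a].
console.log(/[^<a]b/.test("<a>b"));  // true, matches ">b"
console.log(/[^<a>]b/.test("<a>b")); // false
```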
What you are looking for is a negative lookahead. Those can exclude something from matching longer strings. Those work in this format: (?!do not match) where do not match uses the normal regex syntax. In this case, you want to test if the preceding string does not match <a>, so just use:
(?!<a>)(.{3}|^.{0,2})b
That will match the b when it is either preceded by three characters that are not <a>, or by fewer than three characters at the start of the line.
PS: what you are probably looking for is the "negative lookbehind", which wasn't available in JavaScript regular expressions when this was written; it was added to the language in ES2018. With it, the pattern would be (?<!<a>)b, as in other languages. On JavaScript engines without lookbehind support, you'll have to use the alternative regex above.
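A sketch comparing the two forms. It assumes an engine that implements lookbehind (ES2018 and later); on older engines the first pattern is a syntax error and only the lookahead-based alternative applies. The sample strings are illustrative.

```javascript
const lookbehind = /(?<!<a>)b/;               // needs lookbehind support
const alternative = /(?!<a>)(.{3}|^.{0,2})b/; // lookahead-based workaround

for (const s of ["<a>b", "<c>b", "b", "xb"]) {
  console.log(s, lookbehind.test(s), alternative.test(s));
}
```

Both patterns reject "<a>b" and accept the other three samples. The alternative also consumes the characters before the b, so its match is longer than the single b that the lookbehind version returns.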

You could write a pattern to match the anchor tags and then replace them with an empty string:
var str = "<a>b</a>";
str = str.replace(/((<a[\w\s=\[\]\'\"\-]*>)|<\/a>)/gi,'')
This will reduce the following strings to 'b':
<a>b</a>
<a class='link-l3'>b</a>
To get more familiar with regex patterns, you may find the regExPal website very useful.
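Wrapped as a function for a quick check (the name stripAnchorTags is made up; the pattern is the answer's, with the slash in </a> escaped so the regex literal parses). The character class only allows word characters, whitespace, =, brackets, quotes and hyphens inside the opening tag, so attributes containing other characters, such as a URL with slashes, are not covered by this sketch.

```javascript
// Hypothetical helper built from the answer's pattern.
function stripAnchorTags(html) {
  return html.replace(/((<a[\w\s=\[\]\'\"\-]*>)|<\/a>)/gi, "");
}

console.log(stripAnchorTags("<a>b</a>"));                 // "b"
console.log(stripAnchorTags("<a class='link-l3'>b</a>")); // "b"
```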

Your code:
var re = /[^<a>]b/;
var str = "<a>b";
console.log(str.match(re));
Why is [^<a>]b not matching anything?
[^<a>]b means any character except <, a, or >, followed by b.
Here b is preceded by >, so it will not match.
If you want to match b, then you need to write it like this:
var re = /(?:[\<a\>])(b)/;
var str = "<a>b";
console.log(str.match(re)[1]);
