How to find 2 matches with a regular expression - JavaScript

I need a regular expression for:
<<12.txt>> <<45.txt>
I have created this regular expression:
<<.+.txt>>
But it finds only one match in the whole string, whereas there should be 2 matches:
<<12.txt>>
<<45.txt>>
If anyone has a solution for this problem, please help me out.

Part of the issue is that the string you've specified wouldn't match because the second > is missing in <<45.txt>.
Also, you're using the . (dot) wildcard, which matches any character, and also trying to find a literal period with it. It works, but not how you think it does.
Here's the regex you want:
var regex = /<<\d+\.txt>>/g
\d matches only digits
\. matches an actual period
/g means global, so it won't stop at the first match
Practice Regular Expressions
https://regexr.com/43bs4
Demo
var string = "<<12.txt>> <<45.txt>>";
var regex = /<<\d+\.txt>>/g;
var matches = string.match(regex);
console.log(matches);
P.S. If you actually want to match either one > or two >> at the end, you can use:
var regex = /<<\d+\.txt>>?/g
? optionally matches the character right before it
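To see why the original pattern finds only one match, a small comparison may help. This is only a sketch: the variable names are illustrative, and the input assumes both tags are closed with >>.

```javascript
// Input with both tags correctly closed (an assumption, per the answer above).
const input = "<<12.txt>> <<45.txt>>";

// Greedy ".+" runs on to the last ".txt>>", so the whole string is one match.
const greedyMatches = input.match(/<<.+.txt>>/g);

// "\d+" cannot cross the first ">>", so each tag is matched separately.
const digitMatches = input.match(/<<\d+\.txt>>/g);

console.log(greedyMatches); // [ '<<12.txt>> <<45.txt>>' ]
console.log(digitMatches);  // [ '<<12.txt>>', '<<45.txt>>' ]
```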

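/<<.+?\.txt>>/gm
g is for global (finds every match instead of stopping at the first)
m is for multiline mode (^ and $ also match at line breaks)
The lazy .+? and the escaped \. are needed so that each <<...>> block is matched separately.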

Related

Javascript - Regular Expression with string template followed by 4 digits number?

Good day. I want to check the URL string in the <a> tag
Link
to see whether it matches the pattern: ?post_type=tribe_events&p=#### (#### = a 4-digit number)
I'm writing some jQuery code to test the expression, but the console is throwing this error:
Invalid regular expression: /^(?)post_type=tribe_events&p=^(d{4})/:
Invalid group
var str = $(a).attr("href");
var regexEx = /^(?)post_type=tribe_events&p=^(d{4})/;
var ok = regexEx.exec(str);
console.log(ok);
I'm not good at regex, so I'd appreciate any help.
There are a couple of issues in your regex.
You need to remove the leading ^, which denotes the start of the string; in your case the ? is in the middle of the string, not at its start.
You need to escape ?, as it has a special meaning in regex (zero or one occurrence of the preceding token).
You need to remove the second ^ after p=, which isn't needed.
You need to write \d, not just d, to represent a digit.
Also, you don't need to group ? and \d{4} unless you really need the captures.
Your corrected regex becomes:
\?post_type=tribe_events&p=\d{4}
Demo
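A quick way to check the corrected pattern is to run it against a few sample values. The href strings below are hypothetical and only illustrate the idea.

```javascript
// Hypothetical href values; only the first contains the expected query string.
const hrefs = [
  "https://example.com/?post_type=tribe_events&p=1234",
  "https://example.com/?post_type=tribe_events&p=12",
  "https://example.com/?post_type=page&p=1234",
];

// "\?" is a literal question mark, "\d{4}" is exactly four digits.
const eventPattern = /\?post_type=tribe_events&p=\d{4}/;

const results = hrefs.map((href) => eventPattern.test(href));
console.log(results); // [ true, false, false ]
```

Because the pattern is not anchored at the end, a value such as p=12345 would also pass; appending (?!\d) is one possible way to rule that out.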
If the test is really what you want, I suppose the right syntax would be:
/^\?post_type=tribe_events&p=\d{4}/

Javascript: Remove trailing chars from string if they are non-numeric

I am passing codes to an API. These codes are alphanumeric, like this one: M84.534D
I just found out that the API does not use the trailing letters. In other words, the API is expecting M84.534, no letter D at the end.
The problem I am having is that the format is not the same for the codes.
I may have M84.534DAC, or M84.534.
What I need to accomplish before sending the code is to remove any non-numeric characters from the end of the code, so in the examples:
M84.534D -> I need to pass M84.534
M84.534DAC -> I also need to pass M84.534
Is there any function or regex that will do that?
Thank you in advance to all.
You can use the regex below. It removes anything at the end of the string that is not a digit.
let code = 'M84.534DAC'
console.log(code.replace(/[^0-9]+?$/, ""));
[^0-9] matches anything that is not a digit
+? matches one or more times, as few as possible (lazy); the $ that follows still forces the match to run to the end
$ matches the end of the string
Put together, it matches any run of non-digits at the end of the string and replaces it with nothing.
You could use the following expression:
\D*$
As in:
var somestring = "M84.534D".replace(/\D*$/, '');
console.log(somestring);
Explanation:
\D stands for not \d, the star * means zero or more times (greedily) and the $ anchors the expression to the end of the string.
Given your limited data sample, this simple regular expression does the trick. You just replace the match with an empty string.
I've used document.write just so we can see the results. You use this whatever way you want.
var testData = [
'M84.534D',
'M84.534DAC'
]
var regex = /\D+$/
testData.forEach((item) => {
var cleanValue = item.replace(regex, '')
document.write(cleanValue + '<br>')
})
RegEx breakdown:
\D = Anything that's not a digit
+ = One or more occurrences
$ = End of line/input
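The answers above could be wrapped in a small helper, roughly as follows. The function name stripTrailingNonDigits is made up for this sketch; the pattern is the \D+$ one already shown.

```javascript
// Hypothetical helper: strip any run of non-digit characters at the end.
function stripTrailingNonDigits(code) {
  return code.replace(/\D+$/, "");
}

console.log(stripTrailingNonDigits("M84.534D"));   // M84.534
console.log(stripTrailingNonDigits("M84.534DAC")); // M84.534
console.log(stripTrailingNonDigits("M84.534"));    // M84.534 (unchanged)
```

Note that a trailing period counts as a non-digit too, so a code such as M84.D would come back as M84.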

match a string not after another string

This
var re = /[^<a]b/;
var str = "<a>b";
console.log(str.match(re)[0]);
matches >b.
However, I don't understand why this pattern /[^<a>]b/ doesn't match anything. I want to capture only the "b".
The reason why /[^<a>]b/ doesn't match anything is that the class excludes <, a, and > as individual characters, so rewriting it as /[^><a]b/ would do the same thing. I doubt this is what you want, though. Try the following:
var re = /<a>(b)/;
var str = "<a>b";
console.log(str.match(re)[1]);
This regex looks for a string that looks like <a>b first, but it captures the b with the parentheses. To access the b, simply use [1] when you call .match instead of [0], which would return the entire string (<a>b).
What you're using here matches a b preceded by any character that is not listed in the character class. In a syntax like [^a-z+-], a-z+- is a set of characters (in this case, the range of lowercase Latin letters, a plus sign and a minus sign). So what your first pattern, /[^<a]b/, matches is any b preceded by a character that is NOT < or a. Since > is not in that set, it matches >b.
The range selector basically works the same as a list of characters that are separated by OR pipes: [abcd] matches the same as (a|b|c|d). Range selectors just have the extra functionality of also matching that same set via [a-d], using a dash between the end points of a range. Putting a ^ at the start of a range turns this positive range selector into a negative one, so it will match anything BUT the characters in that range.
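The equivalence described above can be checked with a short sketch. The patterns are anchored with ^ and $ here so that each test looks at exactly one character; the variable names are illustrative.

```javascript
// Three ways to accept exactly one of a, b, c, d.
const classPattern = /^[abcd]$/;
const altPattern = /^(a|b|c|d)$/;
const rangePattern = /^[a-d]$/;

// Negated class: exactly one character that is NOT a, b, c or d.
const negatedPattern = /^[^a-d]$/;

for (const ch of ["a", "d", "e", ">"]) {
  console.log(
    ch,
    classPattern.test(ch),
    altPattern.test(ch),
    rangePattern.test(ch),
    negatedPattern.test(ch)
  );
}
```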
What you are looking for is a negative lookahead. Those can exclude something from matching longer strings. Those work in this format: (?!do not match) where do not match uses the normal regex syntax. In this case, you want to test if the preceding string does not match <a>, so just use:
(?!<a>)(.{3}|^.{0,2})b
That will match the b when it is either preceded by three characters that are not <a>, or by fewer characters that are at the start of the line.
PS: what you are probably looking for is the "negative lookbehind", written (?<!<a>)b. It was not available in JavaScript regular expressions until ES2018, although other languages have had it for longer. In a JavaScript engine without lookbehind support, you'll have to use the alternative regex above.
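As a rough check of the lookahead workaround given above, the pattern can be tried on a few inputs. The sample strings are made up for illustration.

```javascript
// Workaround pattern from the answer above: the lookahead rejects "<a>"
// at the position where the three characters before "b" would start.
const notAfterTag = /(?!<a>)(.{3}|^.{0,2})b/;

console.log(notAfterTag.test("<a>b")); // false: "b" directly follows "<a>"
console.log(notAfterTag.test("xyzb")); // true: "b" follows three other characters
console.log(notAfterTag.test("b"));    // true: "b" at the start of the string
```

The match itself still includes the characters in front of the b, so the b alone is not what match returns.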
You could write a pattern that matches the anchor tags and replace them with an empty string:
var str = "<a>b</a>";
str = str.replace(/((<a[\w\s=\[\]\'\"\-]*>)|<\/a>)/gi,'');
This reduces each of the following strings to 'b':
<a>b</a>
<a class='link-l3'>b</a>
To get more familiar with regex patterns, you may find this website very useful: regExPal
Your code :
var re = /[^<a>]b/;
var str = "<a>b";
console.log(str.match(re));
Why does [^<a>]b not match anything?
[^<a>]b means: any character except <, a or >, followed by b.
Here b is preceded by >, so it will not match.
If you want to match b, you need something like this:
var re = /(?:[\<a\>])(b)/;
var str = "<a>b";
console.log(str.match(re)[1]);
DEMO And EXPLANATION

Regular expression with asterisk quantifier

This documentation states this about the asterisk quantifier:
Matches the preceding character 0 or more times.
It works in something like this:
var regex = /<[A-Za-z][A-Za-z0-9]*>/;
var str = "<html>";
console.log(str.match(regex));
The result of the above is : <html>
But when tried on the following code to get all the "r"s in the string below, it only returns the first "r". Why is this?
var regex = /r*/;
var str = "rodriguez";
console.log(str.match(regex));
Why, in the first example does it cause "the preceding" character/token to be repeated "0 or more times" but not in the second example?
var regex = /r*/;
var str = "rodriguez";
The regex engine will first try to match r in rodriguez from left to right and since there is a match, it consumes this match.
The regex engine then tries to match another r, but the next character is o, so it stops there.
Without the global flag g (used as so var regex = /r*/g;), the regex engine will stop looking for more matches once the regex is satisfied.
Try using:
var regex = /a*/;
var str = "cabbage";
The match will be an empty string, despite there being a's in the string! This is because the regex engine first tries to find a in cabbage from left to right, but the first character is c. Since this doesn't match, the regex tries to match 0 times, which succeeds right there. The regex is thus satisfied and the matching ends here.
It might be worth pointing out that * alone is greedy, which means it will first try to match as many as possible (the 'or more' part from the description) before trying to match 0 times.
To get all r from rodriguez, you will need the global flag as mentioned earlier:
var regex = /r*/g;
var str = "rodriguez";
You'll get all the r, plus all the empty strings inside, since * also matches 'nothing'.
Use global switch to match 1 or more r anywhere in the string:
var regex = /r+/g;
In your other regex:
var regex = /<[A-Za-z][A-Za-z0-9]*>/;
You're matching a literal < followed by a letter, followed by 0 or more letters or digits, followed by >, so it will perfectly match <html>
But if you have input as <foo>:<bar>:<abc> then it will just match <foo> not other segments. To match all segments you need to use /<[A-Za-z][A-Za-z0-9]*>/g with global switch.
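The difference between these variants can be seen side by side in a short sketch. The variable names are illustrative.

```javascript
const surname = "rodriguez";

// Without "g": only the first match, found at index 0.
const first = surname.match(/r*/)[0];

// With "g": every match, including the empty ones that "*" allows.
const starMatches = surname.match(/r*/g);

// "+" requires at least one "r", so no empty matches appear.
const plusMatches = surname.match(/r+/g);

// "a*" succeeds with zero characters at index 0 of "cabbage".
const emptyFirst = "cabbage".match(/a*/)[0];

console.log(first);                              // r
console.log(starMatches.filter((m) => m !== "")); // [ 'r', 'r' ]
console.log(plusMatches);                        // [ 'r', 'r' ]
console.log(JSON.stringify(emptyFirst));         // ""
```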

Regex to find last token on a string

I was wondering if there is a way having this
var string = "foo::bar"
To get the last part of the string: "bar" using just regex.
I was trying to do look-aheads but couldn't master them enough to do this.
--
UPDATE
Perhaps some examples will make the question clearer.
var st1 = "foo::bar::0"
match should be 0
var st2 = "foo::bar::0-3aab"
match should be 0-3aab
var st3 = "foo"
no match should be found
You can use a negative lookahead:
/::(?!.*::)(.*)$/
The result will then be in the capture.
Another approach:
/^.*::(.*)$/
This should work because the .* matches greedily, so the :: will match the last occurrence of that string.
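Both patterns can be tried against the three examples from the question. The capture helper below is hypothetical and exists only for this sketch.

```javascript
// Hypothetical helper: returns capture group 1, or null when nothing matches.
function capture(pattern, text) {
  const m = text.match(pattern);
  return m ? m[1] : null;
}

const lookaheadPattern = /::(?!.*::)(.*)$/; // "::" with no later "::"
const greedyPattern = /^.*::(.*)$/;         // greedy ".*" runs to the last "::"

const samples = ["foo::bar::0", "foo::bar::0-3aab", "foo"];
for (const s of samples) {
  console.log(s, capture(lookaheadPattern, s), capture(greedyPattern, s));
}
```

For the first two samples both patterns capture the text after the final ::, and for foo both return null.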
Simply,
/::(.+)$/
Lookaheads aren't needed here, because you're only looking at the end of the string ($). Note that (.+) starts right after the first ::, so this returns the last token only when the string contains a single ::.
I wouldn't use regular expressions for this (although you certainly can); I'd split the string on ::, since that's conceptually what you want to do.
function lastToken(str) {
var xs = str.split('::');
return xs.length > 1 ? xs.pop() : null;
}
If you really want just a regular expression, you can use /::((?:[^:]|:(?!:))*)$/. First, it matches a literal ::. Then, we use parentheses to put the desired thing in capturing group 1. The desired thing is zero or more copies of a (?:...)-bracketed string; this bracketing groups without capturing. Inside it, we look for either [^:], a non-colon character, or :(?!:), a colon that is not followed by another colon. The (?!...) is a negative lookahead, which matches only if what follows doesn't match the contained pattern. Without a lookbehind (not available in JavaScript before ES2018), I can't see a good way to avoid matching the :: as well, but you can wrap this in a function:
function lastTokenRegex(str) {
var m = str.match(/::((?:[^:]|:(?!:))*)$/);
return m && m[1];
}
var string2 = string.replace(/.*::/, "");
though perhaps string isn't the best choice of name for your string?
