JavaScript regular expression to have unique contents - javascript

I want a regular expression which allows
__$1__, __$2__, ... __$9__
or
__$an alphanumeric word up to 6 characters__
in a string...
I have tried with below expression but it's not working as required:
/^.*(\_\_\\$[1-9]{1}\_\_|\_\_\\$[a-zA-Z0-9]{0,6}\_\_)\1{1}.*$/;
Also, there should not be any repeated $ content.

I'd go with:
/__\$([1-9]|[A-Za-z0-9]{1,6})__/
This should fit your requirements except for:
Also, there should not be any repeated $ content.
I guess this can't be accomplished using just Regular Expressions, at least as far as I know...
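One way to cover the "no repeated $ content" rule is a second step in plain JavaScript rather than in the pattern itself. The sketch below is only an illustration: the function name hasUniquePlaceholders is made up, and it assumes an environment with String.prototype.matchAll (ES2020).

```javascript
// Hypothetical helper: true when no __$...__ placeholder content repeats.
function hasUniquePlaceholders(text) {
  // Global flag is required by matchAll; group 1 holds the content after "$".
  const pattern = /__\$([1-9]|[A-Za-z0-9]{1,6})__/g;
  const seen = new Set();
  for (const match of text.matchAll(pattern)) {
    if (seen.has(match[1])) {
      return false; // same content seen before
    }
    seen.add(match[1]);
  }
  return true;
}
```

For example, hasUniquePlaceholders("__$1__ and __$name__") is true, while a string containing __$1__ twice gives false.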

How about this?
/__\$([1-9]|[A-Za-z0-9]{1,6})__/
or, if the word must start with a letter:
/__\$([1-9]|[A-Za-z][A-Za-z0-9]{0,5})__/

Related

Regex testing for special characters

I'm trying to write a regex to test for certain special characters, but I think I am overcomplicating things. The characters I need to check for are: &<>'"
My current regex looks like such:
/&<>'"/
Another I was trying is:
/\&\<\>\'\"/
Any tips for a beginner (in regards to regex)? Thanks!
You are looking for a character class:
/[&<>'"]/
In doing so, any of the characters in the square brackets will be matched.
The expression you were originally using, /&<>'"/, wasn't working as expected because it matches the characters in that sequential order. In other words, it would match a full string such as &<>'" but not &<.
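As a quick illustration of the character class, here is a minimal sketch; the function name containsSpecialChar is just a placeholder for this example.

```javascript
// Hypothetical helper: true if the text contains any of & < > ' "
function containsSpecialChar(text) {
  // No "g" flag here, so test() carries no state between calls.
  return /[&<>'"]/.test(text);
}
```

With this, containsSpecialChar("&<") is true even though the characters are not in the full &<>'" sequence, and containsSpecialChar("plain text") is false.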
I'm assuming that you want to be able to match all of the characters you listed, at one time.
If so, you should be able to combine a character set with the g (global-matching) flag, for your regex.
Here's what it could look like:
/[<>&'"]/g
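A small sketch of what the g flag buys you, assuming you want a list of every special character found; listSpecialChars is a made-up name.

```javascript
// Hypothetical helper: returns every special character found, in order.
function listSpecialChars(text) {
  // With the "g" flag, match() returns all matches (or null when there are none).
  return text.match(/[<>&'"]/g) || [];
}
```

One thing to watch: a regex object with the g flag keeps its lastIndex between test() or exec() calls, so for a simple yes/no check it is usually safer to leave the flag off.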
Try /(\&|\<|>|\'|\")/
It depends on which regex system you use.

JS regular expression to match imdb url

Can anyone tell me what is wrong with this JavaScript code?
"http://www.imdb.com/title/tt2618986/".match("~http://(?:.*\.|.*)imdb.com/(?:t|T)itle(?:\?|/)(..\d+)~i");
When I try it at https://regex101.com/r/yT7bG4/1 it works, but it does not work in JavaScript.
The way you create a regular expression in JavaScript is /pattern/flags. The code you are looking for is something along the lines of:
"http://www.imdb.com/title/tt2618986/".match(/http:\/\/(?:.*\.|.*)imdb\.com\/(?:t|T)itle(?:\?|\/)(..\d+)/i);
You have to escape all of the / in the regular expression so the / become part of the regular expression instead of indicating the end of it. I would suggest reading this article if you want to learn more about regular expressions in JavaScript.
Also, https://regex101.com/ has a JavaScript option on the left, under the 'FLAVOR' banner, which may help knowing which flags are valid.
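For what it's worth, here is a slightly tighter sketch of the same idea wrapped in a function. The name extractImdbId, the optional https, the subdomain handling, and the assumption that ids always look like "tt" plus digits are mine, not from the question.

```javascript
// Hypothetical helper: returns the title id (e.g. "tt2618986") or null.
function extractImdbId(url) {
  // Regex literal: /pattern/flags. Forward slashes inside the pattern are escaped.
  // (?:[\w-]+\.)* allows zero or more subdomains such as "www.".
  const pattern = /^https?:\/\/(?:[\w-]+\.)*imdb\.com\/title[?\/](tt\d+)/i;
  const match = url.match(pattern);
  return match ? match[1] : null;
}
```

Calling extractImdbId("http://www.imdb.com/title/tt2618986/") gives "tt2618986", and a URL on another domain gives null.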
You are using the PCRE (PHP) flavor in regex101. You should select the JavaScript flavor.
Note that JavaScript's RegExp has no '~' delimiter. This is why your code is not working.
You should write something like:
"http://www.imdb.com/title/tt2618986/".match(/http:\/\/(?:.*\.|.*)imdb\.com\/(?:t|T)itle(?:\?|\/)(..\d+)/i);
In your case:
The / symbol must be escaped, like this: \/
There is no '~' delimiter in JavaScript.
The resulting code with the regular expression is:
"http://www.imdb.com/title/tt2618986/".match(/http:\/\/(?:.*\.|.*)imdb\.com\/title(?:\?|\/)(..\d+)/i)
P.S. Use the 'i' modifier to do a case-insensitive search.

Solving regular expression recursive strings

The Problem
I could match this string
(xx)
using this regex
\([^()]*\)
But it wouldn't match
(x(xx)x)
So, this regex would
\([^()]*\([^()]*\)[^()]*\)
However, this would fail to match
(x(x(xx)x)x)
But again, this new regex would
\([^()]*\([^()]*\([^()]*\)[^()]*\)[^()]*\)
This is where you can notice the replication: the part of the second regex after the first \( and before the last \) is copied and replaces the centermost [^()]*. Of course, this last regex wouldn't match
(x(x(x(xx)x)x)x)
But you could always replace the centermost [^()]* with [^()]*\([^()]*\)[^()]*, as we did for the last regex, and it will capture more (xx) groups. The more you add to the regex, the more it can handle, but it will always be limited by how much you add.
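To make the replication concrete, the wrapping step can be generated in a loop. This is only a sketch of the pattern-growing idea described above; buildNestedPattern is a made-up name, and the ^ and $ anchors are added here so the whole string has to match.

```javascript
// Hypothetical helper: builds the regex for exactly `depth` levels of nesting.
function buildNestedPattern(depth) {
  // Depth 1 is the plain \([^()]*\) pattern.
  let source = "\\([^()]*\\)";
  for (let level = 2; level <= depth; level++) {
    // Wrap the previous pattern in one more pair of parentheses.
    source = "\\([^()]*" + source + "[^()]*\\)";
  }
  return new RegExp("^" + source + "$");
}
```

buildNestedPattern(3) accepts (x(x(xx)x)x), but buildNestedPattern(2) does not, which is exactly the limitation: the depth is fixed when the pattern is built.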
So, how do you get around this limitation and capture a group of parentheses (or any two delimiter characters, for that matter) that can contain extra groups within it?
Falsely Assumed Solutions
I know you might think to just use
\(.*\)
But this will match all of
(xx)xx)
when it should only match the sub-string (xx).
Even this
\([^)]*\)
will not match pairs of parentheses that have pairs nested like
(xx(xx)xx)
From this, it'll only match up to (xx(xx).
Is it possible?
So is it possible to write a regex that can match groups of parentheses? Or is this something that must be handled by a routine?
Edit
The solution must work in the JavaScript implementation of Regular Expressions
If you want to match only when the round brackets are balanced, you cannot do it with a regex alone.
A better way would be to:
1. Match the string using \(.*\)
2. Count the number of ( and ) characters and check whether they are equal. If they are, you have the match.
3. If they are not equal, use \([^()]*\) to match the required string.
Formally speaking, this isn't possible using regular expressions! Regular expressions define regular languages, and regular languages cannot describe balanced parentheses.
However, it turns out that this is the sort of thing people need to do all the time, so lots of regex engines have been extended beyond formal regular expressions. .NET, for example, supports balancing groups, as described in this article: http://weblogs.asp.net/whaggard/archive/2005/02/20/377025.aspx . JavaScript's built-in regex engine, however, has neither balancing groups nor recursive patterns, so that technique does not carry over to it.
Personally though, I think it's best to solve a complex problem like this with your own function rather than leveraging the extended features of a Regex engine.
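A minimal sketch of such a function, using a depth counter instead of a regex. The name findBalancedGroups and its default delimiters are just for this example, and it only reports outermost groups that are properly closed.

```javascript
// Hypothetical helper: returns each outermost balanced group, delimiters included.
function findBalancedGroups(text, open = "(", close = ")") {
  const groups = [];
  let depth = 0;  // how many groups are currently open
  let start = -1; // index where the current outermost group began
  for (let i = 0; i < text.length; i++) {
    const ch = text[i];
    if (ch === open) {
      if (depth === 0) start = i;
      depth++;
    } else if (ch === close && depth > 0) {
      depth--;
      if (depth === 0) groups.push(text.slice(start, i + 1));
    }
    // A closing delimiter with nothing open is simply skipped.
  }
  return groups;
}
```

For the strings from the question, findBalancedGroups("(xx)xx)") returns ["(xx)"], and findBalancedGroups("(x(x(x(xx)x)x)x)") returns the whole string as a single group, whatever the nesting depth.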

Get Part of String

I am not good at regular expressions and couldn't find an easy way to solve this problem.
I have an expression like:
TR_NN_Expression
where NN is a two-digit number, and Expression can contain '_', so I can't use split for this. I would like to get the Expression. Any help would be greatly appreciated.
You can use this regular expression:
TR_[0-9]{2}_(.*)
The part you want will be in the capturing group. Example usage:
> s = 'TR_01_My##34_Expresion'
"TR_01_My##34_Expresion"
> s.match(/TR_[0-9]{2}_(.*)/)[1]
"My##34_Expresion"
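If the input might not always have the expected shape, indexing [1] directly will throw when match returns null. A small sketch that guards against that; the function name extractExpression and the added ^ and $ anchors are my assumptions.

```javascript
// Hypothetical helper: returns the part after "TR_NN_", or null if the format differs.
function extractExpression(input) {
  // ^ and $ anchor the pattern so the prefix must be at the start of the string.
  const match = /^TR_[0-9]{2}_(.*)$/.exec(input);
  return match ? match[1] : null;
}
```

So extractExpression("TR_01_My##34_Expresion") gives "My##34_Expresion", and a string without the TR_NN_ prefix gives null instead of an error.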
I always use and recommend this tool; it makes life easier:
Interactive multi-language regular expression generator
Enjoy!
If the prefix is of fixed length and you know that the strings are of the correct format you can just use substring to accomplish this.
"TR_42_some_expression_here".substring(6) // yields "some_expression_here"
If you have a more complicated situation, regular expressions may be appropriate. The exact expression depends on what you wish to capture.

Match altered version of first match with only one expression?

I'm writing a brush for Alex Gorbatchev's Syntax Highlighter to get highlighting for Smalltalk code. Now, consider the following Smalltalk code:
aCollection do: [ :each | each shout ]
I want to find the block argument ":each" and then match "each" every time it occurs afterwards (for simplicity, let's say every occurrence and not just those inside the brackets).
Note that the argument can have any name, e.g. ":myArg".
My attempt to match ":each":
\:([\d\w]+)
This seems to work. The problem is for me to match the occurrences of "each". I thought something like this could work:
\:([\d\w]+)|\1
But the right hand side of the alternation seems to be treated as an independent expression, so backreferencing doesn't work.
Is it even possible to accomplish what I want in a single expression? Or would I have to use the backreference within a second expression (via another function call)?
You could do it in languages that support variable-length lookbehind (AFAIK only the .NET framework languages do; Perl 6 might). There you could highlight a word if it matches (?<=:(\w+)\b.*)\1. But JavaScript didn't support lookbehind at all when this was written (lookbehind assertions were only added later, in ES2018).
But anyway this regex would be very inefficient (I just checked a simple example in RegexBuddy, and the regex engine needs over 60 steps for nearly every character in the document to decide between match and non-match), so this is not a good idea if you want to use it for code highlighting.
I'd recommend you use the two-step approach you mentioned: First match :(\w+)\b (word boundary inserted for safety, \d is implied in \w), then do a literal search for match result \1.
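The two-step approach could look roughly like this in JavaScript. It is only a sketch: findBlockArgumentUses is a made-up name, it handles just the first block argument, and it is not tied to the Syntax Highlighter brush API.

```javascript
// Hypothetical helper: returns the start index of each later use of the block argument.
function findBlockArgumentUses(source) {
  // Step 1: find the declaration, e.g. ":each", and capture the name.
  const declaration = /:(\w+)\b/.exec(source);
  if (declaration === null) return [];
  const name = declaration[1]; // only \w characters, so no regex escaping is needed

  // Step 2: literal whole-word search for that name, starting after the declaration.
  const usePattern = new RegExp("\\b" + name + "\\b", "g");
  usePattern.lastIndex = declaration.index + declaration[0].length;

  const positions = [];
  let use;
  while ((use = usePattern.exec(source)) !== null) {
    positions.push(use.index);
  }
  return positions;
}
```

For "aCollection do: [ :each | each shout ]" this finds one use, the second "each"; the colon in "do:" is skipped because no word character follows it.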
I believe the only thing stored by the Regex engine between matches is the position of the last match. Therefore, when looking for the next match, you cannot use a backreference to the match before.
So, no, I do not think that this is possible.
