Regex thinks I'm nesting, but I'm not - javascript

I wrote this regexp to capture the strings below.
\!\[(.*?)?\]
All the strings below should match and return an optional string that's inside the first set of square brackets.
![]
![caption]
![]()
![caption]()
![caption][]
The problem is that this string also matches and returns ][ because the regex thinks it's between the first [ and last ].
![][] // Should not match, but does and returns "]["
How do I fix this?

Just remove the ? after the group (.*?); it is redundant, because .*? can already match an empty string.
var myArray = ["![abc]", "![caption]", "![def]()", "![caption]()", "![caption][]"];
myArray.forEach(function(current) {
    console.log(/!\[(.*?)\]/.exec(current)[1]);
});
Output
abc
caption
def
caption
caption

Use this regex:
\!\[([^\]]*)\]
The negated character class [^\]]* matches any character except ], so the capture stops at the first closing bracket instead of running on to a later one.
This should solve your issue.

My preference is the following, if you also want to avoid matching things like ![[]]:
\!\[([^\[\]]*)\]
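As a rough check of the two negated-class patterns suggested above, the sketch below runs them against inputs from the question. The helper name firstBracket is hypothetical and exists only for this illustration.

```javascript
// Hypothetical helper: returns the text captured inside the first [...],
// or null when the pattern does not match at all.
function firstBracket(pattern, input) {
  var m = pattern.exec(input);
  return m ? m[1] : null;
}

var noClosing = /!\[([^\]]*)\]/;     // inner text may not contain ]
var noBrackets = /!\[([^\[\]]*)\]/;  // inner text may contain neither [ nor ]

console.log(firstBracket(noClosing, "![][]"));        // empty string
console.log(firstBracket(noClosing, "![caption][]")); // caption
console.log(firstBracket(noClosing, "![[]]"));        // [
console.log(firstBracket(noBrackets, "![[]]"));       // null
```

Note that "![][]" still matches with both patterns (on the leading "![]"), but the capture is now the empty string rather than "][".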

Related

Or Condition in a regular expression

I have a string in which I need to get the value between either "[ValueToBeFetched]" or "{ValueToBeFetched}".
var test = "I am \"{now}\" doing \"[well]\"";
test.match(/"\[.*?]\"/g)
the above regex serves the purpose and gets the value between square brackets and I can use the same for curly brackets also.
test.match(/"\{.*?}\"/g)
Is there a way to do this with a single regex, using something like an "or" operator ({|[)?
I tried some scenarios but they don't seem to work.
Thanks in advance.
You could try following regex:
(?:{|\[).*?(?:}|\])
Details:
(?:{|\[): non-capturing group that matches { or [
.*?: matches as few characters as possible
(?:}|\]): non-capturing group that matches } or ]
Code in JavaScript:
var test = "I am \"{now}\" doing \"[well]\"";
var result = test.match(/"(?:{|\[).*?(?:}|\])"/g);
console.log(result);
Result (the double quotes are part of each match, because the regex includes them):
["\"{now}\"", "\"[well]\""]
As you said, there is an "or" operator, which is |.
For example, to capture every line that begins with an "a" or a "b":
/^((a|b).*)/gm
In this example, if a line begins with a or b, the entire line is captured in the first group.
You can test your regex with an online regex tester.
For your specific case, try something like this, and use an online regex tester to understand how it works:
((\[|\{)\w*(\]|\}))
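One caveat with the patterns above is that they also accept mismatched pairs such as {value]. A minimal sketch of an alternative, assuming you want each opener paired with its own closer and only the inner value returned, puts the whole bracket pair inside each branch of the alternation. The variable names are illustrative only.

```javascript
var test = 'I am "{now}" doing "[well]"';

// Each branch carries its own opener and closer, so {...] cannot match.
var pattern = /"(?:\{(.*?)\}|\[(.*?)\])"/g;

var values = [];
var m;
while ((m = pattern.exec(test)) !== null) {
  // Only one of the two groups participates in any given match.
  values.push(m[1] !== undefined ? m[1] : m[2]);
}

console.log(values); // [ 'now', 'well' ]
```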

Javascript Regex - Get All string that has square brackets []

I have string data below:
var data = "somestring[a=0]what[b-c=twelve]----[def=one-2]test"
I need to get all strings that contain square brackets []. This is the result that I want.
["[a=0]", "[b-c=twelve]", "[def=one-2]"]
I've tried using the regex /\[(.*?)\]/, but only the first array element is what I expect; the next element is the same value without the square brackets.
data.match(/\[(.*?)\]/);
// result => ["[a=0]", "a=0"]
What regexp should I use to achieve the result that I want? Thank you in advance.
You want to use the g (global) modifier to find all matches. Since the brackets are included in the match result, you don't need a capturing group, and I used a negated character class instead to reduce backtracking.
someVar.match(/\[[^\]]*]/g);
In /\[(.*?)\]/, *? is lazy: it matches as little content as possible.
What you actually want is every match in the string, so add the g modifier.
Try this one: http://regex101.com/r/aD6cM8/1. Each match starts with [ and ends with ], and allows neither [ nor ] in between.
someVar.match(/\[([^\[\]]*)\]/g)
You should just add the g flag to your regex:
someVar.match(/\[(.*?)\]/g);
which results in
[ "[a=0]", "[b-c=twelve]", "[def=one-2]" ]
Your regex is correct, just suffix g to it to make it global:
someVar.match(/\[(.*?)\]/g);
Here's more info on it: http://www.w3schools.com/jsref/jsref_regexp_g.asp
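To make the difference between the answers concrete, here is a small sketch using the data string from the question. It compares match without g, match with g, and matchAll (available since ES2020) for the case where the inner text is wanted as well.

```javascript
var data = "somestring[a=0]what[b-c=twelve]----[def=one-2]test";

// Without g: only the first match, followed by its capture group.
var first = data.match(/\[([^\]]*)\]/);
console.log(first[0], first[1]); // [a=0] a=0

// With g: every full match; capture groups are not reported.
var all = data.match(/\[[^\]]*\]/g);
console.log(all); // [ '[a=0]', '[b-c=twelve]', '[def=one-2]' ]

// matchAll keeps the capture groups for every match.
var inner = Array.from(data.matchAll(/\[([^\]]*)\]/g), function (m) {
  return m[1];
});
console.log(inner); // [ 'a=0', 'b-c=twelve', 'def=one-2' ]
```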

Extract specific chars from a string using a regex

I need to split an email address and take out the first character and the first character after the '#'
I can do this as follows:
'bar#foo'.split('#').map(function(a){ return a.charAt(0); }).join('')
--> bf
Now I was wondering if it can be done using a regex match, something like this
'bar#foo'.match(/^(\w).*?#(\w)/).join('')
--> bar#fbf
That's not really what I want, but I'm sure I'm missing something here. Any suggestions?
Why use a regex for this? Just use indexOf to find the separator and read the character after it:
var addr = 'foo#bar';
console.log(addr[0], addr[addr.indexOf('#')+1])
To ensure your code works on all browsers, you might want to use charAt instead of []:
console.log(addr.charAt(0), addr.charAt(addr.indexOf('#')+1));
Either way, it'll work just fine, and it is very likely the fastest approach.
If you do want to stick with a regex, note that the match method returns an array containing 3 strings in your case:
/^(\w).*?#(\w)/
["the whole match", // start of string + first char + .*?# + first char after #
"group 1 \w", // first char
"group 2 \w" // first char after #
]
So addr.match(/^(\w).*?#(\w)/).slice(1).join('') is probably what you want.
If I understand correctly, you are quite close. Just don't join everything returned by match because the first element is the entire matched string.
'bar#foo'.match(/^(\w).*?#(\w)/).slice(1).join('')
--> bf
Using regex:
var matched = "";
// The callback collects each captured character and returns the match unchanged.
'abc#xyz'.replace(/(?:^|#)(\w)/g, function($0, $1) { matched += $1; return $0; });
console.log(matched);
// ax
Without the g flag, the match function returns an array whose first element is the full text of the match, followed by one element per capture group. In your case, it returns this:
bar#f
b
f
To get rid of the first item (the full match), use slice:
'bar#foo'.match(/^(\w).*?#(\w)/).slice(1).join('\r')
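One thing the snippets above leave out is that match returns null when the pattern does not match, so calling slice on the result would throw. A minimal sketch with a guard follows; the function name initials is hypothetical and not part of the original question.

```javascript
// Hypothetical helper: returns the first character plus the first character
// after '#', or null when the input does not have that shape.
function initials(addr) {
  var m = addr.match(/^(\w).*?#(\w)/);
  // m[0] is the whole match; the capture groups start at index 1.
  return m ? m.slice(1).join('') : null;
}

console.log(initials('bar#foo'));      // bf
console.log(initials('no-separator')); // null
```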
Use String.prototype.replace with regular expression:
'bar#foo'.replace(/^(\w).*#(\w).*$/, '$1$2'); // "bf"
Or using RegEx
^([a-zA-Z0-9])[a-zA-Z0-9.!#$%&'*+\/=?^_`{|}~-]+#([a-zA-Z0-9-])[a-zA-Z0-9-]+(?:\.[a-zA-Z0-9-]+)*$

Regex multiple matches for HTML attributes

I want to match multiple data-i18n attributes with a JavaScript regexp.
I tried the following regexp :
var regexp = /(data\-i18n="[^"]+")/g;
which seemed rather straightforward to me, but it ended up not working.
If you try to match on the following HTML tag :
<a random-attr="ok" data-i18n="first match" data-i18n="second match">my text</a>
doing an exec like this :
/(data\-i18n="[^"]+")/g.exec('<a random-attr="ok" data-i18n="first match" data-i18n="second match">my text</a>')
produces the following problem:
There are two entries in the result, but they are duplicates of the same match.
The result is :
[ 'data-i18n="first match"',
'data-i18n="first match"',
index: 20,
input: '<a random-attr="ok" data-i18n="first match" data-i18n="second match">my text</a>' ]
Any ideas on how to have multiple matches for my attribute ?
Thanks in advance !
The problem isn't with your regex; it's with how you're expecting exec to behave. The return value of exec has the full match at position 0, and then the match of each capture group following that. Since you wrapped the whole regex in a capturing group, you're seeing the same string at positions 0 and 1 of the array.
The right way to use a global regex with exec is to keep calling exec until it returns null; it will return the next match each time. However, if you use String.match(Regexp), it will return what you expect - an array containing all of the matches.
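Both approaches described above can be sketched as follows, using the HTML string from the question. Moving the capturing group around only the attribute value is an assumption about what is ultimately wanted; the variable names are illustrative.

```javascript
var html = '<a random-attr="ok" data-i18n="first match" data-i18n="second match">my text</a>';

// Approach 1: call exec repeatedly on the same global regex until it returns null.
// The regex object remembers its position in lastIndex between calls.
var regexp = /data-i18n="([^"]+)"/g;
var values = [];
var m;
while ((m = regexp.exec(html)) !== null) {
  values.push(m[1]); // capture group 1: the attribute value only
}
console.log(values); // [ 'first match', 'second match' ]

// Approach 2: String.prototype.match with a global regex returns all full matches.
var whole = html.match(/data-i18n="[^"]+"/g);
console.log(whole); // [ 'data-i18n="first match"', 'data-i18n="second match"' ]
```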

Javascript RegExp matching returning too many

I need to take a string and get some values from it. I have this string:
'tab/tab2/tab3'
The '/tab3' is optional so this string should also work:
'tab/tab2'
I currently am trying this which works for the most part:
'tab/tab2/tab3'.match(new RegExp('^tab/([%a-zA-Z0-9\-\_\s,]+)(/([%a-zA-Z0-9-_s,]+)?)$'));
This will return:
["tab/tab2/tab3", "tab2", "/tab3", "tab3"]
but I want it to return
["tab/tab2/tab3", "tab2", "tab3"]
So I need to get rid of the 3rd index item ("/tab3") and also get it to work with just the 'tab/tab2' string.
To complicate it even more, I only have control over the /([%a-zA-Z0-9-_s,]+)? part in the last grouping meaning it will always wrap in a grouping.
You don't need a regex for this; just use the split() method:
var str = 'tab/tab2/tab3';
var arr = str.split('/');
console.log(arr[0]); //tab
console.log(arr[1]); //tab2
I used this regexp to do this:
'tab/tab2/tab3'.match(new RegExp('^tab/([%a-zA-Z0-9\-\_\s,]+)(?:/)([%a-zA-Z0-9-_s,]+)$'));
Now I get this return
["tab/tab2/tab3", "tab2", "tab3"]
Now I just need to allow 'tab/tab2' to be accepted as well...
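One possible way to accept both forms, sketched under the assumption that the whole pattern can be edited, is to wrap the slash and the last segment in an optional non-capturing group. Note also that inside a string literal a backslash has to be doubled, otherwise '\s' reaches RegExp as a plain s. The names segment, pattern and parse are hypothetical.

```javascript
// Character class shared by both segments; the dash goes last so it is literal.
var segment = '[%a-zA-Z0-9_\\s,-]+';

// (?:/(...))? makes "/tab3" optional without adding an extra capture for the slash.
var pattern = new RegExp('^tab/(' + segment + ')(?:/(' + segment + '))?$');

// Hypothetical helper: returns the two segments, or null when the path does not match.
function parse(path) {
  var m = path.match(pattern);
  return m ? [m[1], m[2]] : null;
}

console.log(parse('tab/tab2/tab3')); // [ 'tab2', 'tab3' ]
console.log(parse('tab/tab2'));      // [ 'tab2', undefined ]
```

When the optional part is absent, the second group is undefined rather than an empty string, so callers need to check for that.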
Do not put the regex between " or ' quotes; use a regex literal instead, and add the g flag for a global search, otherwise only the first occurrence is returned:
"tab/tab2/tab3".match(/tab[0-9]/g)
