Regex to match all `,` not inside [] - JavaScript

Here is the example string which I have to match:
var sampleStr = "aaa[bbb=55,zzz=ddd],#ddd[ppp=33,kk=77,rr=fff],tt,ff";
I need to write a regex that matches every , character that is not inside [ ].
So for my sample string it should match the following , characters:
- `,` before `#ddd`
- `,` before `tt`
- `,` before `ff`
and it should ignore the following , characters:
- `,` before `zzz`
- `,` before `kk`
- `,` before `rr`
I have no idea how to ignore the , characters inside [...].
Thanks in advance for any help.

If you can assume that the part inside [] doesn't contain nested [], and the [] are balanced:
var out = sampleStr.split(/,(?![^\[\]]*\])/);
(?![^\[\]]*\]) is a negative look-ahead that heuristically checks that the comma is not inside []. It scans forward over characters other than [ and ]; if the first bracket it reaches is ], the comma is inside [] and is rejected. If it reaches [ or the end of the string first, the comma is outside [].
The code above splits the text on the commas outside brackets [] and returns the tokens.
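As a quick check, the split can be run against the sample string from the question. This is a minimal sketch that assumes, as the answer does, that the brackets are balanced and not nested.

```javascript
// Sample string from the question.
const sampleStr = "aaa[bbb=55,zzz=ddd],#ddd[ppp=33,kk=77,rr=fff],tt,ff";

// Split on every comma that is NOT followed by "]" before any other bracket.
const tokens = sampleStr.split(/,(?![^\[\]]*\])/);

console.log(tokens);
// → ["aaa[bbb=55,zzz=ddd]", "#ddd[ppp=33,kk=77,rr=fff]", "tt", "ff"]
```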

This regex should work
,(?![^\[]*?\])
Explanation
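, is our target comma.
(?![^\[]*?\]) is a negative lookahead that rejects the comma when a ] follows it with no [ in between, which means the comma is inside []. The trick is to use [^\[]* instead of .*, so the lookahead cannot run past an opening [ and mistake the ] of a later [...] group for the end of the current one.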
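To see which commas this pattern actually selects, one option is to list the index of each match in the sample string. This is only an illustrative sketch; the variable names are made up for the example.

```javascript
// Sample string from the question.
const sampleStr = "aaa[bbb=55,zzz=ddd],#ddd[ppp=33,kk=77,rr=fff],tt,ff";

// The g flag is required by matchAll; each match object carries the comma's index.
const outsideCommas = /,(?![^\[]*?\])/g;
const positions = Array.from(sampleStr.matchAll(outsideCommas), (m) => m.index);

console.log(positions);
// → [19, 45, 48]  (the commas before "#ddd", "tt" and "ff")
```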

One way to avoid commas enclosed in square brackets is to match square brackets first. Example for a replacement:
var result = sampleStr.replace(/([^\[,]*(?:\[[^\]]*\][^\[,]*)*),/g, '$1#');
Another example, if you want to split:
var result = sampleStr.match(/(?=[^,])[^\[,]*(?:\[[^\]]*\][^\[,]*)*/g);
The advantage of these approaches is that the regex does not have to run a lookahead from every comma, which can scan a long way ahead (up to the end of the string) each time.
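Both variants can be tried on the sample string. In this sketch the replacement character is ";" rather than "#", purely so it is easy to tell apart from the "#" already present in "#ddd"; that choice is an assumption made for the illustration.

```javascript
// Sample string from the question.
const sampleStr = "aaa[bbb=55,zzz=ddd],#ddd[ppp=33,kk=77,rr=fff],tt,ff";

// A token is bracket-free text, optionally followed by [...] groups and more text.
// The (?=[^,]) lookahead stops the pattern from producing empty matches at commas.
const tokens = sampleStr.match(/(?=[^,])[^\[,]*(?:\[[^\]]*\][^\[,]*)*/g);
console.log(tokens);
// → ["aaa[bbb=55,zzz=ddd]", "#ddd[ppp=33,kk=77,rr=fff]", "tt", "ff"]

// Replacement variant: capture the token before each outer comma, then swap the comma.
const replaced = sampleStr.replace(/([^\[,]*(?:\[[^\]]*\][^\[,]*)*),/g, "$1;");
console.log(replaced);
// → "aaa[bbb=55,zzz=ddd];#ddd[ppp=33,kk=77,rr=fff];tt;ff"
```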

RegEx for matching two single quotes from start and end of strings

I need to replace two quotes at the beginning and end of elements with one quote.
My code is:
const regex = /['']/g;
let categories = [];
let categoryArray = data.categories.split(','); // ''modiles2'', ''Men''s Apparel'', ''Women''s Apparel'', ''Jeans'', ''Sneakers''
for (let value of Object.values(categoryArray)) {
categories.push(value.replace(regex, '\''));
}
}
return categories
// What should be 'modiles2','Men''s Apparel','Women''s Apparel','Jeans','Sneakers'
// What comes back ''modiles2'',''Men''s Apparel'',''Women''s Apparel'',''Jeans'',''Sneakers''
My regular expression replaces only the first of two quotation marks with one quotation mark.
How do I solve this problem?
You need to match two quotation marks at the start or at the end of each element. Use this regex:
const regex = /^''|''$/g;
A character class matches any single character listed between the brackets, so repeating a character inside it has no effect: [''] is equivalent to ['].
"''modiles2'',''Men''s Apparel'',''Women''s Apparel'',''Jeans'',''Sneakers''"
.split(',')
.forEach(s => console.log( s.replace(/^''|''$/g, '\'') ))
This regex might help. You might consider using \x27 instead of ' just to be safe.
^(\x27{2,})(.+)(\x27{2,})$
For simplicity, this regex creates three capture groups. $2 holds the text between the doubled quotes, so a replacement of '$2' gives the target output.
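A minimal sketch of the capture-group approach, applied to the category string from the question. The replacement string "'$2'" (group 2 wrapped in single quotes) is an assumption about how the groups are meant to be used.

```javascript
// Input as shown in the question (elements separated by commas, no spaces).
const input = "''modiles2'',''Men''s Apparel'',''Women''s Apparel'',''Jeans'',''Sneakers''";

// Group 1: leading quotes, group 2: the content, group 3: trailing quotes.
const pattern = /^(\x27{2,})(.+)(\x27{2,})$/;

// Keep only group 2 and put a single quote on each side.
const categories = input.split(",").map((s) => s.replace(pattern, "'$2'"));

console.log(categories);
// → ["'modiles2'", "'Men''s Apparel'", "'Women''s Apparel'", "'Jeans'", "'Sneakers'"]
```

The doubled quote inside "Men''s" is left alone because only the quotes at the very start and end fall into groups 1 and 3.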
Your pattern uses a character class, [''], which matches any one of the listed characters. It could be written as ['], which is just '.
If there can be more than two quotes at the start or at the end, you could use the quantifier {2,} to match two or more single quotes and replace them with a single quote. To match exactly two single quotes, use ''.
^'{2,}|'{2,}$
^'{2,} Match 2 or more quotes at the start of the string
| Or
'{2,}$ Match 2 or more single quotes at the end of the string
Use the /g global flag to replace all occurrences.
console.log("''modiles2'',''Men''s Apparel'',''Women''s Apparel'',''Jeans'',''Sneakers''"
.split(',')
.map(s => s.replace(/^'{2,}|'{2,}$/g, "'")))
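The effect of the global flag is easy to check on a single element. Without /g, replace() stops after the first match, so only the leading quotes are collapsed. A short sketch:

```javascript
const element = "''Jeans''";

// Without g: only the first alternative that matches (the leading quotes) is replaced.
const withoutGlobal = element.replace(/^'{2,}|'{2,}$/, "'");

// With g: both the leading and the trailing run of quotes are replaced.
const withGlobal = element.replace(/^'{2,}|'{2,}$/g, "'");

console.log(withoutGlobal); // → 'Jeans''
console.log(withGlobal);    // → 'Jeans'
```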
Here is my two cents, in one line:
return data.categories.split(',').map(val => val.trim().replace(/^'{2,}|'{2,}$/g, "'"));

how to exclude escaped characters in split() javascript

I want to ignore the characters inside square brackets, because they match my split delimiters.
The string that I want to split is
var str = "K1.1.[Other] + K1.2A.[tcc + K*-=>]";
var split = str.split(/[+|,|*|/||>|<|=|-]+/);
I want the output as K1.1.[Other], K1.2A.[tcc + K*-=>].
But the code above also splits on the characters inside the square brackets, which I don't want. Any suggestions on how to solve this?
Split on the following pattern: /\+(?![^\[]*\])/
https://regex101.com/r/NZKaKD/1
Explanation:
\+ - A literal plus sign
(?! ... ) - Negative lookahead (don't match the previous character/group if it is followed by the contents of this block)
[^\[]* - Any number of non-left-square-brackets
\] - A literal right square bracket
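Applied to the string from the question, the pattern splits only on the plus sign outside the brackets. The trim() call in this sketch is an addition for readability, not part of the answer; without it the pieces keep the spaces around the "+".

```javascript
const str = "K1.1.[Other] + K1.2A.[tcc + K*-=>]";

// Split on "+" unless a "]" follows with no "[" in between (i.e. we are inside brackets).
const parts = str.split(/\+(?![^\[]*\])/).map((part) => part.trim());

console.log(parts);
// → ["K1.1.[Other]", "K1.2A.[tcc + K*-=>]"]
```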
Alternatively, split on both the plus signs and the brackets, then go through the chunks and rejoin everything between each pair of brackets.
But it is better not to use a regexp at all for this.
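One way to do this without a regexp is a single pass that tracks bracket depth and only splits at depth zero. The function name splitOutsideBrackets and its separators parameter are hypothetical, introduced only for this sketch.

```javascript
// Hypothetical helper: split `input` on any character in `separators`,
// but only when that character is not inside square brackets.
function splitOutsideBrackets(input, separators = "+") {
  const parts = [];
  let depth = 0;     // current bracket nesting level
  let current = "";  // text collected for the part being built

  for (const ch of input) {
    if (ch === "[") depth++;
    else if (ch === "]" && depth > 0) depth--;

    if (depth === 0 && separators.includes(ch)) {
      parts.push(current.trim()); // close the current part at an outer separator
      current = "";
    } else {
      current += ch;
    }
  }
  parts.push(current.trim());
  return parts;
}

console.log(splitOutsideBrackets("K1.1.[Other] + K1.2A.[tcc + K*-=>]"));
// → ["K1.1.[Other]", "K1.2A.[tcc + K*-=>]"]
```

Unlike the lookahead pattern, the depth counter also copes with nested brackets, at the cost of a few more lines of code.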

Remove text between square brackets at the end of string

I need a regex to remove the last bracketed expression (including the brackets).
source: input[something][something2]
target: input[something]
I've tried this, but it removes both:
"input[something][something2]".replace(/\[.*?\]/g, '');
Note that \[.*?\]$ won't work as it will match the first [ (because a regex engine processes the string from left to right), and then will match all the rest of the string up to the ] at its end. So, it will match [something][something2] in input[something][something2].
You may specify the end of string anchor and use [^\][]* (matching zero or more chars other than [ and ]) instead of .*?:
\[[^\][]*]$
See the JS demo:
console.log(
"input[something][something2]".replace(/\[[^\][]*]$/, '')
);
Details:
\[ - a literal [
[^\][]* - zero or more chars other than [ and ]
] - a literal ]
$ - end of string
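A few inputs make the behaviour of the anchored pattern concrete, with the lazy \[.*?\]$ pattern shown for contrast. The helper name stripLastGroup is made up for this sketch.

```javascript
// Remove a trailing [...] group, if the string ends with one.
const stripLastGroup = (s) => s.replace(/\[[^\][]*]$/, "");

console.log(stripLastGroup("input[something][something2]")); // → input[something]
console.log(stripLastGroup("input[something]"));             // → input
console.log(stripLastGroup("input[something]tail"));         // → input[something]tail (no match)

// For contrast: the lazy pattern starts at the FIRST "[" and runs to the end.
const lazy = "input[something][something2]".replace(/\[.*?\]$/, "");
console.log(lazy); // → input
```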
Another way is to use .* at the start of the pattern to grab the whole line, capture it, and then let it backtrack to get the last [...]:
console.log(
"input[something][something2]".replace(/^(.*)\[.*]$/, '$1')
);
Here, $1 is the replacement backreference to the value captured by the (.*) subpattern. However, this works a bit differently: the greedy .* captures everything up to the last [ in the string, and everything from that [ onward, including the bracket, is removed.
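The difference between the two patterns only shows up on unusual input, for example a stray "]" after the last "[". The input string and the helper names in this sketch are invented for illustration.

```javascript
const viaClass = (s) => s.replace(/\[[^\][]*]$/, "");      // negated-class version
const viaBacktrack = (s) => s.replace(/^(.*)\[.*]$/, "$1"); // greedy/backtracking version

// On the input from the question both give the same result.
console.log(viaClass("input[something][something2]"));     // → input[something]
console.log(viaBacktrack("input[something][something2]")); // → input[something]

// With a stray "]" after the last "[", they diverge.
const odd = "input[one][two]three]";
console.log(viaClass(odd));     // → input[one][two]three]  (no clean trailing group, unchanged)
console.log(viaBacktrack(odd)); // → input[one]             (everything from the last "[" is cut)
```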
Do not use the g modifier, and use the $ anchor:
"input[something][something2]".replace(/\[[^\]]*\]$/, '');
Try this code:
var str = "Hello, this is Mike (example)";
alert(str.replace(/\s*\(.*?\)\s*/g, ''));

match until an unescaped version of a character

I am processing a string of the form [enclosed str]outer str[enclosed str]
and am trying to match every [enclosed str].
The problem is that I want to allow any character inside the square brackets except an unescaped ] (that is, a ] not preceded by a \).
For instance
str = 'string[[enclosed1\\]]string[enclosed2]';
// match every [ followed by anything other than ], then a ]
str.match(/\[[^\]]+]/g)
// returns ["[[enclosed1\]", "[enclosed2]"]
// ignores the `]` after `\\]`
// match word and non-word char enclosed by []
str.match(/\[[\w\W]+]/g)
// returns ["[[enclosed1\]]string[enclosed2]"]
// matches to the last ]
// making it less greedy with /\[[\w\W]+?]/g
// returns same result as /\[[^\]]+]/g
Is it possible with a JavaScript RegExp to achieve my desired result, which is
["[[enclosed1\]]", "[enclosed2]"]
If you cannot rely on negative lookbehind (JavaScript only gained it in ES2018), this is the best I could come up with:
/(?:^|[^\\])(\[.*?[^\\]\])/g
group 1 will contain the string you want.
https://regex101.com/r/PmDcGH/3
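Since the wanted text is in capture group 1, str.match() with the g flag is not enough on its own; one option is matchAll(). The sketch below also shows an alternative pattern that is not from the answer above: it consumes backslash-escaped characters as pairs, so the full match is already the bracketed text.

```javascript
const str = 'string[[enclosed1\\]]string[enclosed2]';

// The answer's pattern: read capture group 1 from every match.
const viaGroup = Array.from(
  str.matchAll(/(?:^|[^\\])(\[.*?[^\\]\])/g),
  (m) => m[1]
);

// Alternative (an assumption, not the answer's method):
// "[" then any mix of escaped characters (\\.) or characters that are
// neither "]" nor a backslash, then the closing "]".
const viaEscapes = str.match(/\[(?:\\.|[^\]\\])*\]/g);

console.log(viaGroup);   // → ["[[enclosed1\\]]", "[enclosed2]"]
console.log(viaEscapes); // → ["[[enclosed1\\]]", "[enclosed2]"]
```

The first pattern consumes the character before each "[", so two groups written back to back, such as [a][b], would not both be found; the alternative does not have that restriction.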
