Regex solution for matching groups does not work

Imagine a text like in this example:
some unimportant content
some unimportant content [["string1",1,2,5,"string2"]] some unimportant content
some unimportant content
I need a REGEX pattern which will match the parts in [[ ]] and I need to match each part individually separated by commas.
I already tried
const regex = /\[\[(([^,]*),?)*\]\]/g
const found = result.match(regex)
but it doesn't work as expected. It matches only the full string and has no group matches. regex101.com also reports catastrophic backtracking when the sample text is larger.
Output should be a JS array ["string1", 1, 2, 5, "string2"]
Thank you for your suggestions.

What about going with a simple pattern like /\[\[(.*)\]\]/g and then you'd just have to split the result (and apparently strip those extra quotation marks):
const result = `some unimportant content
some unimportant content [["string1",1,2,5,"string2"]] some unimportant content
some unimportant content`;
// const found = /\[\[(.*)\]\]/g.exec(result);
const found = /\[\[(.*?)\]\]/g.exec(result); // As suggested by MikeM
const arr_from_found = found[1].replace(/\"/g, '').split(',');
console.log(arr_from_found); // [ 'string1', '1', '2', '5', 'string2' ]
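If the numbers should stay numbers, as in the desired output, one option is to let JSON.parse do the splitting. This is only a sketch: it assumes the text between the outer brackets is always a valid JSON array and that no string inside it contains "]]". The function name extractBracketedArrays is made up for this example.

```javascript
// Hypothetical helper: returns every [[ ... ]] block as a parsed array.
function extractBracketedArrays(text) {
  // Match the outer "[", capture the inner "[ ... ]" lazily, then the outer "]".
  const pattern = /\[(\[.*?\])\]/g;
  const arrays = [];
  for (const m of text.matchAll(pattern)) {
    // m[1] is e.g. ["string1",1,2,5,"string2"], which is valid JSON.
    arrays.push(JSON.parse(m[1]));
  }
  return arrays;
}

const sample = `some unimportant content
some unimportant content [["string1",1,2,5,"string2"]] some unimportant content
some unimportant content`;

console.log(extractBracketedArrays(sample)[0]);
// [ 'string1', 1, 2, 5, 'string2' ]
```

This also avoids the nested quantifier in the original pattern. A repeated capturing group keeps only its last iteration anyway, so it could not have returned one group per item.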

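Try the replace method. With a string as the first argument, replace only removes the first occurrence, so use a global regex to strip every bracket:
let cleantext = result.replace(/\[/g, "")
then
let more_cleantext = cleantext.replace(/\]/g, "")
but if your result variable is an array then just
result[0]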

Related

How do I use a Regex separator that looks like a named group?

I am splitting a string that contains substrings of this form:
"<at>foo bar</at>"
using this construct:
tokens = command.trim().split( /,\s+|,|\s+|(?=<at>)|(?=<\/at>)/ )
However, the result is an array:
["<at>foo", "bar", "</at>"]
How do I modify the regex to produce this instead?
["<at>", "foo", "bar", "</at>"]
Thanks in advance.

Instead of using split, you might match the parts
<\/?at>|[^<>\s]+
Regex demo
const regex = /<\/?at>|[^<>\s]+/g;
console.log(`<at>foo bar</at>`.match(regex));
Using split, the pattern could be
,\s*|\s+|(?<=<at>)|(?=<\/at>)
const regex = /,\s*|\s+|(?<=<at>)|(?=<\/at>)/g;
console.log(`<at>foo bar</at>`.split(regex))
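Lookbehind is not available in some older JavaScript engines. As a sketch of an alternative that avoids it, split can keep the tags when they are wrapped in a capturing group, because captured separators are included in the result. The function name tokenize is made up for this example, and it assumes the only tags are <at> and </at>.

```javascript
// Hypothetical helper: split on commas and whitespace, but keep <at> and </at> as tokens.
function tokenize(command) {
  return command
    .trim()
    // The capturing group makes split() emit the matched tag as its own element.
    // For the other alternatives the group is undefined, which is filtered out below.
    .split(/(<\/?at>)|,\s*|\s+/)
    // Drop the undefined entries and the empty strings produced at the edges.
    .filter(Boolean);
}

console.log(tokenize("<at>foo bar</at>"));
// [ '<at>', 'foo', 'bar', '</at>' ]
```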

Nice answer by @The fourth bird.
Another way could be splitting only the string between the tags:
const chars = "<at>foo bar</at>";
const tags = /<\/?at>/g;
const tokens = chars.replace(tags, "").split(/\s/);
const result = ["<at>", ...tokens, "</at>"]
console.log(result)

regex exclude matches that don't meet one of two patterns separated by delimiter

In Javascript using string.match():
I have a string like: foo_2:asc,foo2:desc,foo3,foo4:wrong
the matches should look like ["foo_2:asc", "foo2:desc", "foo3"]
but instead the best I can get so far is a match returning ["foo_2:asc", "foo2:desc", "foo3", "wrong"]
the regex I'm currently using for the above wrong match is: /([a-z0-9_]+?[:asc|:desc]*?)(?=,|$)/gi
I also need a regex that returns the opposite, i.e. every item between the delimiters that doesn't follow one of the patterns thing_1:asc, thing_1:desc, or thing_1. That one would be used to validate the string, while the other would be used to gather the values (instead of splitting the string manually). For the string above, the result would be ["foo4:wrong"], since that is the part that doesn't meet the pattern.

Assuming that the only valid forms are words followed by one of :asc, :desc or nothing, you can do what you want by splitting the string, first on , and then on : and checking whether there are two values as a result of the last split and the second is not one of asc or desc:
const str = 'foo_2:asc,foo2:desc,foo3,foo4:wrong';
const errs = str.split(',').filter(v => v.split(':').length == 2 && ['asc', 'desc'].indexOf(v.split(':')[1]) == -1);
console.log(errs);
If you must use regex, you can split on , and then filter based on the value not matching ^\w+(:(?:asc|desc))?$:
const str = 'foo_2:asc,foo2:desc,foo3,foo4:wrong';
const errs = str.split(',').filter(v => !v.match(/^\w+(:(?:asc|desc))?$/));
console.log(errs);
If the format of the string is guaranteed to be \w+(:\w+)?(,\w+(:\w+)?)* you can simplify to this:
const str = 'foo_2:asc,foo2:desc,foo3,foo4:wrong';
const errs = str.match(/\w+:(?!(?:asc|desc)\b)\w+/g);
console.log(errs);

If you'd like a regex for this purpose, you can probably just anchor each match at a comma or at the start of the string.
/(^|\,)([a-z0-9_]+?(:asc|:desc)*?)(?=,|$)/gi
Also note that [:asc|:desc] is changed to (:asc|:desc), to avoid false positives like:
foo5:aaa,foo6:d,foo7:,foo8|,et:c
Square brackets form a character class, so the original matches any single character listed inside them.
Regarding the opposite, try something like:
/(^|\,)(?!([a-z0-9_]+?(:asc|:desc)*?)(?=,|$))[^,$]+/gi
which seems to do the job.

For the match I came up with
/(?<=(^|,))((\w+(?!:)|\w+(:asc|:desc)))(?=($|,))/g
Example: https://regex101.com/r/QLJeDV/3/
> "foo_2:asc,foo2:desc,foo3,foo4:wrong".match(/(?<=(^|,))((\w+(?!:)|\w+(:asc|:desc)))(?=($|,))/g)
[ 'foo_2:asc', 'foo2:desc', 'foo3' ]
Or even
/(?<=(^|,))\w+(:asc|:desc)?(?=($|,))/g
should work. Example: https://regex101.com/r/QLJeDV/6/
> "foo_2:asc,foo2:desc,foo3,foo4:wrong".match(/(?<=(^|,))\w+(:asc|:desc)?(?=($|,))/g)
[ 'foo_2:asc', 'foo2:desc', 'foo3' ]
Both use lookahead and lookbehind.
For the "opposite", I don't know how to match something and then negate a later pattern. I only know how to negate whether a string is a complete match, so I had to split it first. The "opposite":
> "foo_2:asc,foo2:desc,foo3,foo4:wrong".split(",").filter(s => !/^((\w+(?!:)|\w+(:asc|:desc)))$/.test(s))
[ 'foo4:wrong' ]
and the "original":
> "foo_2:asc,foo2:desc,foo3,foo4:wrong".split(",").filter(s => /^((\w+(?!:)|\w+(:asc|:desc)))$/.test(s))
[ 'foo_2:asc', 'foo2:desc', 'foo3' ]
Or it can be simplified as:
> "foo_2:asc,foo2:desc,foo3,foo4:wrong".split(",").filter(s => !/^\w+(:asc|:desc)?$/.test(s))
[ 'foo4:wrong' ]
> "foo_2:asc,foo2:desc,foo3,foo4:wrong".split(",").filter(s => /^\w+(:asc|:desc)?$/.test(s))
[ 'foo_2:asc', 'foo2:desc', 'foo3' ]
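Since both lists come from the same test, the split-and-filter approach can produce them in one pass. The following is a minimal sketch; the names partitionSortTerms and VALID_TERM are made up for this example, and the case-insensitive flag is carried over from the regex in the question.

```javascript
// A term is a word, optionally followed by :asc or :desc.
// No g flag, so test() keeps no state between calls.
const VALID_TERM = /^\w+(?::(?:asc|desc))?$/i;

// Hypothetical helper: sort each comma-separated term into valid or invalid.
function partitionSortTerms(input) {
  const valid = [];
  const invalid = [];
  for (const term of input.split(',')) {
    (VALID_TERM.test(term) ? valid : invalid).push(term);
  }
  return { valid, invalid };
}

const { valid, invalid } = partitionSortTerms('foo_2:asc,foo2:desc,foo3,foo4:wrong');
console.log(valid);   // [ 'foo_2:asc', 'foo2:desc', 'foo3' ]
console.log(invalid); // [ 'foo4:wrong' ]
```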

Replace regular expression matches with array of values

I have a regular expression to find text with ??? inside a string.
const paragraph = 'This ??? is ??? and ???. Have you seen the ????';
const regex = /(\?\?\?)/g;
const found = paragraph.match(regex);
console.log(found);
Is there a way to replace every match with a value from an array in order?
E.g. with an array of ['cat', 'cool', 'happy', 'dog'], I want the result to be 'This cat is cool and happy. Have you seen the dog?'.
I saw String.prototype.replace() but that will replace every value.

Use a replacer function that shifts from the array of replacement strings (shift removes and returns the item at the 0th index):
const paragraph = 'This ??? is ??? and ???. Have you seen the ????';
const regex = /(\?\?\?)/g;
const replacements = ['cat', 'cool', 'happy', 'dog'];
const found = paragraph.replace(regex, () => replacements.shift());
console.log(found);
(If there are not enough items in the array to replace every match, the remaining ???s will be replaced by the string "undefined". Note that shift also empties the replacements array as it goes.)
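If the replacements array should stay intact, a counter can stand in for shift. This is only a sketch: the name fillPlaceholders is made up for this example, and leaving surplus placeholders unchanged is an assumed behavior, not something the question asked for.

```javascript
// Hypothetical helper: replace each placeholder with the next value, in order.
function fillPlaceholders(text, values, placeholder = /\?\?\?/g) {
  let next = 0;
  return text.replace(placeholder, (match) =>
    // When the values run out, keep the placeholder instead of inserting "undefined".
    next < values.length ? values[next++] : match
  );
}

const words = ['cat', 'cool', 'happy', 'dog'];
console.log(fillPlaceholders('This ??? is ??? and ???. Have you seen the ????', words));
// This cat is cool and happy. Have you seen the dog?
console.log(words.length); // 4, the array is not modified
```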

JavaScript Regex - Splitting a string into an array by the Regex pattern

Given an input field, I'm trying to use a regex to find all the URLs in the text fields and make them links. I want all the information to be retained, however.
So for example, I have an input of "http://google.com hello this is my content" -> I want to split that by the white space AFTER this regex pattern from another stack overflow question (regexp = /(ftp|http|https)://(\w+:{0,1}\w*#)?(\S+)(:[0-9]+)?(/|/([\w#!:.?+=&%#!-/]))?/) so that I end up with an array of ['http://google.com', 'hello this is my content'].
Another ex: "hello this is my content http://yahoo.com testing testing http://google.com" -> arr of ['hello this is my content', 'http://yahoo.com', 'testing testing', 'http://google.com']
How can this be done? Any help is much appreciated!

First transform all the groups in your regular expression into non-capturing groups ((?:...)) and then wrap the whole regular expression inside a group, then use it to split the string like this:
var regex = /((?:ftp|http|https):\/\/(?:\w+:{0,1}\w*#)?(?:\S+)(?::[0-9]+)?(?:\/|\/(?:[\w#!:.?+=&%#!-/]))?)/;
var result = str.split(regex);
Example:
var str = "hello this is my content http://yahoo.com testing testing http://google.com";
var regex = /((?:ftp|http|https):\/\/(?:\w+:{0,1}\w*#)?(?:\S+)(?::[0-9]+)?(?:\/|\/(?:[\w#!:.?+=&%#!-/]))?)/;
var result = str.split(regex);
console.log(result);
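The split above leaves the spaces around each URL in the neighbouring pieces and adds an empty string when the input starts or ends with a URL. A sketch of one way to tidy that up follows. It uses a deliberately simplified URL pattern (a scheme followed by non-whitespace) rather than the one from the question, and the name splitOnUrls is made up for this example.

```javascript
// Simplified, assumed URL pattern: scheme, "://", then everything up to the next whitespace.
// The outer capturing group makes split() keep the URLs in the result.
const URL_PATTERN = /((?:ftp|https?):\/\/\S+)/;

// Hypothetical helper: split text into URL and non-URL pieces.
function splitOnUrls(text) {
  return text
    .split(URL_PATTERN)
    .map((piece) => piece.trim())            // remove the spaces next to each URL
    .filter((piece) => piece.length > 0);    // drop empty leading/trailing pieces
}

console.log(splitOnUrls("http://google.com hello this is my content"));
// [ 'http://google.com', 'hello this is my content' ]
console.log(splitOnUrls("hello this is my content http://yahoo.com testing testing http://google.com"));
// [ 'hello this is my content', 'http://yahoo.com', 'testing testing', 'http://google.com' ]
```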

You had a few unescaped forward slashes in your RegExp.
var str = "hello this is my content http://yahoo.com testing testing http://google.com";
var captured = str.match(/(ftp|http|https):\/\/(\w+:{0,1}\w*#)?(\S+)(:[0-9]+)?(\/|\/([\w#!:.?+=&%#!-/]))?/g);
// Keep only the space-separated words that were not captured as URLs.
var nonCaptured = str.split(' ').filter(v => captured.indexOf(v) == -1);
console.log(nonCaptured, captured);

How to parameterise a string separated by commas with regex?

I have a string that looks like this:
{{tagName(21, 'hello, jane','smith')}}
I'm trying to use regex to match() this string to result in:
[0] = tagName
[1] = 21
[2] = 'hello, jane'
[3] = 'smith'
The parameter part of the string can grow. That is to say, it may have more or fewer parameters, and the regex needs to be "greedy" yet know how to group them.
I've been trying something like this: ^\{\{([^\(]+)\({1}(.*)\){1}\}\}
But this results in:
[0] = tagName
[1] = 21, 'hello, jane','smith'
What should I do to my regex to get the results I want?

Replace {, }, (, ) with empty string; match [a-z]+, \d+, '.+' followed by , or end of input
var str = "{{tagName(21, 'hello, jane','smith')}}";
var res = str.replace(/\{|\}|\(|\)/g, "")
.match(/([a-z]+)|\d+|('.+')(?=,)|('.+')(?=$)/ig);
console.log(res);

If you're ok with using two regexes, you can return a new array with function name and concat all the parameters onto that array.
With ES6 you can use the spread operator:
const str = "{{tagName(21, 'hello, jane','smith')}}";
const result = str.match(/^\{\{([^\(]+)\({1}(.*)\){1}\}\}/);
console.log([
result[1],
...result[2].match(/^\d+|'.*?'/g)
])
In ES5 you'll have to concat the parameters onto the array containing the function name as its first item:
var str = "{{tagName(21, 'hello, jane','smith')}}";
var result = str.match(/^\{\{([^\(]+)\({1}(.*)\){1}\}\}/);
console.log([result[1]].concat(result[2].match(/^\d+|'.*?'/g)))
In reality, you could concat in ES6, but

So far I've managed to come up with the following:
([A-Za-z]+)|('.*?'|[^',\s\(]+)(?=\s*,|\s*\))
Tested on https://regex101.com
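The two-step idea from the answers above can be wrapped in a single function. What follows is a sketch under stated assumptions: parameters are either single-quoted strings without escaped quotes or bare tokens without spaces or commas. The name parseTag is made up for this example.

```javascript
// Hypothetical helper: "{{name(arg1, 'arg 2')}}" -> [name, arg1, "'arg 2'"], or null if it doesn't fit.
function parseTag(tag) {
  // Step 1: separate the tag name from the raw parameter list.
  const outer = /^\{\{\s*([^()\s]+)\s*\((.*)\)\s*\}\}$/.exec(tag);
  if (!outer) return null;
  const [, name, argList] = outer;

  // Step 2: a parameter is either a quoted string (which may contain commas) or a bare token.
  const args = argList.match(/'[^']*'|[^\s,']+/g) || [];
  return [name, ...args];
}

console.log(parseTag("{{tagName(21, 'hello, jane','smith')}}"));
// [ 'tagName', '21', "'hello, jane'", "'smith'" ]
```

Every element comes back as a string, quotes included, which matches the output shown in the question.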
