How to read all strings inside parentheses using regex - javascript

I want to get all the strings inside each pair of parentheses. For example, after applying a regex to
"fun('xyz'); fun('abcd'); fun('abcd.ef') { temp('no'); "
the output should be
['xyz','abcd', 'abcd.ef'].
I tried many options but was not able to get the desired result.
One option I tried is
/fun\((.*?)\)/gi.exec("fun('xyz'); fun('abcd'); fun('abcd.ef')").

Store the regex in a variable, and run it in a loop...
var re = /fun\((.*?)\)/gi,
string = "fun('xyz'); fun('abcd'); fun('abcd.ef')",
matches = [],
match;
while(match = re.exec(string))
matches.push(match[1]);
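Note that this only works with a global regex. If you omit the g flag, exec keeps returning the first match and you'll have an infinite loop.
Also note that match[1] still includes the surrounding quotes, and that you'll get an undesired result if there is a ) between the quotation marks.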
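As a hedged sketch of the same idea for engines that support String.prototype.matchAll (ES2020), the capture group can be moved inside the quotes, so the quotes are dropped and a ) inside a quoted value no longer ends the match early. The pattern and variable names below are illustrative assumptions, not part of the answer above, and the sketch assumes every value is single-quoted and contains no escaped quotes.

```javascript
// Assumption: each argument is a single-quoted string without escaped quotes.
const input = "fun('xyz'); fun('abcd'); fun('abcd.ef') { temp('no'); fun('a)b')";

// Capture only what sits between the quotes; [^']* may contain ")".
const pattern = /fun\('([^']*)'\)/gi;

// matchAll requires the g flag and yields one match object per occurrence.
const values = Array.from(input.matchAll(pattern), (m) => m[1]);

console.log(values); // temp('no') is skipped because it is not a fun(...) call
```

Because matchAll works on its own copy of the regex, there is no lastIndex state to manage in the calling code.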

This code will almost do the job:
"fun('xyz'); fun('abcd'); fun('abcd.ef')".match(/'.*?'/gi);
You'll get ["'xyz'", "'abcd'", "'abcd.ef'"], which still contains the extra ' around each string.

The easiest way to find what you need is to use this RegExp: /[\w.]+(?=')/g
var string = "fun('xyz'); fun('abcd'); fun('abcd.ef')";
string.match(/[\w.]+(?=')/g); // ['xyz','abcd', 'abcd.ef']
It will work with alphanumeric characters, underscores, and dots; to allow more symbols, extend the character class in [\w.]+.

Related

Get last occurrence using RegEx

I have a huge string with this inside:
linha[0] = '12/2010 281R$ 272.139,05 ';
linha[0] = '13SL 1R$ 226.185,81 ';
Both lines are separate, and I need to get the last occurrence of the two. I'm using the following regex to match the first one:
/linha\[0]\s=\s'(.*)';/
I would like to get the second "linha..." too, but I don't know exactly how.
This is how I'm using the regex to get the first "linha...":
string.match(/linha\[0]\s=\s'(.*)';/);
output:
linha[0] = '12/2010 281R$ 272.139,05 ';
Also, I can't do extra work; I need to get the second occurrence using only regex.
If you want to get the last occurrence of your regex in a string (and assuming it exists), you can do
var str = hugeString.match(/linha\[0]\s=\s'([^']*)';/g).pop();
(yes, I changed .* to [^']* for better efficiency; ignore that if you have quotes in your inner string)
Now, if you want to extract just the submatch, you can do
var regex = /linha\[0]\s=\s'([^']*)';/g,
arr,
str;
while (arr = regex.exec(hugeString)) str = arr[1];
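Since the question asks for a regex-only approach, here is a minimal sketch of another option: a greedy prefix in front of the pattern forces the engine to settle on the last occurrence. The sample string is a hypothetical stand-in for the real input, and [\s\S]* is used instead of .* on the assumption that the occurrences sit on different lines.

```javascript
// Hypothetical sample standing in for the huge string from the question.
const hugeString =
  "linha[0] = '12/2010 281R$ 272.139,05 ';\n" +
  "some other text\n" +
  "linha[0] = '13SL 1R$ 226.185,81 ';\n";

// [\s\S]* is greedy and also matches newlines, so the engine consumes the
// whole string first and backtracks until the rest of the pattern fits,
// which happens at the last occurrence.
const lastMatch = hugeString.match(/[\s\S]*linha\[0\]\s=\s'([^']*)';/);

// match returns null when nothing is found, so guard before indexing.
const lastValue = lastMatch ? lastMatch[1] : null;

console.log(lastValue);
```

On very large inputs the backtracking may be slower than the loop above, so this trades some speed for a single expression.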

How to split a string by a character not directly preceded by a character of the same type?

Let's say I have a string: "We.need..to...split.asap". What I would like to do is to split the string by the delimiter ., but I only wish to split by the first . and include any recurring .s in the succeeding token.
Expected output:
["We", "need", ".to", "..split", "asap"]
In other languages, I know that this is possible with a look-behind /(?<!\.)\./ but Javascript unfortunately does not support such a feature.
I am curious to see your answers to this question. Perhaps there is a clever use of look-aheads that presently evades me?
I was considering reversing the string, then re-reversing the tokens, but that seems like too much work for what I am after... plus controversy: How do you reverse a string in place in JavaScript?
Thanks for the help!
Here's a variation of the answer by guest271314 that handles more than two consecutive delimiters:
var text = "We.need.to...split.asap";
var re = /(\.*[^.]+)\./;
var items = text.split(re).filter(function(val) { return val.length > 0; });
It uses the detail that if the split expression includes a capture group, the captured items are included in the returned array. These capture groups are almost the only thing we are interested in; the pieces between them are empty strings, which we filter out, and the final token is the unmatched remainder of the text.
EDIT: Unfortunately there's perhaps one slight bug with this. If the text to be split starts with a delimiter, that will be included in the first token. If that's an issue, it can be remedied with:
var re = /(?:^|(\.*[^.]+))\./;
var items = text.split(re).filter(function(val) { return !!val; });
(I think this regex is ugly and would welcome an improvement.)
You can do this without any lookaheads:
var subject = "We.need.to....split.asap";
var regex = /\.?(\.*[^.]+)/g;
var matches, output = [];
while(matches = regex.exec(subject)) {
output.push(matches[1]);
}
document.write(JSON.stringify(output));
It seemed like it would work in one line, as it did on https://regex101.com/r/cO1dP3/1, but it had to be expanded into the loop above because, with the g flag, .match returns only the full matches and drops the capturing groups (the correct data was in the capturing groups, but we couldn't access it without the loop).
See: JavaScript Regex Global Match Groups
An alternative solution with the original one liner (plus one line) is:
document.write(JSON.stringify(
"We.need.to....split.asap".match(/\.?(\.*[^.]+)/g)
.map(function(s) { return s.replace(/^\./, ''); })
));
Take your pick!
Note: This answer can't handle more than 2 consecutive delimiters, since it was written according to the example in the revision 1 of the question, which was not very clear about such cases.
var text = "We.need.to..split.asap";
// split "." if followed by "."
var res = text.split(/\.(?=\.)/).map(function(val, key) {
// if `val[0]` does not begin with "." split "."
// else split "." if not followed by "."
return val[0] !== "." ? val.split(/\./) : val.split(/\.(?!.*\.)/)
});
// concat arrays `res[0]` , `res[1]`
res = res[0].concat(res[1]);
document.write(JSON.stringify(res));
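For readers on current engines: lookbehind assertions were added to JavaScript in ES2018, so the pattern mentioned in the question can now be passed to split directly. This is a minimal sketch that assumes a runtime with lookbehind support; the variable names are illustrative.

```javascript
// Requires an engine with ES2018 lookbehind support.
const text = "We.need..to...split.asap";

// Split only on a "." that is not directly preceded by another ".".
const tokens = text.split(/(?<!\.)\./);

console.log(tokens);
```

Older engines reject this pattern with a SyntaxError, in which case the capture-group and exec-loop answers above still apply.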

JS / RegEx to remove characters grouped within square braces

I hope I can explain myself clearly here and that this is not too much of a specific issue.
I am working on some javascript that needs to take a string, find instances of chars between square brackets, store any returned results and then remove them from the original string.
My code so far is as follows:
parseLine : function(raw)
{
var results = [];
var regex = /\[(.*?)]/g;
var match;
while((match = regex.exec(raw)) !== null)
{
console.log(" ", match);
results.push(match[1]);
raw = raw.replace(/\[(.*?)]/, "");
console.log(" ", raw);
}
return {results:results, text:raw};
}
This seems to work in most cases. If I pass in the string [id1]It [someChar]found [a#]an [id2]excellent [aa]match then it returns all the chars from within the square brackets and the original string with the bracketed groups removed.
The problem arises when I use the string [id1]It [someChar]found [a#]a [aa]match.
It seems to fail when only a single letter (and space?) follows a bracketed group, and it starts missing groups, as you can see in the log if you try it out. It also breaks if I use groups back to back like [a][b], which I will need to do.
I'm guessing this is my RegEx - begged and borrowed from various posts here as I know nothing about it really - but I've had no luck fixing it and could use some help if anyone has any to offer. A fix would be great but more than that an explanation of what is actually going on behind the scenes would be awesome.
Thanks in advance all.
You could use the replace method with a function to simplify the code and run the regexp only once:
function parseLine(raw) {
var results = [];
var parsed = raw.replace(/\[(.*?)\]/g, function(match,capture) {
results.push(capture);
return '';
});
return { results : results, text : parsed };
}
The problem is due to the lastIndex property of the regex /\[(.*?)]/g; not resetting, since the regex is declared as global. When the regex has global flag g on, lastIndex property of RegExp is used to mark the position to start the next attempt to search for a match, and it is expected that the same string is fed to the RegExp.exec() function (explicitly, or implicitly via RegExp.test() for example) until no more match can be found. Either that, or you reset the lastIndex to 0 before feeding in a new input.
Since your code is reassigning the variable raw on every loop, you are using the wrong lastIndex to attempt the next match.
The problem will be solved if you remove the g flag from your regex. Alternatively, you could use the solution proposed by Tibos, where you supply a function to String.replace() to do the replacement and extract the capturing group at the same time.
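The lastIndex behaviour described above can be observed directly. This is a small illustrative sketch using the back-to-back input [a][b] from the question; the variable names are assumptions.

```javascript
const re = /\[(.*?)\]/g;
let raw = "[a][b]";

// First call matches "[a]" and leaves lastIndex just past it.
const first = re.exec(raw);
const lastIndexAfterFirst = re.lastIndex;

// The question's loop then shortens the string being searched to "[b]".
raw = raw.replace(/\[(.*?)\]/, "");

// The next search starts at the stale lastIndex, which is now past "[b]",
// so the second group is never found.
const second = re.exec(raw);

console.log(first[1], lastIndexAfterFirst, raw, second);
```

Dropping the g flag, or resetting re.lastIndex to 0 after each replacement, makes every search start from the beginning of the shortened string.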
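You can escape the last bracket for clarity: \[(.*?)\]. Note, though, that without the u flag JavaScript already treats an unescaped ] outside a character class as a literal, so this alone does not change the behavior.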

RegExp match a single quoted text without quotes - JavaScript

I'm sorry if it is a confusing question. I was trying to find a way to do this but couldn't find it so, if it is a repeated question, my apologies!
I have a text something like this: something:"askjnqwe234"
I want to be able to get askjnqwe234 using a RegExp. Note that I want to omit the quotes. I was trying this using /[^"]+(?=(" ")|"$)/g, but it returns an array. I want a RegExp that returns a single string, not an array.
I don't know if it's possible but I do not want to specify the position of the array; something like this:
var x = string.match(/[^"]+(?=(" ")|"$)/g)[0];
Thanks!
Try:
/"([^"]*)"/g
In English: look for a ", then match and record anything that isn't a " until you see another ".
match and exec always return an array or null, so, assuming you have a single double-quoted value and no newlines in the string, you could use
var x;
var str = 'something:"askjnqwe234"';
x = str.replace( /^[^"]*"|".*/g, '' );
// "askjnqwe234"
Or, if you may have other quoted values in the string
x = str.replace( /.*?something:"([^"]*)".*/, '$1' );
where $1 refers to the substring captured by the sub-pattern [^"]* between the ().
Further explanation on request.
Notwithstanding the above, I recommend that you tolerate the array indexing and just use match.
You can capture the information inside quotes like this, assuming it matches:
var x = string.match(/something:"([^"]*)"/)[1];
The capture group at index 1 is the part inside the double quotes.
If you're not sure it will match:
var match = string.match(/something:"([^"]*)"/);
if (match) {
// use match[1] here
}
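On engines with optional chaining and nullish coalescing (ES2020), the null check and the indexing can be folded into one expression. The helper name extractSomething is hypothetical, and the sketch assumes the something:"..." shape from the question.

```javascript
// Hypothetical helper: returns the text inside the quotes, or null.
function extractSomething(input) {
  // ?.[1] reads the capture group only when match() did not return null;
  // ?? null turns the resulting undefined into an explicit null.
  return input.match(/something:"([^"]*)"/)?.[1] ?? null;
}

console.log(extractSomething('something:"askjnqwe234"'));
console.log(extractSomething("no quoted value here"));
```

This still indexes into the array internally, but the caller receives a plain string or null.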

javascript regex to extract the first character after the last specified character

I am trying to extract the first character after the last underscore in a string. In general the number of '_' characters is unknown, but in my case there will always be at least one, because I add it in another step of the process.
What I tried is this. I also tried the regex by itself to extract from the name, but my result was empty.
var s = "XXXX-XXXX_XX_DigitalF.pdf"
var string = match(/[^_]*$/)[1]
string.charAt(0)
So the final desired result is 'D'. If the RegEx can only get me what is behind the last '_' that is fine because I know I can use the charAt like currently shown. However, if the regex can do the whole thing, even better.
If you know there will always be at least one underscore you can do this:
var s = "XXXX-XXXX_XX_DigitalF.pdf"
var firstCharAfterUnderscore = s.charAt(s.lastIndexOf("_") + 1);
// OR, with regex
var firstCharAfterUnderscore = s.match(/_([^_])[^_]*$/)[1]
With the regex, you can extract just the one letter by using parentheses to capture that part of the match. But I think the .lastIndexOf() version is easier to read.
Either way if there's a possibility of no underscores in the input you'd need to add some additional logic.
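One possible shape for that additional logic is sketched below. The function name and the choice to return null are assumptions made for illustration; an empty string or a thrown error would work equally well, depending on the caller.

```javascript
// Hypothetical helper: first character after the last "_", or null when
// there is no underscore or the underscore is the final character.
function firstCharAfterLastUnderscore(s) {
  const index = s.lastIndexOf("_");
  if (index === -1 || index === s.length - 1) {
    return null;
  }
  return s.charAt(index + 1);
}

console.log(firstCharAfterLastUnderscore("XXXX-XXXX_XX_DigitalF.pdf"));
console.log(firstCharAfterLastUnderscore("no-underscore.pdf"));
```

The explicit -1 check matters because s.charAt(s.lastIndexOf("_") + 1) would otherwise silently return the first character of the string.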
