When using a regex as the separator in the split(), is there a way to know what string it matched?
Example:
var
string = "12+34-12",
numberlist = string.split(/[^0-9]/);
How would I know whether it found a + or a -?
You can wrap the separator pattern in a capturing group so that String#split also returns the text each separator matched:
var m = string.split(/(\D)/);
//=> ["12", "+", "34", "-", "12"]
To see the difference, here is the output without a capturing group:
var m = string.split(/\D/);
//=> ["12", "34", "12"]
PS: I have changed your use of [^0-9] to \D since they are equivalent.
Just capture the splitting regular expression, like
numberlist = string.split(/([^0-9])/);
and the output will be
[ '12', '+', '34', '-', '12' ]
Since the splitting regular expression is wrapped in a capturing group, the text it matches is also included in the resulting array.
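As a minimal sketch of how the captured separators can then be used: with a single capturing group, the numbers land at the even indexes of the result and the separators at the odd ones. The helper name `splitKeepingSeparators` is made up for this illustration.

```javascript
// Hypothetical helper: split on any non-digit and keep each separator.
function splitKeepingSeparators(text) {
  // The capturing group makes split() include each matched separator.
  return text.split(/(\D)/);
}

const tokens = splitKeepingSeparators("12+34-12");

// Even indexes hold the numbers, odd indexes hold the separators.
const numbers = tokens.filter((_, i) => i % 2 === 0).map(Number);
const operators = tokens.filter((_, i) => i % 2 === 1);

console.log(tokens);    // ["12", "+", "34", "-", "12"]
console.log(numbers);   // [12, 34, 12]
console.log(operators); // ["+", "-"]
```

The even/odd layout only holds when the pattern has exactly one capturing group; each extra group adds another element per separator.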
I want to split a string with separators ' or .. WHILE KEEPING them:
"'TEST' .. 'TEST2' ".split(/([' ] ..)/g);
to get:
["'", "TEST", "'", "..", "'", "TEST2", "'" ]
but it doesn't work. Do you know how to fix this?
The [' ] .. pattern matches a ' or a space, followed by a space and then any two characters other than line break characters.
You may use
console.log("'TEST' .. 'TEST2' ".trim().split(/\s*('|\.{2})\s*/).filter(Boolean))
Here,
.trim() - remove leading/trailing whitespace
.split(/\s*('|\.{2})\s*/) - splits string with ' or double dot (that are captured in a capturing group and thus are kept in the resulting array) that are enclosed in 0+ whitespaces
.filter(Boolean) - removes empty items.
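To make the role of .filter(Boolean) more concrete, here is a rough step-by-step version of the same chain. The variable names are only for illustration.

```javascript
const quoted = "'TEST' .. 'TEST2' ";

// Step 1: drop leading/trailing whitespace.
const trimmed = quoted.trim();

// Step 2: split on ' or .. (captured, so they are kept),
// swallowing any whitespace around each delimiter.
const rawParts = trimmed.split(/\s*('|\.{2})\s*/);

// Step 3: adjacent delimiters leave empty strings behind; remove them.
const parts = rawParts.filter(Boolean);

console.log(rawParts.includes("")); // true
console.log(parts); // ["'", "TEST", "'", "..", "'", "TEST2", "'"]
```

Note that .filter(Boolean) would also drop a legitimately empty token, which is not an issue for this input.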
I'm not sure it will work in every situation, but you can try this:
"'TEST' .. 'TEST2' ".replace(/(\'|\.\.)/g, ' $1 ').trim().split(/\s+/)
which returns:
["'", "TEST", "'", "..", "'", "TEST2", "'"]
Splitting while keeping the delimiters can often be reduced to a matchAll. In this case, /(?:'|\.\.|\S[^']+)/g seems to do the job on the example. The idea is to alternate between literal single quote characters, two literal periods, or any sequence up to a single quote that starts with a non-space.
const result = [..."'TEST' .. 'TEST2' ".matchAll(/(?:'|\.\.|\S[^']+)/g)].flat();
console.log(result);
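Since this pattern has no capturing groups, String#match with the g flag should give the same tokens without the spread and flat() steps. A small comparison, assuming the same input as above:

```javascript
const sample = "'TEST' .. 'TEST2' ";

// With the g flag and no groups to read, match() returns every full match.
const viaMatch = sample.match(/'|\.\.|\S[^']+/g) || [];

// The matchAll version, for comparison.
const viaMatchAll = [...sample.matchAll(/(?:'|\.\.|\S[^']+)/g)].flat();

console.log(viaMatch); // ["'", "TEST", "'", "..", "'", "TEST2", "'"]
console.log(JSON.stringify(viaMatch) === JSON.stringify(viaMatchAll)); // true
```

matchAll becomes the better fit once the pattern has capturing groups you want to read per match.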
Another idea that might be more robust even if it's not a single shot regex is to use a traditional, non-clever "stuff between delimiters" pattern like /'([^']+)'/g, then flatMap to clean up the result array to match your format.
const s = "'TEST' .. 'TEST2' ";
const result = [...s.matchAll(/'([^']+)'/g)].flatMap(e =>
["'", e[1], "'", ".."]
).slice(0, -1);
console.log(result);
I'm trying to capture all parts of a string, but I can't seem to get it right.
The string has this structure: 1+22+33. Numbers with an operator in between. There could be any number of terms.
What I want is ["1+22+33", "1", "+", "22", "+", "33"]
But I get: ["1+22+33", "22", "+", "33"]
I've tried all kinds of regexes, this is the best I've got, but it's obviously wrong.
let re = /(?:(\d+)([+]+))+(\d+)/g;
let s = '1+22+33';
let m;
while (m = re.exec(s))
console.log(m);
Note: the operators may vary. So in reality I'd look for [+/*-].
You can simply use String#split, like this:
const input = '3+8 - 12'; // I've deliberately added some random spaces
console.log(input.split(/\s*(\+|-)\s*/)); // Add as many other operators as needed
Just thought of a solution: /(\d+)|([+*/-]+)/g;
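A minimal sketch of how that pattern could be applied, assuming String#match with the g flag is acceptable (the full-expression element from the question is then not part of the result):

```javascript
const expression = "1+22*33-4";

// With the g flag, match() returns every full match in order:
// each one is either a run of digits or a run of operators.
const terms = expression.match(/(\d+)|([+*/-]+)/g);

console.log(terms); // ["1", "+", "22", "*", "33", "-", "4"]
```

If the whole expression is needed as the first element, it can be prepended with [expression, ...terms].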
You only have to split on digits:
console.log(
"1+22+33".split(/(\d+)/).filter(Boolean)
);
I'm trying to match a currency string that may or may not be suffixed with one of K, M, or Bn, and group them into two parts
Valid matches:
500 K // Expected grouping: ["500", "K"]
900,000 // ["900,000", ""]
2.3 Bn // ["2.3", "Bn"]
800M // ["800", "M"]
PS: I know the first item in the match output array is the entire matched string; the expected grouping above is only an example.
The Regex I've got so far is this:
/\b([-\d\,\.]+)\s?([M|Bn|K]?)\b/i
When I match it with a normal string, it does OK.
"898734 K".match(/\b([-\d\,\.]+)\s?([M|Bn|K]?)\b/i)
=> ["898734 K", "898734", "K"] // output
"500,000".match(/\b([-\d\,\.]+)\s?([M|Bn|K]?)\b/i)
=> ["500,000", "500,000", ""]
Trouble is, it also matches space in there
"89 8734 K".match(/\b([-\d\,\.]+)\s?([M|Bn|K]?)\b/i)
=> ["89 ", "89", ""]
And I'm not sure why. So I thought I'd add /g option in there to match entire string, but now it doesn't group the matches.
"898734 K".match(/\b([-\d\,\.]+)\s?([M|Bn|K]?)\b/gi)
=> ["898734 K"]
What change do I need to make to get the regex behave as expected?
You could use a different regular expression, which looks for some digits, an optional comma or dot followed by more digits, optional whitespace, and then one of the wanted suffixes.
var array = ['500 K', '900,000', '2.3 Bn', '800M'],
regex = /(\d+[.,]?\d*)\s*(K|Bn|M|$)/
array.forEach(function (a) {
var m = a.match(regex);
if (m) {
m.shift();
console.log(m);
}
});
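On the /g part of the question: String#match with the g flag only returns the full matches, so the groups are lost. One way around that is String#matchAll, sketched below. The pattern is an illustrative variant, not the one above: it uses a real alternation (K|M|Bn) instead of the character class [M|Bn|K], which matches a single character out of M, |, B, n, K. The sample sentence is invented.

```javascript
// Illustrative pattern: a number, an optional space, an optional suffix.
const amountPattern = /\b(-?\d[\d,.]*)\s?(K|M|Bn)?\b/gi;

const report = "Revenue 500 K, costs 2.3 Bn, fee 800M, total 900,000";

// matchAll keeps the capture groups for every match, even with the g flag.
const amounts = [...report.matchAll(amountPattern)].map(
  // m[2] is undefined when there is no suffix; normalise it to "".
  m => [m[1], m[2] || ""]
);

console.log(amounts);
// [["500", "K"], ["2.3", "Bn"], ["800", "M"], ["900,000", ""]]
```

As for "89 8734 K": a space is not part of the number group, so the match stops after "89 ", which is why the original pattern returned that.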
You have a problem and want to use a regex to solve the problem. Now you have two problems...
Joke aside, I think you can achieve what you want to do without any regex (this is Python):
import itertools
head = "".join(itertools.takewhile(lambda c: c.isdigit() or c in ",.", s))
head, s[len(head):].strip()
I tried this with s="560 K", s="900,000", etc and it seems to work.
How can I extract the text between all pairs of square brackets from a string such as "[a][b][c][d][e]", so that I can get the following results:
→ Array: ["a", "b", "c", "d", "e"]
→ String: "abcde"
I have tried the following Regular Expressions, but to no avail:
→ (?<=\[)(.*?)(?=\])
→ \[(.*?)\]
Research:
After searching Stack Overflow, I found only two solutions, both of which use regular expressions:
→ (?<=\[)(.*?)(?=\]) (1)
(?<=\[) : Positive Lookbehind.
\[ :matches the character [ literally.
(.*?) : matches any character except newline and expands as needed.
(?=\]) : Positive Lookahead.
\] : matches the character ] literally.
→ \[(.*?)\] (2)
\[ : matches the character [ literally.
(.*?) : matches any character except newline and expands as needed.
\] : matches the character ] literally.
Notes:
(1) This pattern throws an error in JavaScript engines that predate ES2018, because they do not support the lookbehind operator.
Example:
console.log(/(?<=\[)(.*?)(?=\])/.exec("[a][b][c][d][e]"));
Uncaught SyntaxError: Invalid regular expression: /(?<=\[)(.*?)(?=\])/: Invalid group(…)
(2) This pattern returns the text inside only the first pair of square brackets as the second element.
Example:
console.log(/\[(.*?)\]/.exec("[a][b][c][d][e]"));
Returns: ["[a]", "a"]
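For pattern (2), the single-match limitation comes from calling exec once without the g flag. A possible sketch with the g flag and String#matchAll (available in ES2020 and later), which collects the first capture group of every match:

```javascript
const bracketed = "[a][b][c][d][e]";

// Array.from maps each match object to its first capture group.
const items = Array.from(bracketed.matchAll(/\[(.*?)\]/g), m => m[1]);
const joined = items.join("");

console.log(items);  // ["a", "b", "c", "d", "e"]
console.log(joined); // "abcde"
```

In older engines, the same result can be obtained by calling exec in a loop on a regex that has the g flag.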
Solution:
The most precise solution for JavaScript that I have come up with is:
var string, array;
string = "[a][b][c][d][e]";
array = string.split("["); // → ["", "a]", "b]", "c]", "d]", "e]"]
string = array.join(""); // → "a]b]c]d]e]"
array = string.split("]"); // → ["a", "b", "c", "d", "e", ""]
Now, depending upon whether we want the end result to be an array or a string we can do:
array = array.slice(0, array.length - 1) // → ["a", "b", "c", "d", "e"]
/* OR */
string = array.join("") // → "abcde"
One liner:
Finally, here's a handy one liner for each scenario for people like me who prefer to achieve the most with least code or our TL;DR guys.
Array:
var a = "[a][b][c][d][e]".split("[").join("").split("]").slice(0,-1);
/* OR */
var a = "[a][b][c][d][e]".slice(1,-1).split(']['); // Thanks #xorspark
String:
var a = "[a][b][c][d][e]".split("[").join("").split("]").join("");
I don't know what text you expect inside the brackets, but for the example you've given, this works:
var arrStr = "[a][b][c][d][e]";
var arr = arrStr.match(/[a-z]/g); // → [ 'a', 'b', 'c', 'd', 'e' ] (an Array)
Then you can use `.join("")` on the resulting array to combine its elements into a string.
If you're expecting multiple characters between the square brackets, the regex can be /[a-z]+/g, or tweaked to your liking.
I think this approach will be interesting to you.
var arr = [];
var str = '';
var input = "[a][b][c][d][e]";
input.replace(/\[(.*?)\]/g, function(match, pattern){
arr.push(pattern);
str += pattern;
return match;//just in case ;)
});
console.log('Arr:', arr);
console.log('String:', str);
//And trivial solution if you need only string
var a = input.replace(/\[|\]/g, '');
console.log('Trivial:',a);
I have a string:
'"Apples" AND "Bananas" OR "Gala Melon"'
I would like to convert this to an array
arr = ['"Apples"', 'AND', '"Bananas"', 'OR', '"Gala Melon"']
I don't know if I can do it with a regular expression. I'm beginning to think I may have to parse each character at a time to match the double quotes.
input = '"Apples" AND "Bananas" OR "Gala Melon"'
output = input.match(/\w+|"[^"]+"/g)
// output = ['"Apples"', 'AND', '"Bananas"', 'OR', '"Gala Melon"']
Explanation for the regex:
/ - start of regex
\w+ - sequence of word characters
| - or
"[^"]+" - anything quoted (assuming no escaped quotes)
/g - end of regex, global flag (perform multiple matches)
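If escaped quotes do need to be supported, one possible variation is to let the quoted branch accept either a non-quote, non-backslash character or a backslash followed by any character. This is a sketch under that assumption; the sample query is invented.

```javascript
// The query text is: "Apples" AND "Bananas" OR "Gala \"Royal\" Melon"
const query = '"Apples" AND "Bananas" OR "Gala \\"Royal\\" Melon"';

// Quoted branch: any run of non-quote/non-backslash characters
// or backslash escapes. Otherwise fall back to a plain word.
const queryTokens = query.match(/"(?:[^"\\]|\\.)*"|\w+/g);

console.log(queryTokens.length); // 5
console.log(queryTokens[4]);     // "Gala \"Royal\" Melon"
```

The quoted branch is listed first so that a quote is always tried as the start of a quoted token before the word branch gets a chance.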