JavaScript regex split consecutive string (vs Java) - javascript

I have this regex for splitting a string into groups of consecutive identical characters:
/(?<=(.))(?!\1)/g
In Java, this regex splits the string the way I expect:
"aaabbbccccdd".split("(?<=(.))(?!\\1)");
// returns [aaa, bbb, cccc, dd]
But in JS the same regex also includes capture group 1 in the result:
'aaabbbccccdd'.split(/(?<=(.))(?!\1)/g);
// returns ['aaa', 'a', 'bbb', 'b', 'cccc', 'c', 'dd']
Is there any way to keep the capture group out of the result in JS? (I've tried (?:...), but then I can't use \1 anymore.)
If you know another way, or an improvement, to split consecutive characters in JS (with or without regex), please share it. Thank you.

I would keep your current logic and just filter out the odd-index elements, which are the captured group values:
var input = "aaabbbccccdd";
var matches = input.split(/(?<=(.))(?!\1)/)
.filter(function(d, i) { return (i+1) % 2 != 0; });
console.log(matches);
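For the "not using regex" part of the question, here is a minimal sketch of a plain loop that builds the runs directly. The function name groupRuns is hypothetical and does not come from the question:

```javascript
// Group consecutive identical characters without a regex.
// groupRuns is an illustrative name, not an existing API.
function groupRuns(str) {
  const runs = [];
  for (const ch of str) {
    const last = runs.length - 1;
    if (last >= 0 && runs[last].endsWith(ch)) {
      runs[last] += ch; // same character as the current run: extend it
    } else {
      runs.push(ch); // different character: start a new run
    }
  }
  return runs;
}

console.log(groupRuns("aaabbbccccdd")); // ["aaa", "bbb", "cccc", "dd"]
```

Since nothing is captured, there are no extra elements to filter out afterwards.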

Related

find all letters without repeating regex

Please help me find all letters in a string without repeats, using a regex in JS.
Examples:
let str = "abczacg";
str = str.match(/ pattern /); // should return: abczg
str = "aabbccdd" // should return: abcd
str = "hello world" // should return: helo wrd
Is it possible?
Thank you!
Here is one approach. We can first reverse the input string. Then, do a global regex replacement on the following pattern:
(\w)(?=.*\1)
This will strip off any character for which we can find the same character later in the string. But, as we will be running this replacement on the reversed string, this has the actual effect of removing all duplicate letters other than their first occurrence. Finally, we reverse the remaining string again to arrive at the expected output.
var input = "abczacg";
var output = input.split("").reverse().join("").replace(/(\w)(?=.*\1)/g, "");
output = output.split("").reverse().join("");
console.log(output);
An alternative that uses a Set instead of a regex:
[
"abczacg",
"aabbccdd",
"hello world"
].forEach(s => {
console.log([...new Set(s.split(''))].join(''))
})
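Another regex-free sketch, assuming the first occurrence of each character should be kept in its original order: keep a character only when its index is the first index at which it appears. The helper name uniqueChars is made up for illustration. Like the Set version, it deduplicates every character, including spaces, whereas the \w pattern above only touches word characters.

```javascript
// Keep only the first occurrence of each character.
// uniqueChars is an illustrative name, not an existing API.
function uniqueChars(str) {
  return [...str]
    .filter((ch, i, chars) => chars.indexOf(ch) === i) // first occurrence only
    .join("");
}

console.log(uniqueChars("abczacg"));     // "abczg"
console.log(uniqueChars("aabbccdd"));    // "abcd"
console.log(uniqueChars("hello world")); // "helo wrd"
```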

Capture group certain number of times with regular expression but last group has remaining values

Given a string delimited by a colon and similar to this...
xvf:metric:admin:click
I need to capture three groups...
xvf
metric
admin:click
Or another example:
one:two:three:four:five:six:seven
one
two
three:four:five:six:seven
My current regex is just capturing each word separately, resulting in 4 matches
/(\s*\w+)/gi
The solution using String.match and Array.slice functions:
var str = "one:two:three:four:five:six:seven",
groups = str.match(/([^:]+?):([^:]+?):(.+)?$/).slice(1);
console.log(groups); // ["one", "two", "three:four:five:six:seven"]
And if it's possible that you get fewer than 3 groups, you can use
/^([^:]+)(?::([^:]+)(?::(.+)?)?)?$/
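A quick sketch of how that pattern behaves on inputs with one, two, or more parts. The helper name firstThree is hypothetical. Groups that did not take part in the match come back as undefined:

```javascript
// The pattern with optional second and third groups, as given above.
var re = /^([^:]+)(?::([^:]+)(?::(.+)?)?)?$/;

// firstThree is an illustrative helper: returns the three groups,
// or null when the string does not match at all.
function firstThree(str) {
  var m = str.match(re);
  return m ? m.slice(1) : null;
}

console.log(firstThree("one"));                // ["one", undefined, undefined]
console.log(firstThree("one:two"));            // ["one", "two", undefined]
console.log(firstThree("one:two:three:four")); // ["one", "two", "three:four"]
```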
The solution is to capture the first two things before a :, then capture everything after
Here's the regex:
/(.+?):(.+?):(.+)/
In code:
var testStr = "xvf:metric:admin:click";
console.log(/(.+?):(.+?):(.+)/.exec(testStr).slice(1,4))
//["xvf", "metric", "admin:click"]
Since you're using JavaScript, it'd make more sense to actually use string.split and later Array.slice and Array.splice for string manipulation:
var str = "one:two:three:four:five:six:seven",
groups = str.split(':');
groups.splice(2, groups.length, groups.slice(2).join(':'));
console.log(groups);

Remove comma from group regex

Is it possible from this:
US Patent 6,570,557
retrieve 3 groups being:
US
Patent
6570557 (without the commas)
So far I got:
(US)(\s{1}Patent\s{1})(\d{1},\d{3},\d{3})
I tried (?!,) to get rid of the commas, but that effectively gets rid of the whole number.
Try with:
var input = 'US Patent 6,570,557',
matches = input.match(/^(\w+) (\w+) ([\d,]+)/),
code = matches[1],
name = matches[2],
numb = matches[3].replace(/,/g,'');
Instead of using regex, you can do it with 2 simple string functions:
var str = "US Patent 6,570,557"; // Your input
var array = str.split(" "); // Separate each word
array[2] = array[2].split(",").join(""); // Remove every comma (replace(",", "") would remove only the first one)
console.log(array); // ["US", "Patent", "6570557"]
This may be faster too.
You cannot ignore the commas when matching, unless you match the number as three separate parts and then join them together.
It is much simpler to match the number with its commas and then strip them from the match result with String.replace.
Just add more groups like so:
(US)(\s{1}Patent\s{1})(\d{1}),(\d{3}),(\d{3})
And then concatenate the last 3 groups
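A minimal sketch of that approach, using the pattern exactly as written above. The variable names are illustrative, and the trim() call is an addition because the second group in this pattern includes the surrounding spaces:

```javascript
var input = "US Patent 6,570,557";
var m = input.match(/(US)(\s{1}Patent\s{1})(\d{1}),(\d{3}),(\d{3})/);

var country = m[1];            // "US"
var kind = m[2].trim();        // "Patent" (group 2 is " Patent " before trimming)
var number = m[3] + m[4] + m[5]; // concatenate the three digit groups

console.log([country, kind, number]); // ["US", "Patent", "6570557"]
```

Note that this only works for numbers with exactly the 1,3,3 digit layout; the [\d,]+ plus replace approach handles other lengths.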

Javascript Regex to split a string into array of grouped/contiguous characters

I'm trying to do the same thing that this guy is doing, only he's doing it in Ruby and I'm trying to do it via Javascript:
Split a string into an array based on runs of contiguous characters
It's basically just splitting a single string of characters into an array of contiguous characters - so for example:
Given input string of
'aaaabbbbczzxxxhhnnppp'
would become an array of
['aaaa', 'bbbb', 'c', 'zz', 'xxx', 'hh', 'nn', 'ppp']
The closest I've gotten is:
var matches = 'aaaabbbbczzxxxhhnnppp'.split(/((.)\2*)/g);
for (var i = 1; i+3 <= matches.length; i += 3) {
alert(matches[i]);
}
Which actually does kinda/sorta work... but not really.. I'm obviously splitting too much or else I wouldn't have to eliminate bogus entries with the +3 index manipulation.
How can I get a clean array with only what I want in it?
Thanks-
Your regex is fine, you're just using the wrong function. Use String.match, not String.split:
var matches = 'aaaabbbbczzxxxhhnnppp'.match(/((.)\2*)/g);
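As a small sketch of how this could be wrapped up (the function name splitRuns is hypothetical): with the g flag, match returns only the full matches, so the outer capture group is not needed, and match returns null when nothing matches, so a fallback to an empty array is useful. One assumption to be aware of: . does not match line terminators, so newline characters would be dropped from the result.

```javascript
// splitRuns is an illustrative name, not an existing API.
function splitRuns(str) {
  // (.)\1* matches one character followed by any repeats of it.
  // match() returns null for no matches (e.g. an empty string).
  return str.match(/(.)\1*/g) || [];
}

console.log(splitRuns("aaaabbbbczzxxxhhnnppp"));
// ["aaaa", "bbbb", "c", "zz", "xxx", "hh", "nn", "ppp"]
console.log(splitRuns("")); // []
```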

getting contents of string between digits

I have a regex problem :(
What I would like to do is find the contents between two or more numbers.
var string = "90+*-+80-+/*70"
I'm trying to edit the symbols in between so that only the last symbol is kept and not the ones before it, turning the above variable into 90+80*70. This is just an example, and I have no idea how to do this. The length of the numbers, the number of "sets" of numbers, and the length of the symbol runs in between could be anything.
many thanks,
Steve,
The trick is in matching '90+*-+' and '80-+/*' separately, and selecting only the number and the last symbol.
The expression for finding a number followed by 1 or more non-numbers would be
\d+[^\d]+
To select the number and the last non-number, add parens:
(\d+)[^\d]*([^\d])
Finally add a /g to repeat the procedure for each match, and replace it with the 2 matched groups for each match:
js> '90+*-+80-+/*70'.replace(/(\d+)[^\d]*([^\d])/g, '$1$2');
90+80*70
js>
Or you can use lookahead assertion and simply remove all non-numerical characters which are not last: "90+*-+80-+/*70".replace(/[^0-9]+(?=[^0-9])/g,'');
You can use a regular expression to match the non-digits and a callback function to process the match and decide what to replace:
var test = "90+*-+80-+/*70";
var out = test.replace(/[^\d]+/g, function(str) {
return str.slice(-1); // keep only the last symbol of the run
})
alert(out);
See it work here: http://jsfiddle.net/jfriend00/Tncya/
This works by using a regular expression to match sequences of non-digits and then replacing that sequence of non-digits with the last character in the matched sequence.
i would use this tutorial, first, then review this for javascript-specific regex questions.
This should do it -
var string = "90+*-+80-+/*70";
var result = '';
var arr = string.split(/(\d+)/); // the capture group keeps the numbers in the array
for (var i = 0; i < arr.length; i++) {
if (!isNaN(arr[i])) result = result + arr[i]; // number (or empty string): keep as-is
else result = result + arr[i].slice(-1); // symbol run: keep only the last symbol
}
alert(result);
Working demo - http://jsfiddle.net/ipr101/SA2pR/
Similar to @Arnout Engelen
var string = "90+*-+80-+/*70";
string = string.replace(/(\d+)[^\d]*([^\d])(?=\d+)/g, '$1$2');
This was my first idea of how the regex should work. It also looks ahead to make sure the non-digit run is followed by another digit, which is what the question asked for (between two numbers).
Similar to @jfriend00
var string = "90+*-+80-+/*70";
string = string.replace( /(\d+?)([^\d]+?)(?=\d+)/g
, function(){
return arguments[1] + arguments[2].substr(-1);
});
Instead of only matching on non-digits, it matches on non-digits between two numbers, which is what the question asked
Why would this be any better?
If your equation was embedded in a paragraph or string of text. Like:
This is a test where I want to clean up something like 90+*-+80-+/*70 and don't want to scrap the whole paragraph.
Result (Expected) :
This is a test where I want to clean up something like 90+80*70 and don't want to scrap the whole paragraph.
Why would this not be any better?
There is more pattern matching, which makes it theoretically slower (negligible)
It would fail if your paragraph had embedded numbers. Like:
This is a paragraph where Sally bought 4 eggs from the supermarket, but only 3 of them made it back in one piece.
Result (Unexpected):
This is a paragraph where Sally bought 4 3 of them made it back in one piece.
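One possible way around that failure, assuming the symbols between numbers are limited to the arithmetic operators + - * /, is to match only those characters instead of any non-digit. The function name collapseOperators and the operator set are assumptions, not part of the answers above.

```javascript
// collapseOperators is an illustrative name; the operator set
// [+\-*/] is an assumption about what may appear between numbers.
function collapseOperators(text) {
  // number, any run of operators, the last operator, then a digit must follow
  return text.replace(/(\d+)[+\-*\/]*([+\-*\/])(?=\d)/g, "$1$2");
}

console.log(collapseOperators("90+*-+80-+/*70")); // "90+80*70"

var sally = "Sally bought 4 eggs from the supermarket, but only 3 of them made it back.";
console.log(collapseOperators(sally) === sally); // true: ordinary text is left alone
```

The trade-off is that any symbol outside the listed operator set is no longer collapsed.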
