Javascript Regex to split a string into array of grouped/contiguous characters - javascript

I'm trying to do the same thing that this guy is doing, only he's doing it in Ruby and I'm trying to do it via Javascript:
Split a string into an array based on runs of contiguous characters
It's basically just splitting a single string of characters into an array of contiguous characters - so for example:
Given input string of
'aaaabbbbczzxxxhhnnppp'
would become an array of
['aaaa', 'bbbb', 'c', 'zz', 'xxx', 'hh', 'nn', 'ppp']
The closest I've gotten is:
var matches = 'aaaabbbbczzxxxhhnnppp'.split(/((.)\2*)/g);
for (var i = 1; i+3 <= matches.length; i += 3) {
alert(matches[i]);
}
Which actually does kinda/sorta work... but not really.. I'm obviously splitting too much or else I wouldn't have to eliminate bogus entries with the +3 index manipulation.
How can I get a clean array with only what I want in it?
Thanks-

Your regex is fine, you're just using the wrong function. Use String.match, not String.split:
var matches = 'aaaabbbbczzxxxhhnnppp'.match(/((.)\2*)/g);
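As a minimal sketch of that fix: with match and the g flag only the full matches are returned, so the outer capture group is not strictly needed. The helper name splitRuns is hypothetical, and the fallback to an empty array is an assumption about how you want a non-matching input handled.

```javascript
// Hypothetical helper: split a string into runs of identical characters.
function splitRuns(input) {
  // (.) captures one character and \1* extends the match over its repeats.
  // With the g flag, match() returns only the full matches, not the groups.
  // match() returns null when nothing matches (e.g. an empty string),
  // so fall back to an empty array.
  return input.match(/(.)\1*/g) || [];
}

console.log(splitRuns('aaaabbbbczzxxxhhnnppp'));
// → ['aaaa', 'bbbb', 'c', 'zz', 'xxx', 'hh', 'nn', 'ppp']
```

Note that . does not match line terminators, so a string containing newlines would need [^] or the s flag instead.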

Related

JavaScript regex split consecutive string (vs Java)

I have this splitting regex to group consecutive identical characters in a string:
/(?<=(.))(?!\1)/g
In Java, this regex splits the string the way I expect:
"aaabbbccccdd".split("(?<=(.))(?!\\1)");
// return [aaa, bbb, cccc, dd]
but in JS the same regex also includes capture group 1 in the result:
'aaabbbccccdd'.split(/(?<=(.))(?!\1)/g);
// returns ['aaa', 'a', 'bbb', 'b', 'cccc', 'c', 'dd']
So is there any way to keep the capture group out of the result in JS? (I've tried (?:...), but then I can't use \1 anymore.)
And if you have another way to split consecutive characters in JS, or an improvement (with or without regex), please share it. Thank you.
I would keep your current logic but filter out the odd-index elements, which are the captured characters that split() inserts into the result:
var input = "aaabbbccccdd";
var matches = input.split(/(?<=(.))(?!\1)/)
.filter(function(d, i) { return (i+1) % 2 != 0; });
console.log(matches);
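Since the question also asks for approaches that do not use a regex, here is one possible sketch with a plain loop. It avoids lookbehind, which older JavaScript engines do not support. The function name groupConsecutive is hypothetical.

```javascript
// Hypothetical helper: group consecutive identical characters without a regex.
function groupConsecutive(input) {
  const groups = [];
  for (const ch of input) {
    const last = groups.length - 1;
    // Extend the current group if it is made of the same character,
    // otherwise start a new group.
    if (last >= 0 && groups[last].startsWith(ch)) {
      groups[last] += ch;
    } else {
      groups.push(ch);
    }
  }
  return groups;
}

console.log(groupConsecutive('aaabbbccccdd'));
// → ['aaa', 'bbb', 'cccc', 'dd']
```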

Getting strings between two specific occurrences of specific characters in JS

I am working on the following code. How can I extract the part of a string that lies between two specific occurrences of a character, in a string like
lorem1-lorem9-lorem3-lorem8-lorem1-lorem11-one-two-three-lorem22-lorem55.png?
What I need is:
one-two-three
I am able to remove everything after the 9th occurrence of the -, but I am not sure how to also remove everything before the 6th occurrence of -
var str = "lorem1-lorem9-lorem3-lorem8-lorem1-lorem11-one-two-three-lorem22-lorem55.png"
console.log(str.split("-", 9).join("-"));
Array.prototype.splice can be used to extract part of an array: splice(6) removes and returns every element from index 6 onward.
var str = "lorem1-lorem9-lorem3-lorem8-lorem1-lorem11-one-two-three-lorem22-lorem55.png"
let out = str.split("-", 9).splice(6).join("-")
console.log(out);
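The same idea can be sketched with slice, which takes both a start and an end index and does not modify the array it is called on. The helper name segmentsBetween and its parameters are hypothetical.

```javascript
// Hypothetical helper: keep the segments from index `start` up to,
// but not including, index `end`, and join them back together.
function segmentsBetween(text, separator, start, end) {
  return text.split(separator).slice(start, end).join(separator);
}

const str = "lorem1-lorem9-lorem3-lorem8-lorem1-lorem11-one-two-three-lorem22-lorem55.png";
console.log(segmentsBetween(str, "-", 6, 9));
// → "one-two-three"
```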

Extract Twitter handlers from string using regex in JavaScript

I would like to extract the Twitter handler names from a text string, using a regex. I believe I am almost there, except for the ">" that I am including in my output. How can I change my regex to drop the ">" from my output?
Here is an example of a text string value:
"PlaymakersZA, Absa, DiepslootMTB"
The desired output would be an array consisting of the following:
PlaymakersZA, Absa, DiepslootMTB
Here is an example of my regex:
var array = str.match(/>[a-z-_]+/ig)
Thank you!
You can use match groups in your regex to indicate the part you wish to extract.
I set up this JSFiddle to demonstrate.
Basically, you surround the part of the regex that you want to extract in parentheses: />([a-z-_]+)/ig, save the regex as an object, and call .exec() as long as it keeps returning matches. Index 1 of the resulting array holds the first match group's result. Index 0 is the whole match, and the following indices hold any subsequent match groups.
var str = "PlaymakersZA, Absa, DiepslootMTB";
var regex = />([a-z-_]+)/ig
var array = regex.exec(str);
while (array != null) {
alert(array[1]);
array = regex.exec(str);
}
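In newer engines (ES2020 and later), the exec loop above can be sketched more compactly with String.prototype.matchAll. The sample input here is an assumption: it is a made-up string with anchor tags around each name, chosen only so that the ">" in the regex has something to match.

```javascript
// Assumed sample input (hypothetical markup, not taken from the question).
const sample =
  '<a href="#">PlaymakersZA</a>, <a href="#">Absa</a>, <a href="#">DiepslootMTB</a>';

// matchAll requires the g flag; each match m is an array where
// m[0] is the whole match (including ">") and m[1] is the capture group.
const handles = Array.from(sample.matchAll(/>([a-z_-]+)/gi), (m) => m[1]);

console.log(handles);
// → ['PlaymakersZA', 'Absa', 'DiepslootMTB']
```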
You could just strip all the HTML
var str = "PlaymakersZA, Absa, DiepslootMTB";
$handlers = str.replace(/<[^>]*>|\s/g,'').split(",");

Variables incorporating dynamic numbers

I'm trying to make hangman for a Grade 12 Assessment piece.
I need to create variables due to the length of the current word chosen to be guessed. For example, if the word is 'cat', then
letter0 = 'c'
letter1 = 'a'
letter2 = 't'
So far, I have gotten some progress with a for loop.
for (i = 0; i <= currentWord.length){
//var letter(i) = currentWord.charAt(i)
i++
}
The commented out line was what I was aiming for, where the letter position would be put into the variable. Obviously this doesn't work, as the variable is just straight up read as letter(i) instead of letter(possible number). The loop would then stop once the length had been reached, and therefore there would be a unique variable for each letter of currentWord.
Any ideas of how to make this work?
Thanks!
If you want to convert a string to a character array, use currentWord.split("").
It returns an array containing each character as an element.
If looping is your goal, you don't even need to split, this works as expected:
const word = "cat";
for (let i = 0; i < word.length; i++) {
console.log(word[i]); // Will output "c" then "a" then "t"
}
In JavaScript, the characters of a string can be read by index just like array elements, although a string is immutable and is not an actual Array.
If you really want an array (there are things you can do to arrays and not to strings), you can use word.split("") which returns an array of letters.
I'd suggest to split your word into an array, like Hacketo suggested:
var letters_array = "cat".split('');
Then you can loop over that array (check out the answers for Loop through an array in JavaScript)
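To tie this back to the hangman question, here is a small sketch in which a single array takes the place of the numbered variables letter0, letter1, letter2. The function maskWord and the guessed list are hypothetical names, introduced only to show how such an array might be used.

```javascript
const currentWord = "cat";

// One array instead of letter0, letter1, letter2:
// letters[0] is 'c', letters[1] is 'a', letters[2] is 't'.
const letters = currentWord.split("");

// Hypothetical helper: show guessed letters and hide the rest.
function maskWord(word, guessed) {
  return word
    .split("")
    .map((ch) => (guessed.includes(ch) ? ch : "_"))
    .join(" ");
}

console.log(letters);                      // → ['c', 'a', 't']
console.log(maskWord(currentWord, ["a"])); // → "_ a _"
```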

What are elegant ways to pair characters in a string?

For example, if the initial string s is "0123456789", desired output would be an array ["01", "23", "45", "67", "89"].
Looking for elegant solutions in JavaScript.
What I was thinking (very non-elegantly) is to iterate through the string by splitting on the empty string and using the Array.forEach method, insert a delimiter after every two characters, then split by that delimiter. This is not a good solution, but it's my starting point.
Edit: A RegExp solution has been posted. I'd love to see if there are any other approaches.
How about:
var array = ("0123456789").match(/\w{1,2}/g);
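Here we use .match() on your string to match one or two ({1,2}) word characters (\w) at a time and return an array of the results. The quantifier is greedy, so it takes two characters wherever two are available. Note that \w only covers letters, digits, and the underscore.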
Regarding your edit for a non-regex solution; you could do a far less elegant function like this:
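String.prototype.getPairs = function () {
  var pairs = [];
  // Step through the string two characters at a time.
  for (var i = 0; i < this.length; i += 2) {
    // slice(i, i + 2) replaces the deprecated substr(i, 2).
    pairs.push(this.slice(i, i + 2));
  }
  return pairs;
};
var array = ("01234567890").getPairs();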
If you want to use split (and why not), you could do the following:
s.split(/([^][^])/).filter(function(x){return x})
This splits using two consecutive characters as a delimiter, but because they're in a capture group, they're also part of split's result. Filtering that with the identity function eliminates the empty strings (between the "delimiters"). Note that in the case of an odd number of characters, the last character will be output as a split piece, not a delimiter, but it doesn't matter since it will still test truthy.
([^] is how you spell . in javascript if you really want to match any character. I had to look that up.)
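For another non-regex approach that leaves String.prototype untouched, one possible sketch uses Array.from with a computed length. The function name chunkString and its size parameter are hypothetical.

```javascript
// Hypothetical helper: cut a string into pieces of `size` characters.
function chunkString(s, size) {
  // Number of pieces, rounding up so a shorter final piece is kept.
  const count = Math.ceil(s.length / size);
  return Array.from({ length: count }, (_, i) =>
    s.slice(i * size, (i + 1) * size)
  );
}

console.log(chunkString("0123456789", 2));
// → ['01', '23', '45', '67', '89']
```

With an odd-length input such as "012", the last piece is simply shorter: ['01', '2'].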
