how to split a string based on value in javascript


Is there a way to separate a string into an array based on value? For example, if I have the string "1119994444455", how would I turn that into [111, 999, 44444, 55]? I've attempted this, but my method doesn't seem to be working.
my code:
var nums = [];
for(i in input){
i = parseInt(i);
if(i - beforeI != 0 && beforeI >= 0){
insertionIndex++;
}
nums[insertionIndex] += i.toString();
console.log(nums[insertionIndex]);
var beforeI = i
}

You can simply use a Regular Expression, like this
console.log("1119994444455".match(/(\d)\1*/g));
// [ '111', '999', '44444', '55' ]
Here, (\d) captures a single digit and \1* matches zero or more further occurrences of that same captured digit. The g flag at the end makes sure we don't stop after finding the first such match.
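For comparison, the loop from the question can also be made to work without a regular expression. The following is a minimal sketch, not the asker's original method; the helper name splitRuns is made up for illustration. It appends each character to the current run while it equals the previous character and starts a new run otherwise. Both this and the regex return strings, so map the result through Number if numbers are needed.

```javascript
// Hypothetical helper: groups consecutive identical characters into runs.
function splitRuns(input) {
  const runs = [];
  let previous = null;
  for (const ch of input) {
    if (ch === previous) {
      runs[runs.length - 1] += ch; // same value: extend the current run
    } else {
      runs.push(ch); // value changed: start a new run
    }
    previous = ch;
  }
  return runs;
}

console.log(splitRuns("1119994444455")); // [ '111', '999', '44444', '55' ]
console.log(splitRuns("1119994444455").map(Number)); // [ 111, 999, 44444, 55 ]
```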

Related

How to determine matched group's offset in JavaScript's replace? [duplicate]

I want to match a regex like /(a).(b)(c.)d/ with "aabccde", and get the following information back:
"a" at index = 0
"b" at index = 2
"cc" at index = 3
How can I do this? String.match returns list of matches and index of the start of the complete match, not index of every capture.
Edit: A test case which wouldn't work with plain indexOf
regex: /(a).(.)/
string: "aaa"
expected result: "a" at 0, "a" at 2
Note: The question is similar to Javascript Regex: How to find index of each subexpression?, but I cannot modify the regex to make every subexpression a capturing group.

This is now part of native JavaScript: the following proposal reached stage 4 and was included in ES2022:
RegExp Match Indices for ECMAScript
ECMAScript RegExp Match Indices provide additional information about the start and end indices of captured substrings relative to the start of the input string.
...We propose the adoption of an additional indices property on the array result (the substrings array) of RegExp.prototype.exec(). This property would itself be an indices array containing a pair of start and end indices for each captured substring. Any unmatched capture groups would be undefined, similar to their corresponding element in the substrings array. In addition, the indices array would itself have a groups property containing the start and end indices for each named capture group.
Here's an example of how things would work. The following snippets run without errors in, at least, Chrome:
const re1 = /a+(?<Z>z)?/d;
// indices are relative to start of the input string:
const s1 = "xaaaz";
const m1 = re1.exec(s1);
console.log(m1.indices[0][0]); // 1
console.log(m1.indices[0][1]); // 5
console.log(s1.slice(...m1.indices[0])); // "aaaz"
console.log(m1.indices[1][0]); // 4
console.log(m1.indices[1][1]); // 5
console.log(s1.slice(...m1.indices[1])); // "z"
console.log(m1.indices.groups["Z"][0]); // 4
console.log(m1.indices.groups["Z"][1]); // 5
console.log(s1.slice(...m1.indices.groups["Z"])); // "z"
// capture groups that are not matched return `undefined`:
const m2 = re1.exec("xaaay");
console.log(m2.indices[1]); // undefined
console.log(m2.indices.groups.Z); // undefined
So, for the code in the question, we could do:
const re = /(a).(b)(c.)d/d;
const str = 'aabccde';
const result = re.exec(str);
// indices[0], like result[0], describes the indices of the full match
const matchStart = result.indices[0][0];
result.forEach((matchedStr, i) => {
const [startIndex, endIndex] = result.indices[i];
console.log(`${matchedStr} from index ${startIndex} to ${endIndex} in the original string`);
console.log(`From index ${startIndex - matchStart} to ${endIndex - matchStart} relative to the match start\n-----`);
});
Output:
aabccd from index 0 to 6 in the original string
From index 0 to 6 relative to the match start
-----
a from index 0 to 1 in the original string
From index 0 to 1 relative to the match start
-----
b from index 2 to 3 in the original string
From index 2 to 3 relative to the match start
-----
cc from index 3 to 5 in the original string
From index 3 to 5 relative to the match start
Keep in mind that the indices array contains the indices of the matched groups relative to the start of the string, not relative to the start of the match.
A polyfill is available here.
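The same idea can be wrapped in a small helper that returns the offset of every capture group. This is only a sketch: the name groupOffsets and the shape of the returned objects are assumptions for illustration, and it needs a runtime that supports the d flag. It also covers the "aaa" test case from the question, where a plain indexOf search fails.

```javascript
// Hypothetical helper (not part of the proposal): returns text/start/end for
// each capture group of the first match, using the `d` (hasIndices) flag.
function groupOffsets(re, str) {
  // Re-create the regex with the `d` flag if the caller did not set it.
  const flags = re.flags.includes("d") ? re.flags : re.flags + "d";
  const match = new RegExp(re.source, flags).exec(str);
  if (match === null) return [];
  // match.indices[0] describes the whole match; groups start at position 1.
  return match.indices.slice(1).map((span, i) =>
    span === undefined
      ? { text: undefined, start: -1, end: -1 } // group did not participate
      : { text: match[i + 1], start: span[0], end: span[1] }
  );
}

console.log(groupOffsets(/(a).(b)(c.)d/, "aabccde")); // "a" at 0, "b" at 2, "cc" at 3
console.log(groupOffsets(/(a).(.)/, "aaa")); // "a" at 0, "a" at 2
```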

I wrote MultiRegExp for this a while ago. As long as you don't have nested capture groups, it should do the trick. It works by inserting capture groups between those in your RegExp and using all the intermediate groups to calculate the requested group positions.
var exp = new MultiRegExp(/(a).(b)(c.)d/);
exp.exec("aabccde");
should return
{0: {index:0, text:'a'}, 1: {index:2, text:'b'}, 2: {index:3, text:'cc'}}

I created a little regexp Parser which is also able to parse nested groups like a charm. It's small but huge. No really. Like Donalds hands. I would be really happy if someone could test it, so it will be battle tested. It can be found at: https://github.com/valorize/MultiRegExp2
Usage:
let regex = /a(?: )bc(def(ghi)xyz)/g;
let regex2 = new MultiRegExp2(regex);
let matches = regex2.execForAllGroups('ababa bcdefghixyzXXXX');
Will output:
[ { match: 'defghixyz', start: 8, end: 17 },
{ match: 'ghi', start: 11, end: 14 } ]

Updated Answer: 2022
See String.prototype.matchAll
The matchAll() method matches the string against a regular expression and returns an iterator of matching results.
Each match is an array, with the matched text as the first item, and then one item for each parenthetical capture group. It also includes the extra properties index and input.
let regexp = /t(e)(st(\d?))/g;
let str = 'test1test2';
for (let match of str.matchAll(regexp)) {
console.log(match)
}
// => ['test1', 'e', 'st1', '1', index: 0, input: 'test1test2', groups: undefined]
// => ['test2', 'e', 'st2', '2', index: 5, input: 'test1test2', groups: undefined]
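Note that the index property on each matchAll result is the offset of the whole match only, not of the individual groups. If the runtime supports the d flag (an assumption; it is part of ES2022), the two features can be combined to get the offsets of every group in every match. A minimal sketch with the same pattern; the spans variable is an illustrative name:

```javascript
const regexp = /t(e)(st(\d?))/gd; // same pattern as above, plus the `d` flag
const str = "test1test2";

// For each match, collect [text, start, end] for the whole match and every group.
const spans = [...str.matchAll(regexp)].map((match) =>
  match.indices.map((span, i) =>
    span === undefined ? [undefined, -1, -1] : [match[i], span[0], span[1]]
  )
);

console.log(spans[0]); // whole match "test1" spans 0..5, group "e" spans 1..2, ...
console.log(spans[1]); // whole match "test2" spans 5..10, group "e" spans 6..7, ...
```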

Based on the ECMAScript regular expression syntax, I've written a parser, in the form of an extension of the RegExp class, which solves this problem (a fully indexed exec method) as well as other limitations of the JavaScript RegExp implementation, for example group-based search and replace. You can test and download the implementation here (it is also available as an NPM module).
The implementation works as follows (small example):
//Retrieve content and position of: opening-, closing tags and body content for: non-nested html-tags.
var pattern = '(<([^ >]+)[^>]*>)([^<]*)(<\\/\\2>)';
var str = '<html><code class="html plain">first</code><div class="content">second</div></html>';
var regex = new Regex(pattern, 'g');
var result = regex.exec(str);
console.log(5 === result.length);
console.log('<code class="html plain">first</code>'=== result[0]);
console.log('<code class="html plain">'=== result[1]);
console.log('first'=== result[3]);
console.log('</code>'=== result[4]);
console.log(5=== result.index.length);
console.log(6=== result.index[0]);
console.log(6=== result.index[1]);
console.log(31=== result.index[3]);
console.log(36=== result.index[4]);
I also tried the implementation from @velop, but it seems buggy. For example, it does not handle backreferences correctly, e.g. "/a(?: )bc(def(\1ghi)xyz)/g": when parentheses are added in front, the backreference \1 needs to be incremented accordingly (which is not the case in his implementation).

So, you have a text and a regular expression:
txt = "aabccde";
re = /(a).(b)(c.)d/;
The first step is to get the list of all substrings that match the regular expression:
subs = re.exec(txt);
Then, you can do a simple search on the text for each substring. You will have to keep in a variable the position of the last substring. I've named this variable cursor.
var cursor = subs.index;
for (var i = 1; i < subs.length; i++){
sub = subs[i];
index = txt.indexOf(sub, cursor);
cursor = index + sub.length;
console.log(sub + ' at index ' + index);
}
EDIT: Thanks to @nhahtdh, I've improved the mechanism and made a complete function:
String.prototype.matchIndex = function(re){
var res = [];
var subs = this.match(re);
for (var cursor = subs.index, l = subs.length, i = 1; i < l; i++){
var index = cursor;
if (i+1 !== l && subs[i] !== subs[i+1]) {
var nextIndex = this.indexOf(subs[i+1], cursor);
while (true) {
var currentIndex = this.indexOf(subs[i], index);
if (currentIndex !== -1 && currentIndex <= nextIndex)
index = currentIndex + 1;
else
break;
}
index--;
} else {
index = this.indexOf(subs[i], cursor);
}
cursor = index + subs[i].length;
res.push([subs[i], index]);
}
return res;
}
console.log("aabccde".matchIndex(/(a).(b)(c.)d/));
// [ [ 'a', 1 ], [ 'b', 2 ], [ 'cc', 3 ] ]
console.log("aaa".matchIndex(/(a).(.)/));
// [ [ 'a', 0 ], [ 'a', 1 ] ] <-- problem here
console.log("bababaaaaa".matchIndex(/(ba)+.(a*)/));
// [ [ 'ba', 4 ], [ 'aaa', 6 ] ]

I'm not sure exactly what your requirements are for your search, but here's how you could get the desired output in your first example using RegExp.prototype.exec() and a while loop.
JavaScript
var myRe = /^a|b|c./g;
var str = "aabccde";
var myArray;
while ((myArray = myRe.exec(str)) !== null)
{
var msg = '"' + myArray[0] + '" ';
msg += "at index = " + (myRe.lastIndex - myArray[0].length);
console.log(msg);
}
Output
"a" at index = 0
"b" at index = 2
"cc" at index = 3
Using the lastIndex property, you can subtract the length of the currently matched string to obtain the starting index.
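The array returned by exec() also carries the start offset directly in its index property, so the subtraction is not strictly needed. A small sketch showing that both give the same number (the found array and its field names are illustrative). Keep in mind that this pattern finds three separate matches rather than the capture groups of a single match.

```javascript
const re = /^a|b|c./g;
const str = "aabccde";
const found = [];
let m;
while ((m = re.exec(str)) !== null) {
  // m.index is where this match starts; re.lastIndex is where the next search begins.
  found.push({
    text: m[0],
    index: m.index,
    viaLastIndex: re.lastIndex - m[0].length,
  });
}
console.log(found); // "a" at 0, "b" at 2, "cc" at 3; both offsets agree
```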

issue with substring indexing

Instructions for this kata:
In this Kata, we will check if a string contains consecutive letters as they appear in the English alphabet and if each letter occurs only once.
It seems that my code is indexing the strings differently per function call on this one. For example, on the first test, "abcd", the starting index is shown as 0, which is correct, but on the second example, "himjlk", the
var subString = alphabet.substring(startIndex, length);
returns "g", instead of "h"
troubleshooting this section
var length = orderedString.length;
//startChar for string comparison
var startChar = orderedString.charAt(0);
//find index in alphabet of first character in orderedString.
var startIndex = alphabet.indexOf(startChar);
//create substring of alphabet with start index of orderedString and orderedString.length
var subString = alphabet.substring(startIndex, length);
function solve(s) {
//alphabet string to check against
const alphabet = `abcdefghijklmnopqrstuvwxyz`;
//check s against alphabet
//empty array to order input string
var ordered = [];
//iterate through alphabet, checking against s
//and reorder input string to be alphabetized
for (var z in alphabet) {
var charToCheck = alphabet[z];
for (var i in s) {
if (charToCheck === s[i]) {
ordered.push(s[i]);
}
//break out of loop if lengths are the same
if (ordered.length === s.length) {
break;
}
}
if (ordered.length === s.length) {
break;
}
}
//join array back into string
var orderedString = ordered.join(``);
//length for future alphabet substring for comparison
var length = orderedString.length;
//startChar for string comparison
var startChar = orderedString.charAt(0);
//find index in alphabet of first character in orderedString.
var startIndex = alphabet.indexOf(startChar);
//create substring of alphabet with start index of orderedString and orderedString.length
var subString = alphabet.substring(startIndex, length);
//return if the two are a match
return subString == orderedString ? true : false;
}
console.log(solve("abdc")); //expected `true`
console.log(solve("himjlk")); // expected `true`
console.log(solve("abdc")); should produce the substring "abcd" and return true, which it does.
console.log(solve("himjlk")); should put together "hijklm" and return true. Instead, the substring comes out as "g" (index 6 of the alphabet) rather than starting at index 7, "h", so the function returns false. I'm not sure why it's doing this.

The problem is that you're using substring() instead of substr(). Though the two sound similar, there's a difference.
With substring(), the second parameter isn't the length, as you might have expected. It's actually the index at which to stop (exclusive).
That your function works as expected with the string abcd is pure coincidence, since in this case the start index is 0, so the length and the end index are the same.
function solve(s){
const alphabet = `abcdefghijklmnopqrstuvwxyz`;
var ordered = [];
for(var z in alphabet){
var charToCheck = alphabet[z];
for(var i in s){
if(charToCheck === s[i]){
ordered.push(s[i]);
}
if(ordered.length === s.length){ break; }
}
if(ordered.length === s.length){ break; }
}
var orderedString = ordered.join(``);
var length = orderedString.length;
var startChar = orderedString.charAt(0);
var startIndex = alphabet.indexOf(startChar);
var subString = alphabet.substr(startIndex, length);
return subString == orderedString ? true: false;
}
console.log(solve("himjlk"));
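To see why the original code produced "g": substring() swaps its two arguments when the start is greater than the end, so substring(7, 6) is treated as substring(6, 7). Also note that substr() is marked as a legacy feature, so an alternative is to keep substring() (or slice()) and pass startIndex + length as the end index. A short sketch with the values from the question; the variable names buggy, fixed and sliced are illustrative:

```javascript
const alphabet = "abcdefghijklmnopqrstuvwxyz";
const startIndex = alphabet.indexOf("h"); // 7
const length = "hijklm".length; // 6

// start > end, so the arguments are swapped: characters from index 6 up to 7.
const buggy = alphabet.substring(startIndex, length);
// End index computed as start + length.
const fixed = alphabet.substring(startIndex, startIndex + length);
// slice() takes the same (start, end) pair but never swaps its arguments.
const sliced = alphabet.slice(startIndex, startIndex + length);

console.log(buggy); // "g"
console.log(fixed); // "hijklm"
console.log(sliced); // "hijklm"
```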

Your approach is also correct. Here is another solution using sort() and charCodeAt(). Instead of finding the index and then slicing the alphabet to compare, just use includes():
function check(str){
let org = [...Array(26)].map((x,i) => String.fromCharCode(i + 97)).join('');
str = str.split('').sort((a, b) => a.charCodeAt(0) - b.charCodeAt(0)).join('');
return org.includes(str);
}
console.log(check("abdc"))//true
console.log(check("himjlk"));//true
console.log(check("himjlkp"));//false
Explanation:
First line:
let org = [...Array(26)].map((x,i) => String.fromCharCode(i + 97)).join('');
is used to create the string "abcd...xyz".
[...Array(26)] creates an array of 26 undefined values (one per letter of the alphabet).
map() is a method that takes a callback and creates a new array from the callback's return values. The first parameter of the callback, x, is the element itself, which is undefined here (because all the values in the array are undefined).
i, the second parameter, is the index of the element, running from 0 to 25.
String.fromCharCode is a function that takes a character code (an integer) and converts it to a string. For example, the character code for "a" is 97, so String.fromCharCode(97) returns "a"; 98 gives "b", 99 gives "c", and so on.
So after map(), an array like ["a", "b", ..., "z"] is generated.
join('') converts that array to a string.
Second line:
str is the given string. str.split('') converts the string to an array. For example,
if str is "abdc", it returns ["a","b","d","c"].
sort() is the array method that takes a comparison callback. Its two parameters, a and b, are the two values being compared during the sort.
charCodeAt is the reverse of String.fromCharCode. For example, "a".charCodeAt(0) returns 97, "b".charCodeAt(0) returns 98, and so on.
Returning a.charCodeAt(0) - b.charCodeAt(0) from the callback sorts the array in ascending order, and join('') converts the array back to a string.
So the string "abdc" becomes "abcd".
Third line:
The third line is the main one. org is the string "abcdefghijklmnopqrstuvwxyz". If a string is a substring of org, its letters are consecutive, in alphabetical order, and each occurs only once. So we check whether the sorted str is included in org.
You can simplify the second line to
str = str.split('').sort().join('');
because if no callback is passed to sort(), it sorts in the default order, which compares elements as strings by UTF-16 code unit. For lowercase letters, that is alphabetical order.
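An alternative sketch that skips building the alphabet string: sort the character codes numerically and check that each one is exactly one more than the previous. The function name isConsecutive is made up for illustration, and it assumes the input consists of lowercase letters only, like the kata's examples.

```javascript
// Hypothetical alternative: true when the characters, once sorted, form an
// unbroken run of consecutive character codes with no repeats.
function isConsecutive(str) {
  const codes = [...str].map((c) => c.charCodeAt(0)).sort((a, b) => a - b);
  // Every code after the first must be exactly previous + 1.
  return codes.every((code, i) => i === 0 || code === codes[i - 1] + 1);
}

console.log(isConsecutive("abdc")); // true
console.log(isConsecutive("himjlk")); // true
console.log(isConsecutive("himjlkp")); // false ("n" and "o" are missing)
console.log(isConsecutive("aab")); // false (repeated letter)
```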

How can I sum all numbers 1 from a string?

I need to sum all numbers 1 from a string!
For example: "00110010" = 1+1+1 = 3...
psum will receive this result and then I will check
if(psum >= 3){
return person;
}
I need to find a way to solve it in javascript ES6 but I can't use any for, while or forEach loop, unfortunately!!!
Could you help me?

You can use the reduce() method. Passing 0 as the initial value keeps it from throwing on an empty string.
let input = '00110010'
let array = input.split("").map(x => parseInt(x, 10));
let sum = array.reduce((acc, val) => {
return acc + val;
}, 0);
console.log(sum)

In one statement:
let psum = "00110010".split('').reduce((t, n) => {return t + parseInt(n)}, 0);
console.log(psum);

Note that summing the numbers 1 comes down to counting the numbers 1, which is what the following solutions do in different ways:
With match
You could use a regular expression /1/g:
var p = "00110010";
var psum = (p.match(/1/g) || []).length;
console.log(psum);
match returns an array of substrings that match with the pattern 1. The / just delimit this regular expression, and the g means that all matches should be retrieved (global). The length of the returned array thus corresponds to the number of 1s in the input. If there are no matches at all, then match will return null, so that does not have a .length property. To take care of that || [] will check for that null (which is falsy in a boolean expression) and so [] will be taken instead of null.
With replace
This is a similar principle, but by matching non-1 characters and removing them:
var p = "00110010";
var psum = p.replace(/[^1]/g, "").length;
console.log(psum);
[^1] means: a character that is not 1. replace will replace all matches with the second argument (empty string), which comes down to returning all characters that do not match. This is like a double negative: return characters that do not match with not 1. So you get only the 1s :-) .length will count those.
With split:
var p = "00110010";
var psum = p.split("1").length - 1;
console.log(psum);
split splits the string into an array of substrings that do not have the given substring ("1"). So even if there are no "1" at all, you get one such substring (the whole string). This means that by getting the length, we should reduce it by 1 to get the number of 1s.
With a recursive function:
var p = "00110010";
var count1 = p => p.length && ((p[0] == "1") + count1(p.slice(1)));
var psum = count1(p);
console.log(psum);
Here the function count1 is introduced. It first checks if the given string p is empty. If so, length is zero, and that is returned. If not empty, the first character is compared with 1. This can be false or true. This result is converted to 0 or 1 respectively and added to a recursive call result. That recursive call counts the number 1s in the rest of the input (excluding the first character in which the 1s were already counted).
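As a quick sanity check, the loop-free approaches above can be run side by side on a few inputs, including ones with no 1s at all. This is only a sketch: the counters object and its keys are illustrative names, and a filter-based variant is added for comparison. It uses map rather than a loop, in keeping with the question's constraint.

```javascript
// Each entry counts the "1" characters in a string in a different way.
const counters = {
  match: (p) => (p.match(/1/g) || []).length,
  replace: (p) => p.replace(/[^1]/g, "").length,
  split: (p) => p.split("1").length - 1,
  filter: (p) => [...p].filter((c) => c === "1").length,
};

const inputs = ["00110010", "0000", "", "111"];
// Build one row per input, with one column per counting approach.
const table = inputs.map((p) =>
  Object.fromEntries(
    Object.entries(counters).map(([name, count]) => [name, count(p)])
  )
);
console.log(table); // every approach gives 3, 0, 0 and 3 respectively
```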

Separate value from string using javascript

I have a string in which every value is between [] and it has a . at the end. How can I separate all values from the string?
This is the example string:
[value01][value02 ][value03 ]. [value04 ]
//want something like this
v1 = value01;
v2 = value02;
v3 = value03;
v4 = value04
The number of values is not constant. How can I get all values separately from this string?

Use regular expressions to specify multiple separators. Please check the following posts:
How do I split a string with multiple separators in javascript?
Split a string based on multiple delimiters
var str = "[value01][value02 ][value03 ]. [value04 ]"
var arr = str.split(/[\[\]\.\s]+/);
arr.shift(); arr.pop(); //discard the first and last "" elements
console.log( arr ); //output: ["value01", "value02", "value03", "value04"]
How This Works
.split(/[\[\]\.\s]+/) splits the string wherever it finds one or more of the following characters: [, ], ., or whitespace. Since these characters are also found at the beginning and end of the string, .shift() discards the first element and .pop() discards the last element, both of which are empty strings. However, you may want to use .filter() instead, in which case you can replace lines 2 and 3 with:
var arr = str.split(/[\[\]\.\s]+/).filter(function(elem) { return elem.length > 0; });
Now you can use jQuery/JS to iterate through the values:
$.each( arr, function(i,v) {
console.log( v ); // outputs the i'th value;
});
And arr.length will give you the number of elements you have.

If you want to get the characters between "[" and "]" and the data is regular and always has the pattern:
'[chars][chars]...[chars]'
then you can get the chars using match to get sequences of characters that aren't "[" or "]":
var values = '[value01][value02 ][value03 ][value04 ]'.match(/[^\[\]]+/g)
which returns an array, so values is:
["value01", "value02 ", "value03 ", "value04 "]
Match is very widely supported, so no cross browser issues.
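If the trailing spaces inside the brackets are unwanted, one option is to capture the text between each pair of brackets and trim it. A minimal sketch using matchAll, which assumes an ES2020 runtime; the variable names are illustrative, and it is run against the string from the question, including the ". " separator:

```javascript
const str = "[value01][value02 ][value03 ]. [value04 ]";

// Capture everything between "[" and the next "]", then strip surrounding whitespace.
const values = [...str.matchAll(/\[([^\]]*)\]/g)].map((m) => m[1].trim());

console.log(values); // [ 'value01', 'value02', 'value03', 'value04' ]
```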

Here's a fiddle: http://jsfiddle.net/5xVLQ/
Regex pattern: /(\w)+/ig
Matches all words using \w (letters, digits, and underscores). Whitespace, dots, and square brackets are all non-matching, so they don't get returned.
What I do is create an object to hold the results as key/value pairs such as v1:'value01'. You can iterate through this object, or you can access the values directly using objRes.v1
var str = '[value01][value02 ][value03 ]. [value04 ]';
var myRe = /(\w)+/ig;
var res;
var objRes = {};
var i=1;
while ( ( res = myRe.exec(str) ) != null )
{
objRes['v'+i] = res[0];
i++;
}
console.log(objRes);

how to retrieve a string between two of the same character

I know how to use substring(), but here I have a problem: I'd like to retrieve a number between two "_" from a string of unknown length. Here is my string, for example:
7_28_li
and I want to get the 28. How can I do so?
Thanks.

Regex
'7_28_li'.match(/_(\d+)_/)[1]
The slashes inside match make its contents a regex.
_s are taken literally
( and ) are for retrieving the contents (the target number) later
\d is a digit character
+ is "one or more".
The [1] on the end accesses what was matched by the first set of parentheses: the one or more (+) digits (\d).
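One caveat: match() returns null when nothing matches, so indexing the result with [1] would throw a TypeError. A small sketch with a null check; the function name numberBetweenUnderscores is hypothetical:

```javascript
// Hypothetical wrapper: returns the first run of digits enclosed by
// underscores as a number, or null if the pattern does not occur.
function numberBetweenUnderscores(str) {
  const m = str.match(/_(\d+)_/);
  return m === null ? null : Number(m[1]);
}

console.log(numberBetweenUnderscores("7_28_li")); // 28
console.log(numberBetweenUnderscores("7_li")); // null
```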
Loop
var str = '7_28_li';
var state = 0; //How many underscores have gone by
var num = '';
for (var i = 0; i < str.length; i++) {
if (str[i] == '_') state++;
else if (state == 1) num += str[i];
};
num = parseInt(num);
Probably more efficient, but kind of long and ugly.
Split
'7_28_li'.split('_')[1]
Split it into an array, then get the second element.
IndexOf
var str = "7_28_li";
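var start = str.indexOf('_') + 1;
var num = str.substring(start, str.indexOf('_', start));
Get the start and end points. This uses the little-known second parameter of indexOf, the position to start searching from; starting right after the first underscore keeps it correct wherever that underscore sits. This works better than lastIndexOf because it is guaranteed to give the first number between _s, even when there are more than 2 underscores.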

First find the index of _, and then find the next position of _. Then get the substring between them.
var data = "7_28_li";
var idx = data.indexOf("_");
console.log(data.substring(idx + 1, data.indexOf("_", idx + 1)));
// 28
You can understand that better, like this
var data = "7_28_li";
var first = data.indexOf("_");
var next = data.indexOf("_", first + 1);
console.log(data.substring(first + 1, next));
// 28
Note: The second argument to indexOf is to specify where to start looking from.

Probably the easiest way to do it is to call split on your string, with your delimiter ("_" in this case) as the argument. It'll return an array with 7, 28, and li as elements, so you can select the middle one.
"7_28_li".split("_")[1]
This will work if it'll always be 3 elements. If it's more, divide the length property by 2 and floor it to get the right element.
var splitstring = "7_28_li".split("_")
console.log(splitstring[Math.floor(splitstring.length/2)]);
I'm not sure how you want to handle even length strings, but all you have to do is set up an if statement and then do whatever you want.

If you know there will be exactly two underscores, you can use this:
var str = "7_28_li";
var res = str.substring(str.indexOf("_") +1, str.lastIndexOf("_"));
If you want to find the string between the first two underscores:
var str = "7_28_li";
var firstIndex = str.indexOf("_");
var secondIndex = str.indexOf("_", firstIndex+1);
var res = str.substring(firstIndex+1, secondIndex);
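The snippets above assume both underscores are present; when one is missing, indexOf returns -1 and substring quietly returns something unintended. A minimal sketch of a guarded version; the helper name betweenFirstTwo and the choice to return null are assumptions for illustration:

```javascript
// Hypothetical helper: text between the first two occurrences of `delimiter`,
// or null when the string contains fewer than two of them.
function betweenFirstTwo(str, delimiter) {
  const first = str.indexOf(delimiter);
  if (first === -1) return null;
  const start = first + delimiter.length;
  // Search for the second delimiter only after the end of the first one.
  const second = str.indexOf(delimiter, start);
  if (second === -1) return null;
  return str.substring(start, second);
}

console.log(betweenFirstTwo("7_28_li", "_")); // "28"
console.log(betweenFirstTwo("17_28_li_x", "_")); // "28"
console.log(betweenFirstTwo("7_28", "_")); // null
```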
