Split a string using regex - javascript

I have a string and I want to split it into an array on '+', except where the '+' is inside brackets.
E.g. the string
"abc+OR+def+OR+(abc+AND+def)"
becomes
["abc", "OR", "def", "OR", "(abc+AND+def)"]
and the string
"(abc+AND+cde)+OR+(abc+AND+(cde+AND+fgh))"
becomes
["(abc+AND+cde)", "OR", "(abc+AND+(cde+AND+fgh))"]
Is it possible to do this using regular expressions?

You can do this with regex, but only in languages that support recursive regular expressions (for example, Perl or any language with PCRE).
It is not easy with JavaScript regexes, because they do not support recursion.
But it is possible with XRegExp and an additional plugin:
http://xregexp.com/plugins/#matchRecursive
Also please check these two links:
http://blog.stevenlevithan.com/archives/regex-recursion
http://blog.stevenlevithan.com/archives/javascript-match-nested

I don't think you could do this with regex. EDIT: per Silver, you could use regex.
One way would be to just parse the string character by character. I'll edit my answer with code in a minute.
EDIT: Here's a sample implementation (note: untested, may have a bug or two):
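function parseString (str) {
  // Start with one empty piece so += never appends to undefined
  var splitStr = [''], parentheses = 0, i = 0
  for (var j = 0; j < str.length; j++) {
    if (str[j] == '+' && !parentheses) {
      // Top-level '+': start a new piece
      splitStr[++i] = ''
      continue
    }
    if (str[j] == '(')
      parentheses++
    else if (str[j] == ')')
      parentheses--
    // Keep every other character, including brackets and nested '+'
    splitStr[i] += str[j]
  }
  return splitStr
}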

You can use the match method of the String object with the following regex. Note that it only handles a single level of brackets, so it covers the first example but not the nested one:
stringObj.match(/([a-zA-Z]+)|([(]([a-zA-Z]+[+])+[a-zA-Z]+[)])+/gi);
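As a quick check of the pattern above (a sketch; the variable names are made up for illustration), running it against both example strings from the question shows that it handles one level of brackets but breaks a nested group apart:

```javascript
const pattern = /([a-zA-Z]+)|([(]([a-zA-Z]+[+])+[a-zA-Z]+[)])+/gi;

// One level of brackets: the bracketed group is kept whole
const flat = "abc+OR+def+OR+(abc+AND+def)".match(pattern);
console.log(flat);
// [ 'abc', 'OR', 'def', 'OR', '(abc+AND+def)' ]

// Nested brackets: the outer group is not matched as a unit
const nested = "(abc+AND+cde)+OR+(abc+AND+(cde+AND+fgh))".match(pattern);
console.log(nested);
// [ '(abc+AND+cde)', 'OR', 'abc', 'AND', '(cde+AND+fgh)' ]
```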

This regular expression would suit your needs.
(?<!\([\w\+]+)\+(?![\w+\+]+\))
See it in action here.
There is one caveat: negative lookbehind (?<!...) was only added to JavaScript in ES2018, so older engines will reject this pattern. (Note that (?!=...) is not a lookbehind; it is a negative lookahead for a literal "=".)
For anyone who is learning regular expressions, here is a walkthrough:
(?<!\([\w\+]+) is a negative lookbehind. It means "not preceded by ...". In this case, we're looking for something not preceded by (lettersOr+.
\+ is what we are looking for: a plus sign (escaped).
(?![\w+\+]+\)) is a negative lookahead. It means "not followed by ...". In this case, we're looking for something not followed by lettersOr+)
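For what it's worth, engines that implement ES2018 lookbehind accept the lookbehind form (?<!...). The following is only a sketch under that assumption, with an illustrative function name (splitOutsideBrackets is not from the answer). It reproduces both examples from the question, but since lookarounds do not count brackets, it should not be relied on for arbitrary nesting.

```javascript
// Assumes an engine with ES2018 lookbehind support (e.g. current Node.js or browsers).
function splitOutsideBrackets(str) {
  // Split on '+' unless it is preceded by "(letters/+" or followed by "letters/+)"
  return str.split(/(?<!\([\w+]+)\+(?![\w+]+\))/);
}

console.log(splitOutsideBrackets("abc+OR+def+OR+(abc+AND+def)"));
// [ 'abc', 'OR', 'def', 'OR', '(abc+AND+def)' ]
console.log(splitOutsideBrackets("(abc+AND+cde)+OR+(abc+AND+(cde+AND+fgh))"));
// [ '(abc+AND+cde)', 'OR', '(abc+AND+(cde+AND+fgh))' ]
```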

This function should work for you:
var PARENTH_STRING_PLACE_HOLDER = '__PARSTRINGHOLDER__';
var splitPlusNoParenthesis = function(str){
//Replace the parenthStrings with the placeholder
var parenthStrings = getParenthesizedStrings(str);
for(var i = 0; i < parenthStrings.length; i++){
str = str.replace(parenthStrings[i], PARENTH_STRING_PLACE_HOLDER);
}
//Split on '+'
var splitString = str.split('+');
//Replace all placeholders with the actual values
var parIndex = 0;
for(var i = 0; i < splitString.length; i++){
if(splitString[i] === PARENTH_STRING_PLACE_HOLDER){
splitString[i] = parenthStrings[parIndex++];
}
}
return splitString;
};
var getParenthesizedStrings = function(str){
var parenthStrings = [];
for(var startIndex = 0; startIndex < str.length; startIndex++){
if(str[startIndex] === '('){
var parenthCount = 1;
var endIndex = startIndex + 1;
for(; endIndex < str.length; endIndex++){
var character = str[endIndex];
if(character === '('){
parenthCount++;
} else if(character === ')'){
parenthCount--;
}
if(!parenthCount){
parenthStrings.push(str.substring(startIndex, endIndex + 1));
break;
}
}
startIndex = endIndex;
}
}
return parenthStrings;
};
Here's a fiddle to test.

Related

What is wrong with the logic of my character changing function?

I've tried to create a character-changing function for strings. It is supposed to change all the "-" to "_", but it only does so for the first occurrence and leaves the rest. If someone could explain, it would be great.
function kebabToSnake(str) {
var idNum = str.length;
for(var i = 0; i <= idNum; i++) {
var nStr = str.replace("-", "_");
}
return nStr;
}
var nStr = str.replace("-", "_");
So, on each iteration, you're replacing the first - character found in the original string, not in the string you've already replaced characters in. You can either call .replace on just one variable that you reassign:
function kebabToSnake(str) {
var idNum = str.length;
for(var i = 0; i < idNum; i++) {
str = str.replace("-", "_");
}
return str;
}
console.log(kebabToSnake('ab-cd-ef'));
(note that you should iterate from 0 to str.length - 1, not from 0 to str.length)
Or, much, much more elegantly, use a global regular expression:
function kebabToSnake(str) {
return str.replace(/-/g, '_');
}
console.log(kebabToSnake('ab-cd-ef'));
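To see why the original loop never progressed, here is a small illustration (the variable names are just for demonstration): replace with a string pattern changes only the first match and returns a new string, leaving its input untouched.

```javascript
const original = 'ab-cd-ef';

// Calling replace on the same unchanged string always yields the same result
const first = original.replace('-', '_');
const second = original.replace('-', '_');
console.log(first, second, original); // ab_cd-ef ab_cd-ef ab-cd-ef

// Feeding each result back in is what makes progress
const chained = original.replace('-', '_').replace('-', '_');
console.log(chained); // ab_cd_ef
```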

Compress characters aabbbcccc+++ to a#2b#3c#4+#3 in javascript

The above question was asked at an interview. The code must accept input like aabbbcccc+++ and output a#2b#3c#4+#3, based on the number of times each character repeats.
You can use a regex with a capture group and a callback in the replace function.
(.)\1+ - Here . matches any character and (.) captures it; \1+ matches that same captured character one or more additional times. Then, in the callback function, we return the captured character, a #, and the length of the match.
let str = `aabbbcccc+++`
let op = str.replace(/(.)\1+/g, function(match,first){
return first+'#'+match.length;
})
console.log(op)
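One thing to be aware of: because \1+ requires at least one repeat, a character that appears only once is left as-is by the pattern above. If single characters should also get a count, a possible variation (an assumption about the desired output, since the question does not say) is to use \1* instead:

```javascript
// compressRuns is an illustrative name, not from the answer above
function compressRuns(str) {
  // (.)\1* matches a character plus zero or more repeats of it
  return str.replace(/(.)\1*/g, (run, ch) => ch + '#' + run.length);
}

console.log(compressRuns('aabbbcccc+++')); // a#2b#3c#4+#3
console.log(compressRuns('abbc'));         // a#1b#2c#1
```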
You can try regex method or you can use the snippet below
Basic loop
function compress(str) {
let newstr = "";
let count = 1;
for (let i = 0; i < str.length; i++) {
if (str.charAt(i) === str.charAt(i + 1)) {
count += 1;
} else {
newstr += `${str.charAt(i)}#${count}`;
count = 1;
}
}
console.log(newstr);
}
compress("aaaabbbbbccccc++++");
Or use the regex method from the snippet above: https://stackoverflow.com/a/54326492/7444617

How to replace each occurrence of the same character/string in a text with a different outcome?

For example let's say I want to attach the index number of each 's' in a string to the 's's.
var str = "This is a simple string to test regex.";
var rm = str.match(/s/g);
for (let i = 0;i < rm.length ;i++) {
str = str.replace(rm[i],rm[i]+i);
}
console.log(str);
Output: This43210 is a simple string to test regex.
Expected output: This0 is1 a s2imple s3tring to tes4t regex.
I'd suggest, using replace():
let i = 0,
str = "This is a simple string to test regex.",
// result holds the resulting string after modification
// by String.prototype.replace(); here we use the
// anonymous callback function, with Arrow function
// syntax, and return the match (the 's' character)
// along with the index of that found character:
result = str.replace(/s/g, (match) => {
return match + i++;
});
console.log(result);
Corrected the code with the suggestion — in comments — from Ezra.
References:
Arrow functions.
"Regular expressions," from MDN.
String.prototype.replace().
For something like this, I would personally go with the split and test method. For example:
var str = "This is a simple string to test regex.";
var split = str.split(""); //Split out every char
var recombinedStr = "";
var count = 0;
for(let i = 0; i < split.length; i++) {
if(split[i] == "s") {
recombinedStr += split[i] + count;
count++;
} else {
recombinedStr += split[i];
}
}
console.log(recombinedStr);
A bit clunky, but works. It forgoes using regex statements though, so probably not exactly what you're looking for.

put dash after every n character during input from keyboard

$('.creditCardText').keyup(function() {
var foo = $(this).val().split("-").join(""); // remove hyphens
if (foo.length > 0) {
foo = foo.match(new RegExp('.{1,4}', 'g')).join("-");
}
$(this).val(foo);
});
I found the code above in a tutorial on putting a dash after every 4 characters. My question is: what if the character interval is not constant? In this example it is always 4, but what if the intervals are 3 characters, "-", 2 characters, "-", 4 characters, "-", 3 characters, "-", so that the result looks like 123-12-1234-123-123?
In this case, it is more convenient to just write normal code to solve the problem:
function format(input, format, sep) {
var output = "";
var idx = 0;
for (var i = 0; i < format.length && idx < input.length; i++) {
output += input.substr(idx, format[i]);
if (idx + format[i] < input.length) output += sep;
idx += format[i];
}
output += input.substr(idx);
return output;
}
Sample usage:
$('.creditCardText').keyup(function() {
var foo = $(this).val().replace(/-/g, ""); // remove hyphens
// You may want to remove all non-digits here
// var foo = $(this).val().replace(/\D/g, "");
if (foo.length > 0) {
foo = format(foo, [3, 2, 4, 3, 3], "-");
}
$(this).val(foo);
});
<script src="https://ajax.googleapis.com/ajax/libs/jquery/1.11.1/jquery.min.js"></script>
<input class="creditCardText" />
While it is possible to do partial matching and capturing with regex, the replacement has to be done with a replacement function. In the replacement function, we need to determine how many capturing groups actually captured some text. Since there is no clean solution with regex, I wrote a more general function as shown above.
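As a standalone illustration of the same idea, here is a sketch that uses slice rather than substr (substr is a legacy method). The name formatGroups and the sample inputs are made up for this example:

```javascript
function formatGroups(input, groups, sep) {
  const parts = [];
  let idx = 0;
  // Take one chunk per group size until the input runs out
  for (let i = 0; i < groups.length && idx < input.length; i++) {
    parts.push(input.slice(idx, idx + groups[i]));
    idx += groups[i];
  }
  // Anything beyond the last group is kept as a trailing chunk
  if (idx < input.length) parts.push(input.slice(idx));
  return parts.join(sep);
}

console.log(formatGroups('123121234123123', [3, 2, 4, 3, 3], '-')); // 123-12-1234-123-123
console.log(formatGroups('12312', [3, 2, 4, 3, 3], '-'));           // 123-12
```

Building an array of chunks and joining it avoids the separate check for whether a separator is still needed after each group.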
You can split it using a regular expression. In this case, I'm using an expression to check for non-space characters in groups of 3-2-4-3.
RegExp.exec will return a "match" array whose first element contains the whole matched string. After removing that first element, you can join the remaining groups with dashes.
var mystring = "123121234123"
var myRegexp = /^([^\s]{3})([^\s]{2})([^\s]{4})([^\s]{3})$/g
var match = myRegexp.exec(mystring);
if (match)
{
match.shift();
mystring = match.join("-")
console.log(mystring)
}
Per further comments, the OP clarified that they need fixed intervals for where to insert dashes. In that case, there are several ways to implement it; I think a regular expression would probably be the worst of them, in other words, overkill and an overly complicated solution.
One simpler option would be to create a new character array and, in a loop, append character by character, adding a dash every time you get to an index where you want one. This would probably be the easiest to write and grok after the fact, but a little more verbose.
Or you could convert to a character array and use an 'insert into array at index'-type function like splice() (see Insert Item into Array at a Specific Index or Inserting string at position x of another string for some examples).
Pass the input value and the indexes at which to insert the separator. The function first removes any existing separators, then inserts a separator at each of the given positions (the positions are indexes in the output string, counting the separators already inserted).
export function addSeparators(
input: string,
positions: number[],
separator: string
): string {
const inputValue = input.replace(/-/g, '').split(''); // remove existing separators and split characters into array
for (let i = 0; i < inputValue.length; i++) {
if (positions.includes(i)) inputValue.splice(i, 0, separator);
}
return inputValue.join('');
}

Counting vowels in javascript

I use this code to search and count vowels in the string,
a = "run forest, run";
a = a.split(" ");
var syl = 0;
for (var i = 0; i < a.length - 1; i++) {
for (var i2 = 0; i2 < a[i].length - 1; i2++) {
if ('aouie'.search(a[i][i2]) > -1) {
syl++;
}
}
}
alert(syl + " vowels")
Obviously it should alert 4 vowels, but it returns 3.
What's wrong, and how can it be simplified?
Try this:
var syl = ("|"+a+"|").split(/[aeiou]/i).length-1;
The | ensures there are no edge cases, such as having a vowel at the start or end of the string.
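To illustrate the split-based count with the string from the question (a sketch; the variable names are illustrative): each vowel acts as a separator, so the number of pieces is one more than the number of vowels.

```javascript
const text = "run forest, run";

// Every vowel is consumed as a separator between pieces
const pieces = ("|" + text + "|").split(/[aeiou]/i);
console.log(pieces);            // [ '|r', 'n f', 'r', 'st, r', 'n|' ]
console.log(pieces.length - 1); // 4
```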
Regarding your code, your if condition needs no i2
if('aouie'.search(a[i]) > -1){
I wonder, why all that use of arrays and nested loops, the below regex could do it better,
var str = "run forest, run";
var matches = str.match(/[aeiou]/gi);
var count = matches ? matches.length : 0;
alert(count + " vowel(s)");
Demo
Try:
a = "run forest, run";
var syl = 0;
for(var i=0; i<a.length; i++) {
if('aouie'.search(a[i]) > -1){
syl++;
}
}
alert(syl+" vowels")
First, the split is useless since you can already cycle through every character.
Second: you need to use i<a.length, this gets the last character in the string, too.
The simplest way is
s.match(/[aeiou]/gi).length
You can use .match to test a string against a regular expression. The g flag is global, so the whole string is searched. The i flag makes the match case-insensitive. Note that match returns null when there are no vowels, so guard against that:
function getVowels(str) {
var m = str.match(/[aeiou]/gi);
return m === null ? 0 : m.length;
}
