Separating words with Regex - javascript

I am trying to get this result: 'Summer-is-here'. Why does the code below generate extra spaces? (Current result: '-Summer--Is- -Here-').
function spinalCase(str) {
var newA = str.split(/([A-Z][a-z]*)/).join("-");
return newA;
}
spinalCase("SummerIs Here");

You are using a variety of split where the regexp contains a capturing group (inside parentheses), which has a specific meaning, namely to include all the splitting strings in the result. So your result becomes:
["", "Summer", "", "Is", " ", "Here", ""]
Joining that with - gives you the result you see. But you can't just remove the unnecessary capture group from the regexp, because then the split would give you
["", "", " ", ""]
because the words themselves are now the separators, so split discards them and returns only what lies between them (empty strings and the space). So this doesn't really work.
If you want to use split, try splitting on zero-width or space-only matches looking ahead to an uppercase letter:
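To see the difference concretely, here is a minimal sketch that runs both forms of the split on the input from the question. The variable names are only for illustration.

```javascript
// Compare String.prototype.split with and without a capturing group.
const input = "SummerIs Here";

// With a capturing group, the matched separators are kept in the result.
const withGroup = input.split(/([A-Z][a-z]*)/);
console.log(withGroup); // ["", "Summer", "", "Is", " ", "Here", ""]

// Without the group, the matched words are discarded as separators.
const withoutGroup = input.split(/[A-Z][a-z]*/);
console.log(withoutGroup); // ["", "", " ", ""]

// Joining the first array reproduces the output from the question.
console.log(withGroup.join("-")); // "-Summer--Is- -Here-"
```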
> "SummerIs Here".split(/\s*(?=[A-Z])/)
                            ^^^^^^^^^ LOOK-AHEAD
< ["Summer", "Is", "Here"]
Now you can join that to get the result you want, but without the lowercase mapping, which you could do with:
"SummerIs Here" .
split(/\s*(?=[A-Z])/) .
map(function(elt, i) { return i ? elt.toLowerCase() : elt; }) .
join('-')
which gives you what you want.
Using replace as suggested in another answer is also a perfectly viable solution. In terms of best practices, consider the following code from Ember:
var DECAMELIZE_REGEXP = /([a-z\d])([A-Z])/g;
var DASHERIZE_REGEXP = /[ _]/g;
function decamelize(str) {
return str.replace(DECAMELIZE_REGEXP, '$1_$2').toLowerCase();
}
function dasherize(str) {
return decamelize(str).replace(DASHERIZE_REGEXP, '-');
}
First, decamelize puts an underscore _ between each two-character sequence of a lower-case letter (or digit) followed by an upper-case letter, then lower-cases the result. Then, dasherize replaces underscores and spaces with dashes. This works well, except that it also lower-cases the first word in the string. You can sort of combine decamelize and dasherize here with
var SPINALIZE_REGEXP = /([a-z\d])\s*([A-Z])/g;
function spinalCase(str) {
return str.replace(SPINALIZE_REGEXP, '$1-$2').toLowerCase();
}
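If the goal is exactly 'Summer-is-here' (first word left capitalized), one possible variation is to lower-case only the capital that starts each following word. This is a sketch, not Ember code; spinalCaseKeepFirst is a hypothetical name.

```javascript
// Hypothetical helper (not part of Ember): keeps the first word's case.
function spinalCaseKeepFirst(str) {
  // At each boundary between a lower-case letter or digit and a capital,
  // absorb any whitespace, insert a dash, and lower-case the capital.
  return str.replace(/([a-z\d])\s*([A-Z])/g, function (match, before, upper) {
    return before + "-" + upper.toLowerCase();
  });
}

console.log(spinalCaseKeepFirst("SummerIs Here")); // "Summer-is-here"
```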

You want to separate capitalized words, but you are splitting the string on the capitalized words themselves. That is why you get those empty strings and spaces.
I think you are looking for this :
var newA = str.match(/[A-Z][a-z]*/g).join("-");

([A-Z][a-z]*) *(?!$|[a-z])
You can simply replace each match with $1-. See the demo:
https://regex101.com/r/nL7aZ2/1
var re = /([A-Z][a-z]*) *(?!$|[a-z])/g;
var str = 'SummerIs Here';
var subst = '$1-';
var result = str.replace(re, subst);

var newA = str.split(/ |(?=[A-Z])/).join("-");
You can change the regex like:
/ |(?=[A-Z])/ or /\s*(?=[A-Z])/
Result:
Summer-Is-Here

Related

split mixed description hashtags and text

I have a string like
var string = "#developers must split #hashtags";
I want to split it when a word starts with # symbol
I tried these two examples
var example1 = string.split(/(?=#)/g);
//result is ["#developers must split ", "#hashtags"]
var example2 = string.split(/(?:^|[ ])#([a-zA-Z]+)/g);
// result is ["", "developers", "must split", "hashtags", ""]
The result must look like this
var description = ["#developers", "must split", "#hashtags"]
JSFiddle example
I have a solution but it is a bit long, I want it short with regex. thank you,
When you split, the captured groups are included in the split results array. So you can capture the #word delimiter and omit the space before and after the delimiter with an expression like \s*(#\S+)\s*. Omit empty strings by filter-ing on an expression that tests the truthiness of each string (e.g.: x => x).
let result = "#developers must split #hashtags".split(/\s*(#\S+)\s*/g).filter(x => x);
console.log(result);
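To make the role of the filter visible, this sketch prints the array before and after filtering. The variable names are illustrative only.

```javascript
const text = "#developers must split #hashtags";

// The captured #word delimiters are kept; the surrounding spaces are consumed.
const raw = text.split(/\s*(#\S+)\s*/);
console.log(raw); // ["", "#developers", "must split", "#hashtags", ""]

// The empty strings come from the delimiters at the very start and end.
const parts = raw.filter(Boolean);
console.log(parts); // ["#developers", "must split", "#hashtags"]
```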

JavaScript: Amend the Sentence

I am having trouble with the JavaScript problem below.
Question:
You have been given a string s, which is supposed to be a sentence. However, someone forgot to put spaces between the different words, and for some reason they capitalized the first letter of every word. Return the sentence after making the following amendments:
Put a single space between the words.
Convert the uppercase letters to lowercase.
Example
For "CodefightsIsAwesome", the output should be "codefights is awesome";
For "Hello", the output should be "hello".
My current code is:
Right now, my second for-loop just manually slices the parts from the string.
How can I make this dynamic and insert "space" in front of the Capital String?
You can use String.prototype.match() with the RegExp /[A-Z][^A-Z]*/g to match A-Z followed by zero or more characters which are not A-Z (that is, up to the next capital letter or the end of the string); chain Array.prototype.map() to call .toLowerCase() on the matched words, then .join() with parameter " " to put a space character between the matches in the resulting string.
var str = "CodefightsIsAwesome";
var res = str.match(/[A-Z][^A-Z]*/g).map(word => word.toLowerCase()).join(" ");
console.log(res);
Alternatively, as suggested by @FissureKing, you can use String.prototype.replace() with .trim() and .toLowerCase() chained:
var str = "CodefightsIsAwesome";
var res = str.replace(/[A-Z][^A-Z]*/g, word => word + ' ').trim().toLowerCase();
console.log(res);
Rather than coding a loop, I'd do it in one line with a (reasonably) simple string replacement:
function amendTheSentence(s) {
return s.replace(/[A-Z]/g, function(m) { return " " + m.toLowerCase() })
.replace(/^ /, "");
}
console.log(amendTheSentence("CodefightsIsAwesome"));
console.log(amendTheSentence("noCapitalOnFirstWord"));
console.log(amendTheSentence("ThereIsNobodyCrazierThanI"));
That is, match any uppercase letter with the regular expression /[A-Z]/, replace the matched letter with a space plus that letter in lowercase, then remove any space that was added at the start of the string.
Further reading:
String .replace() method
Regular expressions
We can loop through once.
The below assumes the very first character should always be capitalized in our return array. If that is not true, simply remove the first if block from below.
For each character after that, we check to see if it is capitalized. If so, we add it to our return array, prefaced with a space. If not, we add it as-is into our array.
Finally, we join the array back into a string and return it.
const sentence = "CodefightsIsAwesome";
const amend = function(s) {
const ret = []; // declared locally to avoid creating an implicit global
for (let i = 0; i < s.length; i++) {
const char = s[i];
if (i === 0) {
ret.push(char.toUpperCase());
} else if (char.toUpperCase() === char) {
ret.push(` ${char.toLowerCase()}`);
} else {
ret.push(char);
}
}
return ret.join('');
};
console.log(amend(sentence));
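For comparison, here is a sketch of a single-pass loop that follows the original problem statement (all lower-case output). It assumes plain ASCII letters; amendSentence is a hypothetical name.

```javascript
// Hypothetical variant: single pass, all output lower-cased.
function amendSentence(s) {
  const parts = [];
  for (let i = 0; i < s.length; i++) {
    const ch = s[i];
    // Only A-Z count as capitals, so digits and punctuation never start a word.
    const isUpper = ch >= "A" && ch <= "Z";
    if (isUpper && i > 0) {
      parts.push(" ");
    }
    parts.push(ch.toLowerCase());
  }
  return parts.join("");
}

console.log(amendSentence("CodefightsIsAwesome")); // "codefights is awesome"
console.log(amendSentence("Hello")); // "hello"
```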

Split string by all spaces except those in parentheses

I'm trying to split text like the following on spaces:
var line = "Text (what is)|what's a story|fable called|named|about {Search}|{Title}"
but I want it to ignore the spaces within parentheses. This should produce an array with:
var words = ["Text", "(what is)|what's", "a", "story|fable", "called|named|about", "{Search}|{Title}"];
I know this should involve some sort of regex with line.match(). Bonus points if the regex removes the parentheses. I know that word.replace() would get rid of them in a subsequent step.
Use the following approach with a specific regex pattern (based on negative lookahead assertions):
var line = "Text (what is)|what's a story|fable called|named|about {Search}|{Title}",
words = line.split(/(?!\(.*)\s(?![^(]*?\))/g);
console.log(words);
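(?!\(.*) is a negative lookahead placed just before the separator \s. Since the next character must be whitespace, it can never be (, so this assertion is effectively redundant.
(?![^(]*?\)) ensures that the separator \s is not followed by a closing parenthesis ) with no opening parenthesis ( before it, i.e. that the space is not inside a pair of parentheses.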
Not a single regexp but does the job. Removes the parentheses and splits the text by spaces.
var words = line.replace(/[\(\)]/g,'').split(" ");
One approach which is useful in some cases is to replace spaces inside parens with a placeholder, then split, then unreplace:
var line = "Text (what is)|what's a story|fable called|named|about {Search}|{Title}";
var result = line.replace(/\((.*?)\)/g, m => m.replace(/ /g, 'SPACE')) // /g so every space inside the parens is replaced
.split(' ')
.map(x => x.replace(/SPACE/g, ' '));
console.log(result);
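Another option that avoids both lookaheads and placeholders is to walk the string once and track whether we are inside parentheses. A minimal sketch, assuming parentheses are balanced; splitOutsideParens is a hypothetical name.

```javascript
// Hypothetical helper: split on spaces that are not inside parentheses.
function splitOutsideParens(line) {
  const words = [];
  let current = "";
  let depth = 0; // how many "(" are currently open

  for (const ch of line) {
    if (ch === "(") {
      depth++;
    } else if (ch === ")" && depth > 0) {
      depth--;
    }

    if (ch === " " && depth === 0) {
      // A top-level space ends the current word (skip empty words).
      if (current) words.push(current);
      current = "";
    } else {
      current += ch;
    }
  }
  if (current) words.push(current);
  return words;
}

const sample = "Text (what is)|what's a story|fable called|named|about {Search}|{Title}";
console.log(splitOutsideParens(sample));
// ["Text", "(what is)|what's", "a", "story|fable", "called|named|about", "{Search}|{Title}"]
```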

How to Split string with multiple rules in javascript

I have this string for example:
str = "my name is john#doe oh.yeh";
the end result I am seeking is this Array:
strArr = ['my','name','is','john','&#doe','oh','&yeh'];
which means 2 rules apply:
split after each space " " (I know how)
if there are special characters ("." or "#") then also split but add the character "&" before the word with the special character.
I know I can strArr = str.split(" ") for the first rule. but how do I do the other trick?
thanks,
Alon
Assuming the result should be '&doe' and not '&#doe', a simple solution would be to replace every . and # with ' &' (a space followed by &) and then split on whitespace:
strArr = str.replace(/[.#]/g, ' &').split(/\s+/)
/\s+/ matches consecutive white spaces instead of just one.
If the result should be '&#doe' and '&.yeh', use the same regex and add a capture:
strArr = str.replace(/([.#])/g, ' &$1').split(/\s+/)
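Running both variants on the string from the question shows the difference. The variable names here are illustrative only.

```javascript
const source = "my name is john#doe oh.yeh";

// Variant 1: the special character is dropped and replaced by " &".
const dropped = source.replace(/[.#]/g, " &").split(/\s+/);
console.log(dropped); // ["my", "name", "is", "john", "&doe", "oh", "&yeh"]

// Variant 2: the capture group keeps the special character after the "&".
const kept = source.replace(/([.#])/g, " &$1").split(/\s+/);
console.log(kept); // ["my", "name", "is", "john", "&#doe", "oh", "&.yeh"]
```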
You have to use a regular expression to match all special characters at once. By "special", I assume that you mean "non-letters".
var pattern = /([^ a-z]?)[a-z]+/gi; // Pattern
var str = "my name is john#doe oh.yeh"; // Input string
var strArr = [], match; // output array, temporary var
while ((match = pattern.exec(str)) !== null) { // <-- For each match
strArr.push( (match[1]?'&':'') + match[0]); // <-- Add to array
}
// strArr is now:
// strArr = ['my', 'name', 'is', 'john', '&#doe', 'oh', '&.yeh']
It does not match consecutive special characters; the pattern has to be modified for that. E.g., if you want to include all consecutive characters, use ([^ a-z]+?).
Also, it does not include a trailing special character. If you want to include this one as well, use [a-z]* and remove !== null.
Use the split() method. That's what you need:
http://www.w3schools.com/jsref/jsref_split.asp
OK, I see you already found it. I think you should:
1) first split on the whitespace
2) iterate through your array and split the members again wherever you find # or .
3) iterate through your array again and apply str.replace("#", "&#") and str.replace(".", "&.") wherever you find them
I would think a combination of split() and replace() is what you are looking for:
str = "my name is john#doe oh.yeh";
strArr = str.replace(/[.#]/g, ' &'); // regex literal needed: the string '\W' would only match a literal "W"
strArr = strArr.split(' ');
That should be close to what you asked for.
This works:
array = string.replace(/#|\./g, ' &$&').split(' ');
Take a look at demo here: http://jsfiddle.net/M6fQ7/1/

Remove leading comma from a string

I have the following string:
",'first string','more','even more'"
I want to transform this into an Array but obviously this is not valid due to the first comma. How can I remove the first comma from my string and make it a valid Array?
I’d like to end up with something like this:
myArray = ['first string','more','even more']
To remove the first character you would use:
var myOriginalString = ",'first string','more','even more'";
var myString = myOriginalString.substring(1);
I'm not sure this will be the result you're looking for though because you will still need to split it to create an array with it. Maybe something like:
var myString = myOriginalString.substring(1);
var myArray = myString.split(',');
Keep in mind, the ' character will be a part of each string in the split here.
In this specific case (there is always a single character at the start you want to remove) you'll want:
str.substring(1)
However, if you want to be able to detect if the comma is there and remove it if it is, then something like:
if (str[0] == ',') {
str = str.substring(1);
}
One-liner
str = str.replace(/^,/, '');
I'll be back.
var s = ",'first string','more','even more'";
var array = s.split(',').slice(1);
That's assuming the string you begin with is in fact a String, like you said, and not an Array of strings.
Assuming the string is called myStr:
// Strip start and end quotation mark and possible initial comma
myStr=myStr.replace(/^,?'/,'').replace(/'$/,'');
// Split stripping quotations
myArray=myStr.split("','");
Note that if a string can be missing in the list without even having its quotation marks present and you want an empty spot in the corresponding location in the array, you'll need to write the splitting manually for a robust solution.
var s = ",'first string','more','even more'";
s.split(/'?,'?/).filter(function(v) { return v; });
Results in:
["first string", "more", "even more'"]
First split with commas possibly surrounded by single quotes,
then filter the non-truthy (empty) parts out.
To turn a string into an array I usually use split()
> var s = ",'first string','more','even more'"
> s.split("','")
[",'first string", "more", "even more'"]
This is almost what you want. Now you just have to strip the first two and the last character:
> s.slice(2, s.length-1)
"first string','more','even more"
> s.slice(2, s.length-1).split("','");
["first string", "more", "even more"]
To extract a substring from a string I usually use slice() but substr() and substring() also do the job.
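Instead of splitting and then trimming the quotes, the quoted items can also be matched directly. This is a sketch that assumes no item contains a single quote, and that String.prototype.matchAll (ES2020) is available.

```javascript
const quoted = ",'first string','more','even more'";

// Each match is one '...' item; capture group 1 holds the text between the quotes.
const items = Array.from(quoted.matchAll(/'([^']*)'/g), m => m[1]);

console.log(items); // ["first string", "more", "even more"]
```

Because only the quoted parts are matched, the leading comma never needs to be removed explicitly.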
s=s.substring(1);
I like to keep stuff simple.
You can use JavaScript's replace function directly with a regex, or define helper functions like PHP's ltrim (left) and rtrim (right):
1) With replace:
var myArray = ",'first string','more','even more'".replace(/^\s+/, '').split(/'?,?'/).filter(Boolean); // filter drops the empty strings left at both ends
2) Help functions:
if (!String.prototype.ltrim) String.prototype.ltrim = function() {
return this.replace(/^\s+/, '');
};
if (!String.prototype.rtrim) String.prototype.rtrim = function() {
return this.replace(/\s+$/, '');
};
var myArray = ",'first string','more','even more'".ltrim().split(/'?,?'/).filter(function(el) {return el.length != 0});
You can also extend the helper functions, for example by adding a parameter for the character you want to strip.
This will remove a leading or trailing comma, along with any adjacent whitespace:
var str = ",'first string','more','even more'";
var trim = str.replace(/(^\s*,)|(,\s*$)/g, '');
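Remove leading or trailing characters:
function trimLeadingTrailing(inputStr, toRemove) {
// build a regex that matches toRemove at the start (^)
// or at the end ($) of inputStr; the g flag lets both ends be replaced
// note: toRemove is used as a regex pattern, so escape any metacharacters in it
const re = new RegExp(`^${toRemove}|${toRemove}$`, 'g');
return inputStr.replace(re, '');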
}
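If the string to remove may contain regex metacharacters (for example "." or "*"), it needs escaping before it goes into the RegExp constructor. A possible sketch follows; escapeRegExp and trimBoth are hypothetical helper names, and this version also strips repeated occurrences at either end.

```javascript
// Hypothetical helper: escape characters that have special meaning in a regex.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: remove every leading and trailing occurrence of toRemove.
function trimBoth(inputStr, toRemove) {
  const escaped = escapeRegExp(toRemove);
  // ^(...)+ strips repeats at the start, (...)+$ strips repeats at the end.
  const re = new RegExp(`^(?:${escaped})+|(?:${escaped})+$`, "g");
  return inputStr.replace(re, "");
}

console.log(trimBoth(",'first string','more',", ",")); // "'first string','more'"
console.log(trimBoth("..x..", ".")); // "x"
```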
