How do I remove duplicates of two different character pairs, in JavaScript? - javascript

I have an input field where the user is only allowed to use letters, spaces and commas.
I've created this here so far:
splitStr = splitStr.split(' ').join(', ');
splitStr = splitStr.split(',').join(', ');
splitStr = splitStr.split(';').join(', ');
splitStr = splitStr.split('-').join(', ');
splitStr = splitStr.split('_').join(', ');
splitStr = splitStr.split('/').join(', ');
splitStr = splitStr.split('#').join(', ');
$("imgTags").value = splitStr;
// removes duplicate spaces
splitStr = splitStr.replace(/ +(?= )/g,'');
// removes duplicate commas
splitStr = splitStr.replace(/,+/g,',');
// missing: remove ', ' duplicates
So the code above always converts the user's input to comma-space separators, and at the bottom I remove artifacts such as duplicate commas or duplicate spaces.
In the first line you can see that I'm also replacing any space with a comma and a space. This gives me an artifact like , , , , , , , , , so I also need to replace any repeated comma-space sequence with a single comma-space. I've tried to do this but I never get the desired result.
How can I write a regex that removes the comma-space duplicates?
This: , , , , , , needs to become this: , (i.e. comma space comma space comma space needs to become a single comma space).

very easy, something like this
str.replace(/(, )+/g, ", ")
or even in the very beginning
str.replace(/[-_;, ]+/g, ", ")

Split string by non-letters and join with commas:
str.split(/[^A-Za-z]/).join(',')
replace duplicate commas:
str.replace(/,+/g,',');
replace comma with comma space
str.replace(/,/g,', ');
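The three steps above can be folded into a single pass. A minimal sketch, assuming that only letters count as tag characters; the function name normalizeTags is made up for illustration and is not part of the original code:

```javascript
// Hypothetical helper: split on runs of non-letters, drop empty pieces,
// then join what is left with a comma and a space.
function normalizeTags(input) {
  return input
    .split(/[^A-Za-z]+/)  // any run of non-letters counts as one separator
    .filter(Boolean)      // drop empty strings caused by leading/trailing separators
    .join(", ");
}

console.log(normalizeTags("/#,test,,,test; test-test/test,#/"));
// test, test, test, test, test
```

Because the separator pattern is repeated with +, no duplicate commas or spaces are produced in the first place, so the later clean-up replacements are not needed.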

You could do all those replacements in one go using a repeated character class and replace the matches with a comma and a space.
Because the character class is repeated, it will match consecutive matches and use only a single replacement.
[ ,;_\/#-]+
const regex = /[ ,;_\/#-]+/g;
const str = `test,,,,,test; test-test/test`;
const subst = `, `;
const result = str.replace(regex, subst);
console.log(result);
If you don't want to replace when the characters are at the start or end of the line, you might use a callback function for the replacement:
let pattern = /(?:^[ ,;_\/#-]+|[ ,;_\/#-]+$)|([ ,;_\/#-]+)/g;
let str = "/#,test,,,test; test-test/test,#/";
str = str.replace(pattern, function(m, g1) {
if (g1 !== undefined) {
return ', ';
}
return m;
});
console.log(str);

Related

Replace Regex Symbols + Blank Space

Is there any way to write a regex that replaces symbols and blank spaces?
I'm using:
const cleanMask = (value) => {
const output = value.replace(/[_()-]/g, "").trim();
return output;
}
let result = cleanMask('this (contains parens) and_underscore, and-dash')
console.log(result)
Is it right?
Your current code will replace all occurrences of characters _, (, ) and - with an empty string and then trim() whitespace from the beginning and end of the result.
If you want to remove ALL whitespace, you can use the whitespace character class \s instead of trim() like this:
const output = value.replace(/[_()\s-]/g, ""); // hyphen placed last so it is a literal, not a range
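To make the difference between the two variants concrete, here is a small sketch that runs both on the sample string from the question. The variable names are chosen for illustration only:

```javascript
const sample = "this (contains parens) and_underscore, and-dash";

// Removes only _ ( ) - and trims the ends; spaces inside the string survive.
const trimmed = sample.replace(/[_()-]/g, "").trim();

// Removes _ ( ) - and every whitespace character. The hyphen is placed last
// in the class so it is read as a literal rather than as a range operator.
const noWhitespace = sample.replace(/[_()\s-]/g, "");

console.log(trimmed);      // this contains parens andunderscore, anddash
console.log(noWhitespace); // thiscontainsparensandunderscore,anddash
```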

JS : Remove all strings which are starting with specific character

I have a list of names. Some of them start with a dot (.), and some have a dot in the middle or elsewhere. I need to remove only the names that start with a dot. I'm looking for a better way to do this in JavaScript.
var myarr = 'ad, ghost, hg, .hi, jk, find.jpg, dam.ark, haji, jive.pdf, .find, home, .war, .milk, raj, .ker';
var refinedArr = ??
You can use the filter function and you can access the first letter of every word using item[0]. You do need to split the string first.
var myarr = 'ad, ghost, hg, .hi, jk, find.jpg, dam.ark, haji, jive.pdf, .find, home, .war, .milk, raj, .ker'.split(", ");
var refinedArr = myarr.filter(function(item) {
return item[0] != "."
});
console.log(refinedArr)
Use filter and startsWith:
let myarr = ['ad', 'ghost', 'hg', '.hi', 'jk'];
let res = myarr.filter(e => ! e.startsWith('.'));
console.log(res);
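You can use the regex \B\.\w+,? ? and replace each match with an empty string.
\B matches a position that is not a word boundary (here: the dot is not directly preceded by a word character)
\. matches a literal dot
\w+ matches one or more word characters
,? matches 0 or 1 comma
[space]? matches 0 or 1 space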
Demo:
const regex = /\B\.\w+,? ?/g;
const str = `ad, ghost, hg, .hi, jk, find.jpg, dam.ark, haji, jive.pdf, .find, home, .war, .milk, raj, .ker`;
const subst = ``;
// The substituted value will be contained in the result variable
const result = str.replace(regex, subst);
console.log('Substitution result: ', result);
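If the result should be a comma-separated string again rather than an array, the filter approach can be combined with split and join. A minimal sketch, assuming the names are always separated by a comma followed by one space; the variable names are illustrative:

```javascript
const names = "ad, ghost, hg, .hi, jk, find.jpg, dam.ark, haji, jive.pdf, .find, home, .war, .milk, raj, .ker";

const refined = names
  .split(", ")                            // string -> array of names
  .filter(name => !name.startsWith("."))  // keep names that do not start with a dot
  .join(", ");                            // array -> string, no trailing separator

console.log(refined);
// ad, ghost, hg, jk, find.jpg, dam.ark, haji, jive.pdf, home, raj
```

Joining the filtered array avoids a dangling separator when the last name in the list is one of the removed ones.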

Split string by all spaces except those in parentheses

I'm trying to split text like the following on spaces:
var line = "Text (what is)|what's a story|fable called|named|about {Search}|{Title}"
but I want it to ignore the spaces within parentheses. This should produce an array with:
var words = ["Text", "(what is)|what's", "a", "story|fable", "called|named|about", "{Search}|{Title}"];
I know this should involve some sort of regex with line.match(). Bonus points if the regex removes the parentheses. I know that word.replace() would get rid of them in a subsequent step.
Use the following approach with specific regex pattern(based on negative lookahead assertion):
var line = "Text (what is)|what's a story|fable called|named|about {Search}|{Title}",
words = line.split(/(?!\(.*)\s(?![^(]*?\))/g);
console.log(words);
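(?!\(.*) is a negative lookahead evaluated at the separator itself; because the next character there is the whitespace and never (, it does not reject any match on its own
(?![^(]*?\)) ensures that the separator \s is not followed by a closing ) before the next (, i.e. that it is not inside parentheses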
Not a single regexp but does the job. Removes the parentheses and splits the text by spaces.
var words = line.replace(/[\(\)]/g,'').split(" ");
One approach which is useful in some cases is to replace spaces inside parens with a placeholder, then split, then unreplace:
var line = "Text (what is)|what's a story|fable called|named|about {Search}|{Title}";
var result = line.replace(/\((.*?)\)/g, m => m.replace(/ /g, 'SPACE'))
.split(' ')
.map(x => x.replace(/SPACE/g, ' '));
console.log(result);
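Since the question mentions line.match() and asks for the parentheses to be removed as a bonus, here is a hedged sketch of that route. It assumes parentheses are balanced and not nested; the pattern and variable names are my own, not taken from the answers above:

```javascript
const line = "Text (what is)|what's a story|fable called|named|about {Search}|{Title}";

// A token is a run made of parenthesised groups and of characters
// that are neither whitespace nor an opening parenthesis.
const tokens = line.match(/(?:\([^)]*\)|[^\s(])+/g) || [];

// Bonus step: strip the parentheses from each token.
const words = tokens.map(token => token.replace(/[()]/g, ""));

console.log(tokens);
// ["Text", "(what is)|what's", "a", "story|fable", "called|named|about", "{Search}|{Title}"]
console.log(words);
// ["Text", "what is|what's", "a", "story|fable", "called|named|about", "{Search}|{Title}"]
```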

Separating words with Regex

I am trying to get this result: 'Summer-is-here'. Why does the code below generate extra spaces? (Current result: '-Summer--Is- -Here-').
function spinalCase(str) {
var newA = str.split(/([A-Z][a-z]*)/).join("-");
return newA;
}
spinalCase("SummerIs Here");
You are using a variety of split where the regexp contains a capturing group (inside parentheses), which has a specific meaning, namely to include all the splitting strings in the result. So your result becomes:
["", "Summer", "", "Is", " ", "Here", ""]
Joining that with - gives you the result you see. But you can't just remove the unnecessary capture group from the regexp, because then the split would give you
["", "", " ", ""]
because the words themselves are then the separators and are dropped from the result, leaving only the text between them. So this doesn't really work.
If you want to use split, try splitting on zero-width or space-only matches looking ahead to an uppercase letter:
> "SummerIs Here".split(/\s*(?=[A-Z])/)
                            ^^^^^^^^^ LOOK-AHEAD
< ["Summer", "Is", "Here"]
Now you can join that to get the result you want, but without the lowercase mapping, which you could do with:
"SummerIs Here" .
split(/\s*(?=[A-Z])/) .
map(function(elt, i) { return i ? elt.toLowerCase() : elt; }) .
join('-')
which gives you what you want.
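The split variants discussed in this answer can be compared side by side. This is only a small sketch with illustrative variable names:

```javascript
const input = "SummerIs Here";

// Capturing group: the matched words are kept, interleaved with the gaps between them.
const withGroup = input.split(/([A-Z][a-z]*)/);

// No group: the words act as separators and are dropped, leaving only the gaps.
const withoutGroup = input.split(/[A-Z][a-z]*/);

// Look-ahead: split at optional whitespace that sits right before an uppercase letter.
const withLookahead = input.split(/\s*(?=[A-Z])/);

console.log(withGroup);     // ["", "Summer", "", "Is", " ", "Here", ""]
console.log(withoutGroup);  // ["", "", " ", ""]
console.log(withLookahead); // ["Summer", "Is", "Here"]
```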
Using replace as suggested in another answer is also a perfectly viable solution. In terms of best practices, consider the following code from Ember:
var DECAMELIZE_REGEXP = /([a-z\d])([A-Z])/g;
var DASHERIZE_REGEXP = /[ _]/g;
function decamelize(str) {
return str.replace(DECAMELIZE_REGEXP, '$1_$2').toLowerCase();
}
function dasherize(str) {
return decamelize(str).replace(DASHERIZE_REGEXP, '-');
}
First, decamelize puts an underscore _ in between two-character sequences of lower-case letter (or digit) and upper-case letter. Then, dasherize replaces the underscore with a dash. This works perfectly except that it lower-cases the first word in the string. You can sort of combine decamelize and dasherize here with
var SPINALIZE_REGEXP = /([a-z\d])\s*([A-Z])/g;
function spinalCase(str) {
return str.replace(SPINALIZE_REGEXP, '$1-$2').toLowerCase();
}
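The replace-based version lower-cases the whole string, while the question asks for 'Summer-is-here' with a capital first letter. One possible adjustment, sketched here under the assumption that only the very first character should be upper-cased (the function name spinalCaseKeepFirst is hypothetical):

```javascript
// Hypothetical variant: same replacement as above, then re-capitalise the first character.
function spinalCaseKeepFirst(str) {
  const spinal = str
    .replace(/([a-z\d])\s*([A-Z])/g, "$1-$2") // insert a dash at each lower/upper boundary
    .toLowerCase();
  return spinal.charAt(0).toUpperCase() + spinal.slice(1);
}

console.log(spinalCaseKeepFirst("SummerIs Here")); // Summer-is-here
```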
You want to separate the capitalized words, but you are splitting the string on the capitalized words themselves. That's why you get those empty strings and spaces.
I think you are looking for this:
var newA = str.match(/[A-Z][a-z]*/g).join("-");
([A-Z][a-z]*) *(?!$|[a-z])
You can simply replace each match with $1-. See the demo:
https://regex101.com/r/nL7aZ2/1
var re = /([A-Z][a-z]*) *(?!$|[a-z])/g;
var str = 'SummerIs Here';
var subst = '$1-';
var result = str.replace(re, subst);
var newA = str.split(/ |(?=[A-Z])/).join("-");
You can change the regex like:
/ |(?=[A-Z])/ or /\s*(?=[A-Z])/
Result:
Summer-Is-Here

How to Split string with multiple rules in javascript

I have this string for example:
str = "my name is john#doe oh.yeh";
the end result I am seeking is this Array:
strArr = ['my','name','is','john','&#doe','oh','&yeh'];
which means 2 rules apply:
split after each space " " (I know how)
if there are special characters ("." or "#"), then also split, but add the character "&" before the word with the special character.
I know I can strArr = str.split(" ") for the first rule. but how do I do the other trick?
thanks,
Alon
Assuming the result should be '&doe' and not '&#doe', a simple solution would be to replace every . and # with ' &' and then split on whitespace:
strArr = str.replace(/[.#]/g, ' &').split(/\s+/)
/\s+/ matches a run of consecutive whitespace characters instead of just one space.
If the result should be '&#doe' and '&.yeh', use the same regex and add a capture:
strArr = str.replace(/([.#])/g, ' &$1').split(/\s+/)
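Both variants can be run against the string from the question to see the difference. A short sketch; the variable names are illustrative:

```javascript
const str = "my name is john#doe oh.yeh";

// Variant 1: the special character is replaced by " &" and therefore dropped.
const withoutSymbol = str.replace(/[.#]/g, " &").split(/\s+/);

// Variant 2: the captured special character is re-inserted after the "&" via $1.
const withSymbol = str.replace(/([.#])/g, " &$1").split(/\s+/);

console.log(withoutSymbol); // ["my", "name", "is", "john", "&doe", "oh", "&yeh"]
console.log(withSymbol);    // ["my", "name", "is", "john", "&#doe", "oh", "&.yeh"]
```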
You have to use a regular expression to match all special characters at once. By "special", I assume that you mean "not a letter".
var pattern = /([^ a-z]?)[a-z]+/gi; // Pattern
var str = "my name is john#doe oh.yeh"; // Input string
var strArr = [], match; // output array, temporary var
while ((match = pattern.exec(str)) !== null) { // <-- For each match
strArr.push( (match[1]?'&':'') + match[0]); // <-- Add to array
}
// strArr is now:
// strArr = ['my', 'name', 'is', 'john', '&#doe', 'oh', '&.yeh']
It does not match consecutive special characters. The pattern has to be modified for that. Eg, if you want to include all consecutive characters, use ([^ a-z]+?).
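Also, it does not include a trailing special character. If you want to include this one as well, use [a-z]* instead of [a-z]+. Note that the pattern can then match an empty string, so the loop has to advance pattern.lastIndex itself after an empty match, otherwise it never terminates.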
Use the split() method. That's what you need:
http://www.w3schools.com/jsref/jsref_split.asp
OK, I see you have already found it. In that case, I think:
1) first split on the whitespace
2) iterate through your array and split each member again wherever you find # or .
3) iterate through your array again and apply str.replace("#", "&#") and str.replace(".", "&.") wherever you find them
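The three steps above could be sketched roughly as follows. This is one possible reading of the steps, not the answerer's own code; the function name splitWithMarkers is made up, and flatMap requires an ES2019-capable runtime:

```javascript
// Hypothetical helper implementing the three steps.
function splitWithMarkers(str) {
  return str
    .split(/\s+/)                              // 1) split on whitespace
    .filter(Boolean)                           //    ignore empty pieces from leading/trailing spaces
    .flatMap(word => word.split(/(?=[#.])/))   // 2) split again right before each # or .
    .map(part => (/^[#.]/.test(part) ? "&" + part : part)); // 3) prefix marked parts with &
}

console.log(splitWithMarkers("my name is john#doe oh.yeh"));
// ["my", "name", "is", "john", "&#doe", "oh", "&.yeh"]
```

Splitting on a look-ahead keeps the # or . attached to the word that follows it, so step 3 only has to prepend the &.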
I would think a combination of split() and replace() is what you are looking for:
str = "my name is john#doe oh.yeh";
strArr = str.replace(/[#.]/g, ' &'); // needs a global regex; the string '\W' would only match a literal "W"
strArr = strArr.split(' ');
That should be close to what you asked for.
This works:
array = string.replace(/#|\./g, ' &$&').split(' ');
Take a look at demo here: http://jsfiddle.net/M6fQ7/1/
