Convert pair of regex replacements with single replacement - javascript

I want to split a string on the first capital letter in a group.
For example, FooBARBaz should become Foo BAR Baz.
I've come up with:
str.replace(/[A-Z][a-z]+/g, ' $&')
.replace(/[A-Z]+/g, ' $&')
.replace(/\s+/g, ' ')
.trim();
Can anyone suggest a cleaner solution?

A simple regex like /([A-Z][a-z]+)/g already does it. The trick is the capture group: when split is called with a regex that contains a capture group, the captured matches are kept in the result array. So a single pattern is enough (one uppercase letter followed by at least one lowercase letter), and the remaining pieces (the all-uppercase groups) come for free as the text between the matches.
Because the resulting array also contains empty strings, an extra filter step is needed.
console.log(
'just split ...\n',
'"FooBARBazBizBuzBOOOOZBozzzBarFOOBARBazbizBUZBoooozBOZZZBar" =>',
'FooBARBazBizBuzBOOOOZBozzzBarFOOBARBazbizBUZBoooozBOZZZBar'
.split(/([A-Z][a-z]+)/g)
);
console.log(
'split and filter ...\n',
'"FooBARBazBizBuzBOOOOZBozzzBarFOOBARBazbizBUZBoooozBOZZZBar" =>',
'FooBARBazBizBuzBOOOOZBozzzBarFOOBARBazbizBUZBoooozBOZZZBar'
.split(/([A-Z][a-z]+)/g)
.filter(val => val !== '')
);
console.log(
'split, filter and join ...\n',
'"FooBARBazBizBuzBOOOOZBozzzBarFOOBARBazbizBUZBoooozBOZZZBar" =>\n',
'FooBARBazBizBuzBOOOOZBozzzBarFOOBARBazbizBUZBoooozBOZZZBar'
.split(/([A-Z][a-z]+)/g)
.filter(val => val !== '')
.join(' ')
);

You can achieve this by passing a replacer function to String.prototype.replace().
Live demo:
const str = 'FooBARBaz';
function replacer() {
return ' ' + arguments[0] + ' ';
}
console.log(str.replace(/[A-Z][a-z]+/g, replacer).trim());
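One caveat with the replacer above: it pads every match on both sides, so two adjacent capitalized words (FooBar, say) end up separated by two spaces. Here is a minimal sketch of a variant that pads only on the left and also matches the all-caps runs directly. The helper name spaceOutWords and the extended pattern are my own assumptions, not part of the answer.

```javascript
// Hypothetical helper: put one space in front of every capitalized word
// ("Foo") or all-caps run ("BAR"), then trim the leading space.
function spaceOutWords(str) {
  return str
    // [A-Z][a-z]+        -> a capitalized word such as "Foo"
    // [A-Z]+(?![a-z])    -> a run of capitals that does not swallow the
    //                       first letter of the next capitalized word
    .replace(/[A-Z][a-z]+|[A-Z]+(?![a-z])/g, (match) => ' ' + match)
    .trim();
}

console.log(spaceOutWords('FooBARBaz')); // "Foo BAR Baz"
console.log(spaceOutWords('FooBarBaz')); // "Foo Bar Baz"
```

Since nothing is padded on the right, no whitespace clean-up pass is needed afterwards.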

You can try to match upper, followed by upper or upper + not upper:
const m = 'FooBARBazAB'.match(/([A-Z](?=[A-Z]|$))+|[A-Z][^A-Z]*/g);
console.log(m);
Not sure about how this is "cleaner" though.
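To get the space-separated string the question asks for, the matched pieces still have to be joined. A small sketch, reusing the regex from this answer; the names WORDS and splitOnCapitals are hypothetical:

```javascript
// Regex copied from the answer above.
const WORDS = /([A-Z](?=[A-Z]|$))+|[A-Z][^A-Z]*/g;

// Hypothetical wrapper: collect the matches and join them with spaces.
function splitOnCapitals(str) {
  // match() returns null when nothing matches, so fall back to an empty array.
  return (str.match(WORDS) || []).join(' ');
}

console.log(splitOnCapitals('FooBARBaz'));   // "Foo BAR Baz"
console.log(splitOnCapitals('FooBARBazAB')); // "Foo BAR Baz AB"
```

Be aware that both alternatives start with a capital letter, so any lowercase text before the first capital (the foo in fooBar) is not matched and gets dropped.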

Related

convert combination of replace methods onto one

I have a combination of replace methods. How can I convert them to one:
.replace(/\s+/g, " ").replace(/\,|\?|\!|\:|\./g,'').replace("'", "_")
Is there any solution?
It's possible with a single regex that alternates between the different possibilities and captures each one in its own group, plus a replacer function that checks which group matched, but it's really ugly. Your current solution is much easier to read.
const string = ' here is multiple spaces consolidated, punctuation removed!! and apostrophes don\'t exist! ';
const result = string
.replace(
/(\s+)|(\,|\?|\!|\:|\.)|(')/g,
(match, g1, g2, g3) => (
g1 ? ' ' :
g2 ? '' :
'_'
)
);
console.log(result);
I'd use your original version with a slight tweak: use a character set in the second replace instead, it'll be easier to read.
const string = ' here is multiple spaces consolidated, punctuation removed!! and apostrophes don\'t exist! ';
const result = string
.replace(/\s+/g, " ")
.replace(/[,?!:.]/g,'')
.replace("'", "_");
console.log(result);
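A side note on the last step: when the first argument to replace is a string, only the first occurrence is replaced, so .replace("'", "_") leaves any later apostrophes untouched. If every apostrophe should become an underscore, a regex with the g flag does that. A minimal sketch; the function name clean and the sample input are made up for illustration:

```javascript
// Hypothetical helper wrapping the three-step chain from the answer,
// with the last step switched to a global regex.
function clean(input) {
  return input
    .replace(/\s+/g, ' ')      // collapse runs of whitespace
    .replace(/[,?!:.]/g, '')   // drop punctuation
    .replace(/'/g, '_');       // replace every apostrophe, not just the first
}

const sample = "  don't  stop, can't  stop!  ";
console.log(clean(sample));             // " don_t stop can_t stop "
console.log(sample.replace("'", "_"));  // only the first apostrophe changes
```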

Masking phone number with regex in javascript

My application uses a specific phone number format that looks like 999.111.222, and I have a regex pattern to mask it on the front end:
/[0-9]{3}\.[0-9]{3}\.([0-9]{3})/
But recently, the format was changed to allow the middle three digits to have one less digit, so now both 999.11.222 and 999.111.222 match. How can I change my regex accordingly?
"999.111.222".replace(/[0-9]{3}\.[0-9]{3}\.([0-9]{3})/, '<div>xxx.xxx.$1</div>')
expected output:
"999.111.222" // xxx.xxx.222
"999.11.222" // xxx.xx.222
Replace {3} with {2,3} to match two or three digits.
/[0-9]{3}\.[0-9]{2,3}\.([0-9]{3})/
For reference see e.g. MDN
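A quick check of what the widened pattern does with the original replacement string. It now matches both formats, but the fixed replacement text always emits three x's in the middle, so it does not yet produce the expected xxx.xx.222. The helper name mask is hypothetical.

```javascript
// Pattern from this answer, replacement string from the question.
const pattern = /[0-9]{3}\.[0-9]{2,3}\.([0-9]{3})/;
const mask = (phone) => phone.replace(pattern, '<div>xxx.xxx.$1</div>');

console.log(mask('999.111.222')); // "<div>xxx.xxx.222</div>"
console.log(mask('999.11.222'));  // "<div>xxx.xxx.222</div>" (middle part still has three x's)
```

Getting the mask length to follow the input needs a replacement that depends on the matched digits, for example a callback.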
Use
console.log(
"999.11.222".replace(/[0-9]{3}\.([0-9]{2,3})\.([0-9]{3})/, function ($0, $1, $2)
{ return '<div>xxx.' + $1.replace(/\d/g, 'x') + '.' + $2 + '</div>'; })
)
The first capturing group, ([0-9]{2,3}), will match 2 or 3 digits, and in the callback passed as the replacement argument, all the digits from the first group are replaced with x.
You may further customize the pattern for the first set of digits, too.
In fact, you should change not only your regex but also your callback replace function:
const regex = /[0-9]{3}\.([0-9]{2,3})\.([0-9]{3})/;
const cbFn = (all, g1, g2) =>`<div>xxx.xx${(g1.length === 3 ? 'x' : '')}.${g2}</div>`;
const a = "999.11.222".replace(regex, cbFn);
const b = "999.111.222".replace(regex, cbFn);
console.log(a, b);
To change the regex, add a {2,3} quantifier to the middle term, as already suggested, and wrap that term in a new capture group. Then, in the replace callback, you can check the group's length to decide whether to add a third x.
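If the lengths might change again, the callback can derive the mask from the captured digits instead of hard-coding it. A sketch under that assumption; maskPhone and PHONE are hypothetical names, and capturing the first group as well is my addition:

```javascript
// Capture all three parts so the mask can mirror their lengths.
const PHONE = /([0-9]{3})\.([0-9]{2,3})\.([0-9]{3})/;

function maskPhone(phone) {
  return phone.replace(PHONE, (all, g1, g2, g3) =>
    // One "x" per masked digit; the last group is shown as-is.
    `<div>${'x'.repeat(g1.length)}.${'x'.repeat(g2.length)}.${g3}</div>`);
}

console.log(maskPhone('999.11.222'));  // "<div>xxx.xx.222</div>"
console.log(maskPhone('999.111.222')); // "<div>xxx.xxx.222</div>"
```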

Convert different strings to snake_case in Javascript

I know that we have a question similar to this but not quite the same.
I'm trying to get my function to work. It takes a string as an argument and converts it to snake_case. It works most of the time, even with all the fancy !?<>= characters, but there is one case it can't convert, and that's camelCase.
It fails when I'm passing strings like snakeCase. It returns snakecase instead of snake_case.
I tried to implement it but I ended up just messing it up even more..
Can I have some help please?
my code:
const snakeCase = string => {
string = string.replace(/\W+/g, " ").toLowerCase().split(' ').join('_');
if (string.charAt(string.length - 1) === '_') {
return string.substring(0, string.length - 1);
}
return string;
}
You need to be able to detect the points at which an upper-case letter is in the string following another letter (that is, not following a space). You can do this with a regular expression, before you call toLowerCase on the input string:
\B(?=[A-Z])
In other words, a non-word boundary, followed by an upper case character. Split on either the above, or on a literal space, then .map the resulting array to lower case, and then you can join by underscores:
const snakeCase = string => {
return string.replace(/\W+/g, " ")
.split(/ |\B(?=[A-Z])/)
.map(word => word.toLowerCase())
.join('_');
};
console.log(snakeCase('snakeCase'));
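One thing to keep in mind with this approach: every capital letter that follows another word character is a split point, so acronyms get broken up letter by letter. A short demonstration using the same logic as above (renamed snakeCaseBasic here, a hypothetical name, only to avoid redeclaring snakeCase):

```javascript
// Same steps as the answer's snakeCase, under a different name.
const snakeCaseBasic = (string) =>
  string.replace(/\W+/g, ' ')
    .split(/ |\B(?=[A-Z])/)
    .map((word) => word.toLowerCase())
    .join('_');

console.log(snakeCaseBasic('snakeCase'));   // "snake_case"
console.log(snakeCaseBasic('Hello World')); // "hello_world"
console.log(snakeCaseBasic('CustomerID'));  // "customer_i_d"
```

Whether customer_i_d or customer_id is the desired result depends on your naming convention; if runs of capitals should stay together, they need extra handling before the split.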
Let's try that again Stan... this should do snake_case while realising that CamelCASECapitals = camel_case_capitals. It's basically the accepted answer with a pre-filter.
let splitCaps = string => string
.replace(/([a-z])([A-Z]+)/g, (m, s1, s2) => s1 + ' ' + s2)
.replace(/([A-Z])([A-Z]+)([^a-zA-Z0-9]*)$/, (m, s1, s2, s3) => s1 + s2.toLowerCase() + s3)
.replace(/([A-Z]+)([A-Z][a-z])/g,
(m, s1, s2) => s1.toLowerCase() + ' ' + s2);
let snakeCase = string =>
splitCaps(string)
.replace(/\W+/g, " ")
.split(/ |\B(?=[A-Z])/)
.map(word => word.toLowerCase())
.join('_');
> a = ['CamelCASERules', 'IndexID', 'CamelCASE', 'aID',
'theIDForUSGovAndDOD', 'TheID_', '_IDOne']
> _.map(a, snakeCase)
['camel_case_rules', 'index_id', 'camel_case', 'a_id', 'the_id_for_us_gov_and_dod',
'the_id_', '_id_one']
// And for the curious, here's the output from the pre-filter:
> _.map(a, splitCaps)
['Camel case Rules', 'Index Id', 'Camel Case', 'a Id', 'the id For us Gov And Dod',
'The Id_', '_id One']
Suppose the string is Hello World? and you want hello_world? back (that is, with the special character kept). In that case, use the code below:
const snakeCase = (string) => {
return string.replace(/\d+/g, ' ')
.split(/ |\B(?=[A-Z])/)
.map((word) => word.toLowerCase())
.join('_');
};
Example
snakeCase('Hello World?')
// "hello_world?"
snakeCase('Hello & World')
// "hello_&_world"
EDIT: It turns out this answer isn’t fool proof. The fool, being me ;-) Please check out a better one by Orwellophile here: https://stackoverflow.com/a/69878219/5377276
---
I think this one should cover all the bases 😄
It was inspired by @h0r53's answer to the accepted answer. However it evolved into a more complete function, as it will convert any string, camelCase, kebab-case or otherwise into snake_case the way you'd expect it, containing only a-z and 0-9 characters, usable for function and variable names:
function convert_to_snake_case(string) {
return string.charAt(0).toLowerCase() + string.slice(1) // lowercase the first character
.replace(/\W+/g, " ") // Remove all excess white space and replace & , . etc.
.replace(/([a-z])([A-Z])([a-z])/g, "$1 $2$3") // Put a space at the position of a camelCase -> camel Case
.split(/\B(?=[A-Z]{2,})/) // Now split the multi-uppercases customerID -> customer,ID
.join(' ') // And join back with spaces.
.split(' ') // Split all the spaces again, this time we're fully converted
.join('_') // And finally snake_case things up
.toLowerCase() // With a nice lower case
}
Conversion examples:
'snakeCase' => 'snake_case'
'CustomerID' => 'customer_id'
'GPS' => 'gps'
'IP-address' => 'ip_address'
'Another & Another, one too' => 'another_another_one_too'
'random ----- Thing123' => 'random_thing123'
'kebab-case-example' => 'kebab_case_example'
There is method in lodash named snakeCase(). You can consider that as well.
https://lodash.com/docs/4.17.15#snakeCase
Orwellophile's answer does not work for uppercase words delimited by a space:
E.g: 'TEST CASE' => t_e_s_t_case
The following solution does not break consecutive upper case characters and is a little shorter:
const snakeCase = str =>
str &&
str
.match(/[A-Z]{2,}(?=[A-Z][a-z]+[0-9]*|\b)|[A-Z]?[a-z]+[0-9]*|[A-Z]|[0-9]+/g)
.map(x => x.toLowerCase())
.join('_');
However, trailing underscores after uppercase words (examples from Orwellophile as well), do not work properly.
E.g: 'TheID_' => the_i_d
Taken from https://www.w3resource.com/javascript-exercises/fundamental/javascript-fundamental-exercise-120.php.
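One practical wrinkle with the match-based version: String.prototype.match returns null when nothing matches (for example, a string made only of punctuation), and calling .map on null throws. A hedged sketch of a guarded variant, using the same regex as above. The names WORD and snakeCaseSafe are hypothetical, and returning an empty string for falsy input is my assumption rather than the original behaviour.

```javascript
// Regex copied from the answer above.
const WORD = /[A-Z]{2,}(?=[A-Z][a-z]+[0-9]*|\b)|[A-Z]?[a-z]+[0-9]*|[A-Z]|[0-9]+/g;

// Hypothetical guarded variant: never throws on "no matches".
const snakeCaseSafe = (str) =>
  ((str || '').match(WORD) || [])   // null -> []
    .map((x) => x.toLowerCase())
    .join('_');

console.log(snakeCaseSafe('TEST CASE'));  // "test_case"
console.log(snakeCaseSafe('CustomerID')); // "customer_id"
console.log(snakeCaseSafe('!!!'));        // ""
```

The TheID_ limitation mentioned above is unchanged by this guard.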

Don't understand why replace() method is not working as expected

Working on the following problem:
Create a function called alienLanguage where the input will be a str and the output should capitalize all letters except for the last letter of each word
alienLanguage("My name is John") should return "My NAMe Is JOHn"
This is what I have coded:
function alienLanguage(str){
var words = str.toUpperCase().split(' ').map(function (a) {
return a.replace(a[a.length - 1], a[a.length - 1].toLowerCase())
});
return words.join(' ');
}
All of the example test cases work except for the following:
Expected: '\'THIs Is An EXAMPLe\'', instead got: '\'THIs Is An eXAMPLE\''
Why is the e in eXAMPLE turning lowercase? Shouldn't everything automatically turn upperCase ?
edit: I just realized that there are 2 e's and the first one is being replaced. How can I replace the last e? and isn't the last character specified already?
"isn't the last character specified already?"
No, you've only specified what character to replace, not where. replace searches the string for the expression.
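To make that concrete, here is a tiny sketch on a single word (variable names are just for illustration). Passing the last character to replace tells it what to look for, and the search starts from the left:

```javascript
const word = 'EXAMPLE';
const last = word[word.length - 1]; // "E"

// replace() with a string pattern finds the FIRST "E", which is at index 0.
const firstMatch = word.replace(last, last.toLowerCase());
console.log(firstMatch); // "eXAMPLE"

// Building the string by position targets the last character itself.
const rebuilt = word.slice(0, -1) + last.toLowerCase();
console.log(rebuilt); // "EXAMPLe"
```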
"How can I replace the last e?"
Don't use replace at all. It does construct a new string anyway, and you can do that much easier:
function alienLanguage(str) {
return str.split(' ').map(function(a) {
return a.slice(0, -1).toUpperCase() + a.slice(-1).toLowerCase();
}).join(' ');
}
You also could use replace, but would use it with regular expression:
function alienLanguage(str) {
return str.toUpperCase().replace(/.\b/g, function(last) {
// ^^^^^^ matches all characters before a word boundary
return last.toLowerCase();
});
}
You don't need to split the string into words. Just use a positive lookahead assertion in your regex to upper-case all letters that are immediately followed by another letter, like this:
function alienLanguage(str){
return str.replace(/(\w)(?=\w)/g, l => l.toUpperCase())
}
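Note that the lookahead version only upper-cases; it never lower-cases the last letter of a word. That is fine for input like "My name is John", but an input that is already upper case would come back unchanged. A minimal sketch that lower-cases first, assuming that is acceptable for your inputs (the name alienLanguageAnyCase is hypothetical):

```javascript
// Hypothetical variant: normalise to lower case, then upper-case every
// word character that is followed by another word character.
function alienLanguageAnyCase(str) {
  return str.toLowerCase().replace(/\w(?=\w)/g, (c) => c.toUpperCase());
}

console.log(alienLanguageAnyCase('My name is John'));    // "My NAMe Is JOHn"
console.log(alienLanguageAnyCase('THIS IS AN EXAMPLE')); // "THIs Is An EXAMPLe"
```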
You can use substring. Your replace call replaces the first occurrence of the last character in each word, which is definitely not what you're after:
function alienLanguage(str){
var words = str.toUpperCase().split(' ').map(function (a) {
return a.length ? a.substring(0, a.length - 1) + a[a.length - 1].toLowerCase() : a;
});
return words.join(' ');
}
As @TatsuyukiIshi said, you need to swap the use of replace() for building the string directly. For extra help, here is an updated version of your alienLanguage function:
function alienLanguage(str){
return str.toUpperCase().split(' ').map(function(word) {
return word.slice(0, -1) + word.substr(-1).toLowerCase()
}).join(' ')
}
Extra notes:
word.slice(0,-1) returns the word minus the last character.
word.substr(-1) returns the last character as a string.
.map() returns an array, so you can call join() directly on the result.
replace() searches the string for a pattern (a string or a regular expression). With a string pattern it replaces only the first occurrence it finds, which is not necessarily the last character.
Instead, you want to truncate the word and build a new string. In other languages you could assign to the character directly, but strings in JS are immutable.
Try it.
function alienLanguage(str){
var words = str.toUpperCase().split(' ').map(function (a) {
return a.substr(0, a.length - 1) +a[a.length - 1].toLowerCase() + a.substr(a.length - 1 + 1);
});
return words.join(' ');
}
Instead of searching for the letter, just concatenate the uppercase part of the string up to the last letter with the lowercased last letter.
This should do the trick:
function alienLanguage(str) {
return str.split(' ').map(function(a) {
return a.slice(0, -1).toUpperCase() + a.slice(-1).toLowerCase();
}).join(' ');
}

regex replace string with dollar sign, Weird output

I am using a javascript function as follows to convert an html blockquote to a markdown blockquote:
function convertBlockquote(str) {
var r = str;
var pat = /<blockquote>\n?([\s\S]*?)\n?<\/blockquote>/mi; //[\s\S] = dotall; ? = non-greedy match
for (var mat; (mat = r.match(pat)) !== null; ) {
mat = mat[1]
.replace(/\n/gm, '\n> ')
.replace(/<p>/igm, '\n> ')
.replace(/<\/p>/igm, '\n> \n> ')
.replace(/(\n> ?){3,}/gm, '\n> \n> ');
r = r.replace(pat, '\n>' + mat + '\n');
}
return r;
}
So if I pass in: <blockquote>Price: $1000 plus tax.</blockquote>
I would expect: > Price: $1000 plus tax.
But I get: > Price: Price: $1000 plus tax.000 plus tax.
Notice how it is replacing $1 in the $1000 part of the string with the entire original string?
How can I escape this or update the function to handle this (and other special chars which might cause similar issue)?
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/replace:
You can specify a function as the second parameter. In this case, the function will be invoked after the match has been performed. The function's result (return value) will be used as the replacement string. (Note: the above-mentioned special replacement patterns do not apply in this case.)
So you can write
r = r.replace(pat, function () { return '\n>' + mat + '\n'; });
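To see the difference side by side, here is a small sketch on a simplified pattern (the variable names are illustrative only). It also shows a second option: escaping each $ as $$ before using the text as a replacement string.

```javascript
const pat = /<blockquote>([\s\S]*?)<\/blockquote>/i;
const html = '<blockquote>Price: $1000 plus tax.</blockquote>';
const inner = html.match(pat)[1]; // "Price: $1000 plus tax."

// String replacement: the "$1" inside the inserted text is treated as a
// reference to capture group 1, which reproduces the bug.
const broken = html.replace(pat, '> ' + inner);

// Function replacement: the return value is inserted verbatim.
const viaFunction = html.replace(pat, () => '> ' + inner);

// Escaping: "$$" in a replacement string stands for one literal "$",
// so every "$" in the text is doubled first.
const viaEscape = html.replace(pat, '> ' + inner.replace(/\$/g, '$$$$'));

console.log(broken);      // "> Price: Price: $1000 plus tax.000 plus tax."
console.log(viaFunction); // "> Price: $1000 plus tax."
console.log(viaEscape);   // "> Price: $1000 plus tax."
```

The function form is usually the easier one to get right, since the special replacement patterns never apply to its return value.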
Am I missing something obvious, or does the implementation look more complicated than it needs to be?
That's how I'd do that:
str=str.replace(/<blockquote>(.*?)<\/blockquote>/g,'> $1')
In case you'd like the multiline quotes to be matched:
str=str.replace(/<blockquote>((.|\n)*?)<\/blockquote>/g,'> $1')
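This works for the $1000 case because text substituted through $1 is inserted as-is and is not scanned again for replacement patterns. A quick sketch to confirm it, using [\s\S] instead of (.|\n) so that there is only one capture group; the function name toMarkdownQuote is hypothetical:

```javascript
// Hypothetical helper: swap each blockquote element for a "> " prefix.
const toMarkdownQuote = (str) =>
  str.replace(/<blockquote>([\s\S]*?)<\/blockquote>/g, '> $1');

console.log(toMarkdownQuote('<blockquote>Price: $1000 plus tax.</blockquote>'));
// "> Price: $1000 plus tax."
```

Keep in mind that only the first line of a multi-line quote gets the "> " prefix this way; prefixing every line still needs an extra step like the \n replacements in the question.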
