Convert a combination of replace methods into one - javascript

I have a combination of replace methods. How can I convert them to one:
.replace(/\s+/g, " ").replace(/\,|\?|\!|\:|\./g,'').replace("'", "_")
Is there any solution?

It's possible with a single regular expression that alternates between the different possibilities and captures each subpattern, combined with a replacer function that checks which subpattern was matched, but it's really ugly. Your current solution is much easier to read.
const string = ' here is multiple spaces consolidated, punctuation removed!! and apostrophes don\'t exist! ';
const result = string
.replace(
/(\s+)|(\,|\?|\!|\:|\.)|(')/g,
(match, g1, g2, g3) => (
g1 ? ' ' :
g2 ? '' :
'_'
)
);
console.log(result);
I'd use your original version with a slight tweak: use a character set in the second replace instead, it'll be easier to read.
const string = ' here is multiple spaces consolidated, punctuation removed!! and apostrophes don\'t exist! ';
const result = string
.replace(/\s+/g, " ")
.replace(/[,?!:.]/g,'')
.replace("'", "_");
console.log(result);
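One detail worth checking when merging these calls: a string pattern such as "'" replaces only the first occurrence, while a regex with the g flag replaces every occurrence, so the single-regex version and the chained version can differ on inputs with several apostrophes. Below is a minimal sketch of a table-driven alternative. The helper name applyReplacements and the cleanupRules list are hypothetical and not part of either answer; the sketch assumes replacing every apostrophe is what you want.

```javascript
// Hypothetical helper (not from the answers): apply [pattern, replacement]
// pairs in order, so the whole chain lives in one data structure.
function applyReplacements(input, rules) {
  return rules.reduce(
    (text, [pattern, replacement]) => text.replace(pattern, replacement),
    input
  );
}

const cleanupRules = [
  [/\s+/g, ' '],     // collapse runs of whitespace
  [/[,?!:.]/g, ''],  // drop punctuation
  [/'/g, '_'],       // every apostrophe, not just the first
];

console.log(applyReplacements("don't  stop, it's  fine!", cleanupRules));
// don_t stop it_s fine
```

This keeps each rule as readable as in the chained version while giving you one place to add or reorder rules.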

Related

Convert pair of regex replacements with single replacement

I want to split a string on the first capital letter in a group.
For example, FooBARBaz should become Foo BAR Baz.
I've come up with:
str.replace(/[A-Z][a-z]+/g, ' $&')
.replace(/[A-Z]+/g, ' $&')
.replace(/\s+/g, ' ')
.trim();
Can anyone suggest a cleaner solution?
A simple regex like /([A-Z][a-z]+)/g already does it. The trick is the capture group, whose matches are preserved in the result when the regex is used with the split method. So one only needs a single group pattern (one uppercase letter followed by at least one lowercase letter) and gets the rest (any all-uppercase letter group) for free.
Since the split array also contains empty strings, an extra filter step is needed.
console.log(
'just split ...\n',
'"FooBARBazBizBuzBOOOOZBozzzBarFOOBARBazbizBUZBoooozBOZZZBar" =>',
'FooBARBazBizBuzBOOOOZBozzzBarFOOBARBazbizBUZBoooozBOZZZBar'
.split(/([A-Z][a-z]+)/g)
);
console.log(
'split and filter ...\n',
'"FooBARBazBizBuzBOOOOZBozzzBarFOOBARBazbizBUZBoooozBOZZZBar" =>',
'FooBARBazBizBuzBOOOOZBozzzBarFOOBARBazbizBUZBoooozBOZZZBar'
.split(/([A-Z][a-z]+)/g)
.filter(val => val !== '')
);
console.log(
'split, filter and join ...\n',
'"FooBARBazBizBuzBOOOOZBozzzBarFOOBARBazbizBUZBoooozBOZZZBar" =>\n',
'FooBARBazBizBuzBOOOOZBozzzBarFOOBARBazbizBUZBoooozBOZZZBar'
.split(/([A-Z][a-z]+)/g)
.filter(val => val !== '')
.join(' ')
);
You can achieve this by using this replacer function inside the String.replace() method.
Live Demo :
const str = 'FooBARBaz';
function replacer() {
return ' ' + arguments[0] + ' ';
}
console.log(str.replace(/[A-Z][a-z]+/g, replacer).trim());
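Note that padding each match on both sides can leave doubled spaces when two matches are adjacent, as in 'FooBar'. Here is a small sketch that collapses the extra whitespace afterwards. The function name spaceOutWords is hypothetical, and the '$&' replacement pattern stands in for the replacer function.

```javascript
// Hypothetical wrapper around the replacer idea above.
function spaceOutWords(str) {
  return str
    .replace(/[A-Z][a-z]+/g, ' $& ') // pad every Capitalized word with spaces
    .replace(/\s+/g, ' ')            // collapse doubled spaces between adjacent matches
    .trim();                         // drop the padding at both ends
}

console.log(spaceOutWords('FooBARBaz')); // Foo BAR Baz
console.log(spaceOutWords('FooBar'));    // Foo Bar
```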
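You can try to match either a run of uppercase letters, each followed by another uppercase letter or the end of the string, or a single uppercase letter followed by any non-uppercase characters: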
m = 'FooBARBazAB'.match(/([A-Z](?=[A-Z]|$))+|[A-Z][^A-Z]*/g)
console.log(m)
Not sure about how this is "cleaner" though.
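To get back to the string form the question asks for, the matched pieces can be joined with spaces. Here is a sketch with a hypothetical helper name, splitCapitalGroups. It assumes the input starts with an uppercase letter, since any leading lowercase characters are not matched by this pattern and would be dropped.

```javascript
// Hypothetical helper built on the match() pattern above.
function splitCapitalGroups(str) {
  const parts = str.match(/([A-Z](?=[A-Z]|$))+|[A-Z][^A-Z]*/g);
  // match() returns null when nothing matches, so fall back to the input.
  return parts ? parts.join(' ') : str;
}

console.log(splitCapitalGroups('FooBARBaz'));   // Foo BAR Baz
console.log(splitCapitalGroups('FooBARBazAB')); // Foo BAR Baz AB
```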

RegEx Data Values Javascript white Space

I am trying to add the correct white space to data I am receiving. Currently it shows like this:
NotStarted
ReadyforPPPDReview
This is the code I am using:
.replace(/([A-Z])/g, ' $1')
"NotStarted" correctly shows "Not Started", but "ReadyforPPPDReview" shows "Readyfor P P P D Review" when it should look like "Ready for PPPD Review".
What is the best way to handle both of these using one regex or function?
You would need an NLP engine to handle this properly. Here are two approaches with simple regex, both have limitations:
1. Use list of stop words
We blindly add spaces before and after the stop words:
var str = 'NotStarted, ReadyforPPPDReview';
var wordList = 'and, for, in, on, not, review, the'; // stop words
var wordListRe = new RegExp('(' + wordList.replace(/, */g, '|') + ')', 'gi');
var result1 = str
.replace(wordListRe, ' $1 ') // add space before and after stop words
.replace(/([a-z])([A-Z])/g, '$1 $2') // add space between lower case and upper case chars
.replace(/ +/g, ' ') // remove excessive spaces
.trim(); // remove spaces at start and end
console.log('str: ' + str);
console.log('result1: ' + result1);
As you can imagine, the stop words approach has some severe limitations. For example, the words formula input would result in for mula in put.
2. Use a mapping table
The mapping table lists words that need to be spaced out (no drugs involved), as in this code snippet:
var str = 'NotStarted, ReadyforPPPDReview';
var spaceWordMap = {
NotStarted: 'Not Started',
Readyfor: 'Ready for',
PPPDReview: 'PPPD Review'
// add more as needed
};
var spaceWordMapRe = new RegExp('(' + Object.keys(spaceWordMap).join('|') + ')', 'gi');
var result2 = str
.replace(spaceWordMapRe, function(m, p1) { // m: matched snippet, p1: first group
return spaceWordMap[p1] // replace key in spaceWordMap with its value
})
.replace(/([a-z])([A-Z])/g, '$1 $2') // add space between lower case and upper case chars
.replace(/ +/g, ' ') // remove excessive spaces
.trim(); // remove spaces at start and end
console.log('str: ' + str);
console.log('result2: ' + result2);
This approach is suitable if you have a deterministic list of words as input.
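For comparison, here is a sketch that splits on case boundaries only, with no word list. The helper name spaceByCase is hypothetical. It keeps acronyms such as PPPD together but cannot separate "Readyfor", which is why the approaches above fall back on stop words or a mapping table.

```javascript
// Hypothetical helper: split on case boundaries only, with no word list.
function spaceByCase(str) {
  return str
    .replace(/([a-z])([A-Z])/g, '$1 $2')        // lower then upper: "tS" -> "t S"
    .replace(/([A-Z]+)([A-Z][a-z])/g, '$1 $2'); // acronym then word: "PPPDRe" -> "PPPD Re"
}

console.log(spaceByCase('NotStarted'));         // Not Started
console.log(spaceByCase('ReadyforPPPDReview')); // Readyfor PPPD Review
```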

Convert different strings to snake_case in Javascript

I know that we have a question similar to this but not quite the same.
I'm trying to get my function to work; it takes a string as an argument and converts it to snake_case. It works most of the time with all the fancy !?<>= characters, but there is one case it can't convert, and that's camelCase.
It fails when I'm passing strings like snakeCase. It returns snakecase instead of snake_case.
I tried to implement it but I ended up just messing it up even more..
Can I have some help please?
my code:
const snakeCase = string => {
string = string.replace(/\W+/g, " ").toLowerCase().split(' ').join('_');
if (string.charAt(string.length - 1) === '_') {
return string.substring(0, string.length - 1);
}
return string;
}
You need to be able to detect the points at which an upper-case letter is in the string following another letter (that is, not following a space). You can do this with a regular expression, before you call toLowerCase on the input string:
\B(?=[A-Z])
In other words, a non-word boundary, followed by an upper case character. Split on either the above, or on a literal space, then .map the resulting array to lower case, and then you can join by underscores:
const snakeCase = string => {
return string.replace(/\W+/g, " ")
.split(/ |\B(?=[A-Z])/)
.map(word => word.toLowerCase())
.join('_');
};
console.log(snakeCase('snakeCase'));
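The function above drops the trailing-underscore cleanup from the question's code, so an input like 'Hello World!' comes out as hello_world_. Here is a sketch that restores that behaviour by trimming before the split. The name snakeCaseTrimmed and the trim() step are additions, not part of the answer.

```javascript
// Variant of the answer's function; the trim() step is an addition.
const snakeCaseTrimmed = (string) =>
  string
    .replace(/\W+/g, ' ')    // punctuation and whitespace runs become one space
    .trim()                  // drop edge spaces so no stray "_" appears
    .split(/ |\B(?=[A-Z])/)  // split on spaces and before inner capitals
    .map((word) => word.toLowerCase())
    .join('_');

console.log(snakeCaseTrimmed('snakeCase'));    // snake_case
console.log(snakeCaseTrimmed('Hello World!')); // hello_world
```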
Let's try that again Stan... this should do snake_case while realising that CamelCASECapitals = camel_case_capitals. It's basically the accepted answer with a pre-filter.
let splitCaps = string => string
.replace(/([a-z])([A-Z]+)/g, (m, s1, s2) => s1 + ' ' + s2)
.replace(/([A-Z])([A-Z]+)([^a-zA-Z0-9]*)$/, (m, s1, s2, s3) => s1 + s2.toLowerCase() + s3)
.replace(/([A-Z]+)([A-Z][a-z])/g,
(m, s1, s2) => s1.toLowerCase() + ' ' + s2);
let snakeCase = string =>
splitCaps(string)
.replace(/\W+/g, " ")
.split(/ |\B(?=[A-Z])/)
.map(word => word.toLowerCase())
.join('_');
> a = ['CamelCASERules', 'IndexID', 'CamelCASE', 'aID',
'theIDForUSGovAndDOD', 'TheID_', '_IDOne']
> _.map(a, snakeCase)
['camel_case_rules', 'index_id', 'camel_case', 'a_id', 'the_id_for_us_gov_and_dod',
'the_id_', '_id_one']
// And for the curious, here's the output from the pre-filter:
> _.map(a, splitCaps)
['Camel case Rules', 'Index Id', 'Camel Case', 'a Id', 'the id For us Gov And Dod',
'The Id_', '_id One']
Suppose the string is Hello World? and you want hello_world? returned (keeping the special character). In that case, use the code below:
const snakeCase = (string) => {
return string.replace(/\d+/g, ' ')
.split(/ |\B(?=[A-Z])/)
.map((word) => word.toLowerCase())
.join('_');
};
Example
snakeCase('Hello World?')
// "hello_world?"
snakeCase('Hello & World')
// "hello_&_world"
EDIT: It turns out this answer isn't fool proof. The fool, being me ;-) Please check out a better one by Orwellophile here: https://stackoverflow.com/a/69878219/5377276
----
I think this one should cover all the bases 😄
It was inspired by @h0r53's answer to the accepted answer. However it evolved into a more complete function, as it will convert any string, camelCase, kebab-case or otherwise into snake_case the way you'd expect it, containing only a-z and 0-9 characters, usable for function and variable names:
function convert_to_snake_case(string) {
return string.charAt(0).toLowerCase() + string.slice(1) // lowercase the first character
.replace(/\W+/g, " ") // Remove all excess white space and replace & , . etc.
.replace(/([a-z])([A-Z])([a-z])/g, "$1 $2$3") // Put a space at the position of a camelCase -> camel Case
.split(/\B(?=[A-Z]{2,})/) // Now split the multi-uppercases customerID -> customer,ID
.join(' ') // And join back with spaces.
.split(' ') // Split all the spaces again, this time we're fully converted
.join('_') // And finally snake_case things up
.toLowerCase() // With a nice lower case
}
Conversion examples:
'snakeCase' => 'snake_case'
'CustomerID' => 'customer_id'
'GPS' => 'gps'
'IP-address' => 'ip_address'
'Another & Another, one too' => 'another_another_one_too'
'random ----- Thing123' => 'random_thing123'
'kebab-case-example' => 'kebab_case_example'
There is a method in lodash named snakeCase(). You can consider that as well.
https://lodash.com/docs/4.17.15#snakeCase
Orwellophile's answer does not work for uppercase words delimited by a space:
E.g: 'TEST CASE' => t_e_s_t_case
The following solution does not break consecutive upper case characters and is a little shorter:
const snakeCase = str =>
str &&
str
.match(/[A-Z]{2,}(?=[A-Z][a-z]+[0-9]*|\b)|[A-Z]?[a-z]+[0-9]*|[A-Z]|[0-9]+/g)
.map(x => x.toLowerCase())
.join('_');
However, trailing underscores after uppercase words (examples from Orwellophile as well), do not work properly.
E.g: 'TheID_' => the_i_d
Taken from https://www.w3resource.com/javascript-exercises/fundamental/javascript-fundamental-exercise-120.php.
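One caveat with the match-based version: match() returns null when the input contains no letters or digits, so calling .map on the result would throw. Here is a sketch with a fallback to an empty array. The names WORDS and snakeCaseSafe are hypothetical, and the pattern itself is unchanged.

```javascript
// Same pattern as above; the `|| []` fallback is an addition.
const WORDS = /[A-Z]{2,}(?=[A-Z][a-z]+[0-9]*|\b)|[A-Z]?[a-z]+[0-9]*|[A-Z]|[0-9]+/g;

const snakeCaseSafe = (str) =>
  (String(str ?? '').match(WORDS) || []) // match() returns null when nothing matches
    .map((x) => x.toLowerCase())
    .join('_');

console.log(snakeCaseSafe('TEST CASE'));  // test_case
console.log(snakeCaseSafe('CustomerID')); // customer_id
console.log(snakeCaseSafe('!!!'));        // prints an empty line
```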

How to filter prefix and trim words from a string with JavaScript (not jQuery)

I have never been good with string manipulation and I am now stuck. So the long string has the following:
[desktop-detected display-settings main-boxed pace-done header-function-fixed nav-function-fixed nav-function-hidden nav-function-minify mod-sound mod-font]
I want to filter and keep the string with the nav-, header-, and mod-* prefixes, so the cleaned up string should look like:
[header-function-fixed nav-function-fixed nav-function-hidden nav-function-minify mod-sound mod-font]
I have no idea how to start, completely clueless...
split, filter, and join
var result = '[' + string.split(/[^\w-]+/).filter(function(item) {
return /^(nav|header|mod)-/i.test(item);
}).join(' ') + ']';
JSFiddle Demo: https://jsfiddle.net/c3nff3p6/2/
/[^\w-]+/ splits the phrase into an array, using any run of characters that are neither word characters nor dashes as the separator.
/^(nav|header|mod)-/i matches if the item starts with any of those values followed by a dash, case-insensitively.
Other solutions, thanks @Tushar.
'[' + string.slice(1, -1).split(/\s+/).filter(str => /^(nav|header|mod)-/.test(str)).join(' ') + ']'
using match
'[' + string.slice(1, -1).match(/\b(nav|header|mod)-\S*/g).join(' ') + ']'
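If the list of prefixes changes often, the same split, filter, and join idea can take the prefixes as a parameter. This is a sketch with a hypothetical helper name, keepPrefixed, which assumes the input is always wrapped in square brackets. Unlike the match() variant, it does not throw when nothing matches and returns [] instead.

```javascript
// Hypothetical helper: keep only the tokens that start with one of the prefixes.
function keepPrefixed(bracketed, prefixes) {
  const kept = bracketed
    .slice(1, -1) // drop the surrounding [ and ]
    .split(/\s+/) // one token per class name
    .filter((token) => prefixes.some((prefix) => token.startsWith(prefix)));
  return '[' + kept.join(' ') + ']';
}

const input =
  '[desktop-detected display-settings main-boxed pace-done header-function-fixed nav-function-fixed mod-sound]';
console.log(keepPrefixed(input, ['nav-', 'header-', 'mod-']));
// [header-function-fixed nav-function-fixed mod-sound]
```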

single regex to capitalize first letter and replace dot

Trying out with a regex for simple problem. My input string is
firstname.ab
And am trying to output it as,
Firstname AB
So the main aim is to capitalize the first letter of the string and replace the dot with space. So chose to write two regex to solve.
First One : To replace dot with space /\./g
Second One : To capitalize the first letter /\b\w/g
And my question is: can we do both operations with a single regex?
Thanks in advance !!
You can use a callback function inside the replace:
var str = 'firstname.ab';
var result = str.replace(/^([a-zA-Z])(.*)\.([^.]+)$/, function (match, grp1, grp2, grp3, offset, s) {
return grp1.toUpperCase() + grp2 + " " + grp3.toUpperCase();
});
alert(result);
grp1, grp2 and grp3 represent the capturing groups in the callback function. grp1 is a leading letter ([a-zA-Z]). Then we capture any number of characters other than a newline ((.*); if you have line breaks, use [\s\S]*). Next comes the literal dot \., which we do not capture since we want to replace it with a space. Lastly, ([^.]+)$ matches and captures the remaining substring: one or more characters other than a literal dot, up to the end of the string.
We can use capturing groups to re-build the input string this way.
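An alternative that also handles more than one dot is a single global regex with two alternatives, where the replacer checks whether the capture group took part in the match. This is only a sketch: the function name formatName is hypothetical, and it assumes the input starts with a lowercase ASCII letter.

```javascript
// Hypothetical one-regex variant: the first alternative matches the leading
// letter, the second matches a dot plus the segment that follows it.
function formatName(str) {
  return str.replace(/^[a-z]|\.([^.]*)/g, (match, afterDot) =>
    afterDot === undefined
      ? match.toUpperCase()          // leading letter: capitalize it
      : ' ' + afterDot.toUpperCase() // dot segment: space plus upper-cased text
  );
}

console.log(formatName('firstname.ab')); // Firstname AB
```

When the first alternative matches, the group in the second alternative does not participate, so the replacer receives undefined for it; that is what the check relies on.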
var $input = $('#input'),
value = $input.val(),
value = value.split( '.' );
value[0] = value[0].charAt( 0 ).toUpperCase() + value[0].substr(1),
value[1] = value[1].toUpperCase(),
value = value.join( ' ' );
$input.val( value );
It would be much easier if you simply split the value, process the string in the array, and join them back.
<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>
<input type="text" value="first.ab" id="input">
