Is there a python strip function equivalent in javascript? - javascript

Python's strip function is used to remove given characters from the beginning and end of the string. How to create a similar function in javascript?
Example:
str = "'^$ *# smart kitty & ''^$* '^";
newStr = str.strip(" '^$*#&");
console.log(newStr);
Output:
smart kitty

There's lodash's trim()
Removes leading and trailing whitespace or specified characters from string.
_.trim(' abc '); // → 'abc'
_.trim('-_-abc-_-', '_-'); // → 'abc'

A simple but not very efficient way would be to look for the characters and remove them one at a time:
function strip(str, remove) {
while (str.length > 0 && remove.indexOf(str.charAt(0)) != -1) {
str = str.substr(1);
}
while (str.length > 0 && remove.indexOf(str.charAt(str.length - 1)) != -1) {
str = str.substr(0, str.length - 1);
}
return str;
}
A more efficient, but less readable, approach would be a regular expression:
str = str.replace(/(^[ '\^\$\*#&]+)|([ '\^\$\*#&]+$)/g, '')
Note: I escaped every character that has a special meaning in a regular expression. Some characters do need that, but perhaps not all the ones I escaped here, since they appear inside a set. The escapes are mostly there to point out that some characters do need escaping.
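The repeated substr() calls in the loop version copy the string on every step. A minimal sketch of the same idea that only moves two indices and slices once at the end (the name stripChars is made up for this illustration):

```javascript
// Hypothetical helper: same behavior as the loop-based strip above,
// but it tracks start/end indices instead of rebuilding the string.
function stripChars(str, remove) {
  var start = 0;
  var end = str.length;
  // Advance past leading characters that are in the removal set.
  while (start < end && remove.indexOf(str.charAt(start)) !== -1) {
    start++;
  }
  // Step back over trailing characters that are in the removal set.
  while (end > start && remove.indexOf(str.charAt(end - 1)) !== -1) {
    end--;
  }
  return str.slice(start, end);
}

console.log(stripChars("'^$ *# smart kitty & ''^$* '^", " '^$*#&"));
```

No regular expression is involved, so nothing in remove needs escaping.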

Modifying a code snippet from Mozilla Developer Network String.prototype.trim(), you could define such a function as follows.
if (!String.prototype.strip) {
String.prototype.strip = function (string) {
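var escaped = string.replace(/([.*+?^=!:${}()|\[\]\/\\-])/g, "\\$1"); // "-" is escaped too, because it denotes a range inside [...]
return this.replace(RegExp("^[" + escaped + "]+|[" + escaped + "]+$", "g"), ''); // no "m" flag, so only the ends of the whole string are stripped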
};
}
It's not necessary and probably not advisable to put this function in the object String.prototype, but it does give you a clear indication of how such a function compares with the existing String.prototype.trim().
The value of escaped is as in the function escapeRegExp in the guide to Regular Expressions. The Java programming language has a standard library function for that purpose, but JavaScript does not.
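If you would rather not extend String.prototype, the same idea can be written as a plain function. This is only a sketch: the name stripSet and the fallback to trim() when no characters are given are assumptions made for the example, not part of the MDN snippet.

```javascript
// Hypothetical standalone variant: strips any of `chars` from both ends of `str`.
function stripSet(str, chars) {
  if (chars === undefined) {
    return str.trim(); // assumption: behave like trim() when no set is given
  }
  if (chars === "") {
    return str; // nothing to strip
  }
  // Escape regex metacharacters, including "-" which would otherwise form a range in [...].
  var escaped = chars.replace(/[.*+?^${}()|[\]\\\-]/g, "\\$&");
  var ends = new RegExp("^[" + escaped + "]+|[" + escaped + "]+$", "g");
  return str.replace(ends, "");
}

console.log(stripSet("-_-abc-_-", "_-"));
console.log(stripSet("'^$ *# smart kitty & ''^$* '^", " '^$*#&"));
```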

Not exactly... I would use a regex for complicated string manipulation, or the slice() method to remove characters at known positions.

Related

Regex split comma except escaped [duplicate]

I have this string:
a\,bcde,fgh,ijk\,lmno,pqrst\,uv
I need a JavaScript function that will split the string by every , but only those that don't have a \ before them
How can this be done?
Here's the shortest thing I could come up with:
'a\\,bcde,fgh,ijk\\,lmno,pqrst\\,uv'.replace(/([^\\]),/g, '$1\u000B').split('\u000B')
The idea is to find every comma that isn't prefixed with a backslash, replace each one with a character that is unlikely to appear in your strings, and then split on that character.
Note that the backslashes before commas have to be escaped with another backslash in a string literal. Otherwise, JavaScript treats \, as an escaped comma and produces just a comma. In other words, if you don't escape the backslash, JavaScript sees a\,bcde,fgh,ijk\,lmno,pqrst\,uv as a,bcde,fgh,ijk,lmno,pqrst,uv.
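A small check of that literal-escaping point, as a sketch (the variable names are only for illustration). String.raw is a standard ES2015 tag that keeps backslashes as written:

```javascript
var unescaped = 'a\,b';      // the backslash is consumed by the string literal
var escaped = 'a\\,b';       // the backslash survives as a real character
var raw = String.raw`a\,b`;  // template literal tag that keeps the backslash as typed

console.log(unescaped.length); // 3 characters: a , b
console.log(escaped.length);   // 4 characters: a \ , b
console.log(raw === escaped);  // true
```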
Since regular expressions in JavaScript do not support lookbehinds, I'm not going to cook up a giant hack to mimic this behavior. Instead, you can just split() on all commas (,) and then glue back the pieces that shouldn't have been split in the first place.
Quick 'n' dirty demo:
var str = 'a\\,bcde,fgh,ijk\\,lmno,pqrst\\,uv'.split(','), // Split on all commas
out = []; // Output
for (var i = 0; i < str.length; i++) { // Iterate over every piece, including the last one
var curr = str[i]; // This piece
while (curr.charAt(curr.length - 1) == '\\' && i + 1 < str.length) { // While it ends with \ and a next piece exists ...
curr += ',' + str[++i]; // ... glue with next and skip next (increment i)
}
out.push(curr); // Add to output
}
Another ugly hack around the lack of look-behinds:
function rev(s) {
return s.split('').reverse().join('');
}
var s = 'a\\,bcde,fgh,ijk\\,lmno,pqrst\\,uv';
// Enter bizarro world...
var r = rev(s);
// Split with a look-ahead
var rparts = r.split(/,(?!\\)/);
// And put it back together with double reversing.
var sparts = [ ];
while(rparts.length)
sparts.push(rev(rparts.pop()));
for(var i = 0; i < sparts.length; ++i)
$('#out').append('<pre>' + sparts[i] + '</pre>');
Demo: http://jsfiddle.net/ambiguous/QbBfw/1/
I don't think I'd do this in real life but it works even if it does make me feel dirty. Consider this a curiosity rather than something you should really use.
If you also need to remove the backslashes:
var test='a\\.b.c';
var result = test.replace(/\\?\./g, function (t) { return t == '.' ? '\u000B' : '.'; }).split('\u000B');
//result: ["a.b", "c"]
As of 2022, most browsers support lookbehinds:
https://caniuse.com/js-regexp-lookbehind
Safari should be your only concern.
With a lookbehind you can split your string this way:
"a\\,bcde,fgh,ijk\\,lmno,pqrst\\,uv".split(/(?<!\\),/)
// => ['a\\,bcde', 'fgh', 'ijk\\,lmno', 'pqrst\\,uv']
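If you still need to run on engines without lookbehind, one option is to detect support at runtime and fall back to split-and-glue. This is a sketch under that assumption; the function name splitOnUnescapedCommas is made up. The pattern is built with new RegExp from a string so that an unsupported engine throws at runtime instead of failing to parse the whole script.

```javascript
// Hypothetical helper: uses a lookbehind when the engine supports it,
// otherwise splits on every comma and glues escaped pieces back together.
function splitOnUnescapedCommas(input) {
  var splitter = null;
  try {
    splitter = new RegExp('(?<!\\\\),'); // same pattern as /(?<!\\),/
  } catch (e) {
    splitter = null; // SyntaxError: lookbehind not supported here
  }
  if (splitter) {
    return input.split(splitter);
  }
  var pieces = input.split(',');
  var out = [];
  for (var i = 0; i < pieces.length; i++) {
    var curr = pieces[i];
    // Keep gluing while the piece ends with a backslash and a next piece exists.
    while (curr.charAt(curr.length - 1) === '\\' && i + 1 < pieces.length) {
      curr += ',' + pieces[++i];
    }
    out.push(curr);
  }
  return out;
}

console.log(splitOnUnescapedCommas('a\\,bcde,fgh,ijk\\,lmno,pqrst\\,uv'));
```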
You can use regex to do the split.
Here is the link to regex in javascript http://www.w3schools.com/jsref/jsref_obj_regexp.asp
Here is a link to another post where the author has used a regex for split: Javascript won't split using regex
In the first link, note that you can create a regular expression using
?!n Matches any string that is not followed by a specific string n
[,]!\\

Split string in JavaScript using regex with zero width lookbehind

I know JavaScript regular expressions have native lookaheads but not lookbehinds.
I want to split a string at points either beginning with any member of one set of characters or ending with any member of another set of characters.
Split before ເ, ແ, ໂ, ໃ, ໄ. Split after ະ.
In: ເລື້ອຍໆມະຫັດສະຈັນເອກອັກຄະລັດຖະທູດ
Out: ເລື້ອຍໆມະ ຫັດສະ ຈັນ ເອກອັກຄະ ລັດຖະ ທູດ
I can achieve the "split before" part using zero-width lookahead:
'ເລື້ອຍໆມະຫັດສະຈັນເອກອັກຄະລັດຖະທູດ'.split(/(?=[ໃໄໂເແ])/)
["ເລື້ອຍໆມະຫັດສະຈັນ", "ເອກອັກຄະລັດຖະທູດ"]
But I can't think of a general approach to simulating zero-width lookbehind
I'm splitting strings of arbitrary Unicode text so don't want to substitute in special markers in a first pass, since I can't guarantee the absence of any string from my input.
Instead of splitting, you may consider using the match() method.
var s = 'ເລື້ອຍໆມະຫັດສະຈັນເອກອັກຄະລັດຖະທູດ',
r = s.match(/(?:(?!ະ).)+?(?:ະ|(?=[ໃໄໂເແ]|$))/g);
console.log(r); //=> [ 'ເລື້ອຍໆມະ', 'ຫັດສະ', 'ຈັນ', 'ເອກອັກຄະ', 'ລັດຖະ', 'ທູດ' ]
You could try matching rather than splitting,
> var re = /((?:(?!ະ).)+(?:ະ|$))/g;
undefined
> var str = "ເລື້ອຍໆມະຫັດສະຈັນເອກອັກຄະລັດຖະທູດ"
undefined
> var m;
undefined
> while ((m = re.exec(str)) != null) {
... console.log(m[1]);
... }
ເລື້ອຍໆມະ
ຫັດສະ
ຈັນເອກອັກຄະ
ລັດຖະ
ທູດ
Then again split the elements in the array using lookahead.
If you use parentheses in the delimiter regex, the captured text is included in the returned array. So you can just split on /(ະ)/ and then concatenate each of the odd members of the resulting array to the preceding even member. Example:
"ເລື້ອຍໆມະຫັດສະຈັນເອກອັກຄະລັດຖະທູ".split(/(ະ)/).reduce(function(arr,str,index) {
if (index%2 == 0) {
arr.push(str);
} else {
arr[arr.length-1] += str
};
return arr;
},[])
Result: ["ເລື້ອຍໆມະ", "ຫັດສະ", "ຈັນເອກອັກຄະ", "ລັດຖະ", "ທູ"]
You can do another pass to split on the lookahead:
"ເລື້ອຍໆມະຫັດສະຈັນເອກອັກຄະລັດຖະທູ".split(/(ະ)/).reduce(function(arr,str,index) {
if (index%2 == 0) {
arr.push(str);
} else {
arr[arr.length-1] += str
};
return arr;
},[]).reduce(function(arr,str){return arr.concat(str.split(/(?=[ໃໄໂເແ])/));},[]);
Result: ["ເລື້ອຍໆມະ", "ຫັດສະ", "ຈັນ", "ເອກອັກຄະ", "ລັດຖະ", "ທູ"]
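On engines that do support lookbehind (ES2018 and later), both conditions can go into a single split. A minimal sketch, assuming such an engine; the helper name splitAtMarkers is invented for this example:

```javascript
// Hypothetical helper: split after ະ (zero-width lookbehind)
// or before any of ເ ແ ໂ ໃ ໄ (zero-width lookahead).
function splitAtMarkers(text) {
  return text.split(/(?<=ະ)|(?=[ໃໄໂເແ])/);
}

var s = 'ເລື້ອຍໆມະຫັດສະຈັນເອກອັກຄະລັດຖະທູດ';
var pieces = splitAtMarkers(s);
console.log(pieces.join(' '));
```

Because both assertions are zero-width, no characters are consumed and no marker strings have to be substituted into the input.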

match word not capitalized a certain way

I want a regular expression that matches all instances of "capitalizedExactlyThisWay" that are not capitalizedExactlyThisWay.
I created a function that finds the indexes of all case insensitive matches and then pushes the values back in like this (JSBIN)
But I would rather just say something like text.replace(regexp,"<highlight>$1</highlight>");
replace has a callback function too.
s = s.replace(reg1, function(m){
if(m===word) return m;
return '<highlight>'+m+'</highlight>';
});
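A self-contained version of that callback approach might look like the following sketch. The names highlightMiscapitalized and escapeForRegExp are hypothetical, and the word is escaped on the assumption that it could contain regex metacharacters:

```javascript
// Escape regex metacharacters so the word is matched literally.
function escapeForRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: wrap every case-insensitive match of `word`
// that is NOT capitalized exactly like `word`.
function highlightMiscapitalized(text, word) {
  var anyCase = new RegExp("\\b" + escapeForRegExp(word) + "\\b", "gi");
  return text.replace(anyCase, function (m) {
    return m === word ? m : "<highlight>" + m + "</highlight>";
  });
}

console.log(highlightMiscapitalized(
  "capitalizedExactlyThisWay and CapitalizedexactlyThisway",
  "capitalizedExactlyThisWay"
));
```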
Unfortunately JavaScript regular expressions do not support making only a part of the expression case-insensitive.
You could write a little helper function that does the dirty work:
function capitalizationSensitiveRegex(word) {
var chars = word.split(""), i;
for (i = 0; i < chars.length; i++) {
chars[i] = "[" + chars[i].toLowerCase() + chars[i].toUpperCase() + "]";
}
return new RegExp("(?=\\b" + chars.join("") + "\\b)(?!" + word + ").{" + word.length + "}", "g");
}
Result:
capitalizationSensitiveRegex("capitalizedExactlyThisWay");
=> /(?=\b[cC][aA][pP][iI][tT][aA][lL][iI][zZ][eE][dD][eE][xX][aA][cC][tT][lL][yY][tT][hH][iI][sS][wW][aA][yY]\b)(?!capitalizedExactlyThisWay).{25}/g
Note that this assumes ASCII letters due to limitations of how \b works in JavaScript. It also assumes you're not using any regex meta characters in word (brackets, backslashes, parentheses, stars, dots, etc). An extra step of regex-quoting each char is necessary to make the above stable.
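The extra regex-quoting step mentioned above could be sketched like this. The names escapeRegexChar and capitalizationSensitiveRegexSafe are made up, and the ASCII-only limitation of \b still applies:

```javascript
// Escape a single character if it is a regex metacharacter.
function escapeRegexChar(ch) {
  return ch.replace(/[.*+?^${}()|[\]\\\-]/, "\\$&");
}

// Hypothetical variant of the helper above that quotes every character first.
function capitalizationSensitiveRegexSafe(word) {
  var chars = word.split("").map(function (ch) {
    var lower = escapeRegexChar(ch.toLowerCase());
    var upper = escapeRegexChar(ch.toUpperCase());
    // Characters without case (digits, punctuation) need no character class.
    return lower === upper ? lower : "[" + lower + upper + "]";
  });
  var exact = word.split("").map(escapeRegexChar).join("");
  return new RegExp(
    "(?=\\b" + chars.join("") + "\\b)(?!" + exact + ").{" + word.length + "}",
    "g"
  );
}

console.log("A.B a.b axb".match(capitalizationSensitiveRegexSafe("a.b")));
```

With the unescaped helper, the dot in "a.b" would match any character, so "axb" would be reported as well.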
You can use the match and filter methods with a callback:
var tok = (input.match(/\bcapitalizedexactlythisway\b/ig) || []).filter( function (m) {
return m != "capitalizedExactlyThisWay"; }); // keep only the differently capitalized matches
console.log( tok );
["capitalizedEXACTLYTHISWAY", "capitalizedexactlYthisWay", "capitalizedexactlythisway"]
You could try this regex to match every case-insensitive variant of exactlythisway except ExactlyThisWay:
\bcapitalized(?!ExactlyThisWay)(?:[Ee][Xx][Aa][Cc][Tt][Ll][Yy][Tt][Hh][Ii][Ss][Ww][Aa][Yy])\b
If you could somehow get JavaScript to work with partial case-insensitive matching, i.e. (?i), you could use the following expression:
capitalized(?!ExactlyThisWay)(?i)exactlythisway
If not, you're probably stuck with something like this:
capitalized(?!ExactlyThisWay)[a-zA-Z]+
The downside is that it will also match other variations such as capitalizedfoobar etc.

RegEx needed to split javascript string on "|" but not "\|"

We would like to split a string on instances of the pipe character |, but not if that character is preceded by an escape character, e.g. \|.
For example, we would like to see the following string split into the following components:
1|2|3\|4|5
1
2
3\|4
5
I'm expecting to be able to use the following javascript function, split, which takes a regular expression. What regex would I pass to split? We are cross platform and would like to support current and previous versions (1 version back) of IE, FF, and Chrome if possible.
Instead of a split, do a global match (the same way a lexical analyzer would):
match anything other than \\ or |
or match any escaped char
Something like this:
var str = "1|2|3\\|4|5";
var matches = str.match(/([^\\|]|\\.)+/g);
A quick explanation: ([^\\|]|\\.) matches either any character except '\' and '|' (pattern: [^\\|]), or (pattern: |) any escaped character (pattern: \\.). The + after it means one or more repetitions, so the pattern ([^\\|]|\\.) is matched one or more times. The g at the end of the regex literal tells the JavaScript regex engine to match the pattern globally instead of matching it just once.
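Wrapped in a function, the match-based approach could look like this sketch (splitOnUnescapedPipes is a hypothetical name). One difference from split() is worth knowing: because the pattern needs at least one character per match, empty fields are dropped.

```javascript
// Hypothetical helper: returns the fields between unescaped pipes.
// match() returns null when nothing matches, so fall back to an empty array.
function splitOnUnescapedPipes(input) {
  return input.match(/(?:[^\\|]|\\.)+/g) || [];
}

var fields = splitOnUnescapedPipes("1|2|3\\|4|5");
console.log(fields);    // four fields, the third one containing the escaped pipe

var withEmpty = splitOnUnescapedPipes("a||b");
console.log(withEmpty); // only two fields: the empty one between the pipes is lost
```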
What you're looking for is a "negative look-behind matching regular expression".
This isn't pretty, but it should split the list for you:
var output = input.replace(/(\\)?\|/g, function($0,$1){ return $1 ? $0 : '\n'; });
This takes your input string and replaces every '|' character that is NOT immediately preceded by a '\' character with a '\n' character; escaped pipes are left untouched. You can then split the result on '\n'.
A regex solution was posted as I was looking into this. So I just went ahead and wrote one without it. I did some simple benchmarks and it is -slightly- faster (I expected it to be slower...).
Without using Regex, if I understood what you desire, this should do the job:
function doSplit(input) {
var output = [];
var currPos = -1, // Start before index 0 so that a leading '|' is not skipped
prevPos = -1;
while ((currPos = input.indexOf('|', currPos + 1)) != -1) {
if (input[currPos-1] == "\\") continue;
var recollect = input.substr(prevPos + 1, currPos - prevPos - 1);
prevPos = currPos;
output.push(recollect);
}
var recollect = input.substr(prevPos + 1);
output.push(recollect);
return output;
}
doSplit('1|2|3\\|4|5'); //returns [ '1', '2', '3\\|4', '5' ]

startswith in javascript error

I'm using startswith reg exp in Javascript
if ((words).match("^" + string))
but if I enter the characters like , ] [ \ /, Javascript throws an exception.
Any idea?
If you're matching using a regular expression you must make sure you pass a valid Regular Expression to match(). Check the list of special characters to make sure you don't pass an invalid regular expression. The following characters should always be escaped (place a \ before it): [\^$.|?*+()
A better solution would be to use substr() like this:
if( str === words.substr( 0, str.length ) ) {
// match
}
or a solution using indexOf (which looks a bit cleaner):
if( 0 === words.indexOf( str ) ) {
// match
}
Next, you can add a startsWith() method to the string prototype that wraps either of the two solutions above, to make usage more readable:
String.prototype.startsWith = function(str) {
return ( str === this.substr( 0, str.length ) );
}
When added to the prototype you can use it like this:
words.startsWith( "word" );
One could also use indexOf to determine if the string begins with a fixed value:
str.indexOf(prefix) === 0
If you want to check if a string starts with a fixed value, you could also use substr:
words.substr(0, string.length) === string
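Engines that implement ES2015 ship a native String.prototype.startsWith, so a prefix check can prefer it and only fall back on older engines. A sketch, with startsWithPrefix as a made-up name; the fallback uses lastIndexOf(prefix, 0), which only looks at position 0 instead of scanning the whole string:

```javascript
// Hypothetical helper: literal prefix test with no regular expression involved.
function startsWithPrefix(words, prefix) {
  if (typeof String.prototype.startsWith === 'function') {
    return words.startsWith(prefix); // native ES2015 method
  }
  // Searching backwards from index 0 can only match at index 0.
  return words.lastIndexOf(prefix, 0) === 0;
}

console.log(startsWithPrefix("[x] done", "[x]")); // special characters are fine
console.log(startsWithPrefix("abc", "bc"));
```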
If you really want to use a regex, you have to escape the special characters in your string. PHP has a function for that, but I don't know of one for JavaScript. Try the following function, which I found on Snipplr:
function escapeRegEx(str)
{
var specials = new RegExp("[.*+?^$|()\\[\\]{}\\\\]", "g"); // .*+?^$|()[]{}\
return str.replace(specials, "\\$&");
}
and use as
var mystring="Some text";
mystring=escapeRegEx(mystring);
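Putting the escaping together with the original "^" + string test could look like the sketch below. Both function names are hypothetical, and the escape set here also covers ^ and $:

```javascript
// Escape every regex metacharacter so the prefix is matched literally.
function escapeRegExpChars(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: regex-based prefix test that tolerates input such as ] [ \ /
function startsWithViaRegex(words, prefix) {
  return new RegExp("^" + escapeRegExpChars(prefix)).test(words);
}

console.log(startsWithViaRegex("[a]bc", "[a]"));
console.log(startsWithViaRegex("abc", "."));
```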
If you only need to find strings that start with another string, try the following:
String.prototype.startsWith=function(string) {
return this.indexOf(string) === 0;
}
and use as
var mystring="Some text";
alert(mystring.startsWith("Some"));
