How do I split a string into an array of characters? [duplicate] - javascript

This question already has answers here:
How to get character array from a string?
(14 answers)
Closed 5 years ago.
var s = "overpopulation";
var ar = [];
ar = s.split();
alert(ar);
I want to use string.split to break a word into an array of characters.
The above code doesn't seem to work: it returns "overpopulation" as an Object.
How do I split it into an array of characters if the original string doesn't contain commas or whitespace?

You can split on an empty string:
var chars = "overpopulation".split('');
If you just want to access a string in an array-like fashion, you can do that without split:
var s = "overpopulation";
for (var i = 0; i < s.length; i++) {
console.log(s.charAt(i));
}
You can also access each character with its index using normal array syntax. Note, however, that strings are immutable, which means you can't set the value of a character using this method, and that it isn't supported by IE7 (if that still matters to you).
var s = "overpopulation";
console.log(s[3]); // logs 'r'
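Because strings are immutable, "setting" a character really means building a new string. A minimal sketch of that idea, assuming a hypothetical helper named setCharAt (not a built-in) and input without surrogate pairs:

```javascript
// Hypothetical helper: returns a copy of `str` with the UTF-16 code unit
// at `index` replaced by `ch`. The original string is never modified.
function setCharAt(str, index, ch) {
  if (index < 0 || index >= str.length) {
    return str; // out of range: return the string unchanged
  }
  return str.slice(0, index) + ch + str.slice(index + 1);
}

const s = "overpopulation";
const t = setCharAt(s, 3, "R");
console.log(s); // "overpopulation" (untouched)
console.log(t); // "oveRpopulation"
```

The slicing approach works in old browsers too, since it relies only on slice and string concatenation.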

Old question but I should warn:
Do NOT use .split('')
You'll get weird results with non-BMP (non-Basic-Multilingual-Plane) character sets.
The reason is that methods like .split() and .charCodeAt() only respect characters with a code point below 65536, because higher code points are represented by a pair of (lower-valued) "surrogate" pseudo-characters.
'𝟙𝟚𝟛'.length // --> 6
'𝟙𝟚𝟛'.split('') // --> ["�", "�", "�", "�", "�", "�"]
'😎'.length // --> 2
'😎'.split('') // --> ["�", "�"]
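To make the surrogate-pair point concrete, here is a small sketch of the arithmetic that maps a code point above 0xFFFF to two UTF-16 code units. The helper name toSurrogatePair is hypothetical and only for illustration; codePointAt is assumed to be available (ES2015).

```javascript
// Hypothetical helper: splits a code point >= 0x10000 into its
// high and low UTF-16 surrogate code units.
function toSurrogatePair(codePoint) {
  const offset = codePoint - 0x10000;
  const high = 0xD800 + (offset >> 10);  // top 10 bits of the offset
  const low = 0xDC00 + (offset & 0x3FF); // bottom 10 bits of the offset
  return [high, low];
}

const smiley = "\u{1F60E}"; // the 😎 character
const [high, low] = toSurrogatePair(smiley.codePointAt(0));
console.log(high.toString(16), low.toString(16)); // d83d de0e
console.log(smiley.charCodeAt(0) === high); // true
console.log(smiley.charCodeAt(1) === low);  // true
```

This is why .length reports 2 for a single emoji: the string stores the two surrogate code units, not one code point.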
Use ES2015 (ES6) features where possible:
Using the spread operator:
let arr = [...str];
Or Array.from
let arr = Array.from(str);
Or split with the new u RegExp flag:
let arr = str.split(/(?!$)/u);
Examples:
[...'𝟙𝟚𝟛'] // --> ["𝟙", "𝟚", "𝟛"]
[...'😎😜🙃'] // --> ["😎", "😜", "🙃"]
For ES5, options are limited:
I came up with this function that internally uses MDN example to get the correct code point of each character.
function stringToArray(str) {
var i = 0,
arr = [],
codePoint;
while (!isNaN(codePoint = knownCharCodeAt(str, i))) {
arr.push(String.fromCodePoint(codePoint));
i++;
}
return arr;
}
This requires the knownCharCodeAt() function and, for some browsers, a String.fromCodePoint() polyfill.
if (!String.fromCodePoint) {
// ES6 Unicode Shims 0.1 , © 2012 Steven Levithan , MIT License
String.fromCodePoint = function fromCodePoint () {
var chars = [], point, offset, units, i;
for (i = 0; i < arguments.length; ++i) {
point = arguments[i];
offset = point - 0x10000;
units = point > 0xFFFF ? [0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)] : [point];
chars.push(String.fromCharCode.apply(null, units));
}
return chars.join("");
}
}
Examples:
stringToArray('𝟙𝟚𝟛') // --> ["𝟙", "𝟚", "𝟛"]
stringToArray('😎😜🙃') // --> ["😎", "😜", "🙃"]
Note: str[index] (ES5) and str.charAt(index) will also return weird results with non-BMP charsets. e.g. '😎'.charAt(0) returns "�".
UPDATE: Read this nice article about JS and unicode.

.split('') splits emojis in half.
Onur's solutions work for some emojis, but can't handle more complex languages or combined emojis.
Consider this emoji being ruined:
[..."🏳️‍🌈"] // returns ["🏳", "️", "‍", "🌈"] instead of ["🏳️‍🌈"]
Also consider this Hindi text अनुच्छेद which is split like this:
[..."अनुच्छेद"] // returns ["अ", "न", "ु", "च", "्", "छ", "े", "द"]
but should in fact be split like this:
["अ","नु","च्","छे","द"]
This happens because some of the characters are combining marks (think diacritics/accents in European languages).
You can use the grapheme-splitter library for this:
It does proper standards-based letter split in all the hundreds of exotic edge-cases - yes, there are that many.
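Where adding a library is not an option, newer runtimes ship the standard Intl.Segmenter API, which can also split by grapheme cluster. Note that this is a different technique from the grapheme-splitter library named above, shown only as a minimal sketch. It assumes an environment that implements Intl.Segmenter, and cluster boundaries for some scripts may differ with the Unicode version the runtime bundles. The helper name splitGraphemes is hypothetical.

```javascript
// Hypothetical helper built on the standard Intl.Segmenter API.
function splitGraphemes(text) {
  const segmenter = new Intl.Segmenter(undefined, { granularity: "grapheme" });
  // Each item yielded by segment() has a `segment` property holding the cluster.
  return Array.from(segmenter.segment(text), (part) => part.segment);
}

// Rainbow flag: white flag + VS16 + ZWJ + rainbow (four code points).
const flag = "\u{1F3F3}\uFE0F\u200D\u{1F308}";
console.log([...flag].length);            // 4 (split by code point)
console.log(splitGraphemes(flag).length); // 1 (split by grapheme cluster)

// "e" followed by a combining acute accent stays together as one cluster.
console.log(splitGraphemes("e\u0301a").length); // 2
```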

It's as simple as:
s.split("");
The delimiter is an empty string, hence it will break up between each single character.

The split() method in javascript accepts two parameters: a separator and a limit.
The separator specifies the character to use for splitting the string. If you don't specify a separator, the entire string is returned, non-separated. But, if you specify the empty string as a separator, the string is split between each character.
Therefore:
s.split('')
will have the effect you seek.
More information here
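The separator and limit behaviour described above can be checked with a short sketch (variable names are illustrative only):

```javascript
const s = "overpopulation";

const whole = s.split();           // no separator: a one-element array
const chars = s.split("");         // empty separator: one entry per UTF-16 code unit
const firstThree = s.split("", 3); // limit caps the number of entries returned

console.log(whole);        // ["overpopulation"]
console.log(chars.length); // 14
console.log(firstThree);   // ["o", "v", "e"]
```

This also explains the result in the question: calling split() with no arguments returns the whole string wrapped in an array.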

A string in JavaScript is already array-like.
You can access any character by index as you would with an array.
var s = "overpopulation";
alert(s[0]) // alerts o.
UPDATE
As is pointed out in the comments below, the above method for accessing a character in a string is part of ECMAScript 5 which certain browsers may not conform to.
An alternative method you can use is charAt(index).
var s = "overpopulation";
alert(s.charAt(0)) // alerts o.

To support emojis use this
('Dragon 🐉').split(/(?!$)/u);
=> ['D', 'r', 'a', 'g', 'o', 'n', ' ', '🐉']

You can use the regular expression /(?!$)/:
"overpopulation".split(/(?!$)/)
The negative look-ahead assertion (?!$) will match right in front of every character.
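A quick sketch of how this pattern behaves with and without the u flag, assuming an ES2015+ engine. Without the flag the split happens between UTF-16 code units, so a character outside the BMP is cut in half; with the flag the split advances by code point.

```javascript
const text = "Dragon \u{1F409}"; // "Dragon " followed by the dragon emoji

const byCodeUnit = text.split(/(?!$)/);   // splits inside the surrogate pair
const byCodePoint = text.split(/(?!$)/u); // keeps the surrogate pair together

console.log(byCodeUnit.length);  // 9
console.log(byCodePoint.length); // 8
console.log(byCodePoint[7] === "\u{1F409}"); // true
```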

Related

Trying to design a WORD SEARCH puzzle with Unicode Letters (TAMIL) Using HTML and JAVASCRIPT [duplicate]

Splitting a JavaScript string into "characters" can be done trivially but there are problems if you care about Unicode (and you should care about Unicode).
JavaScript natively treats characters as 16-bit entities (UCS-2 or UTF-16) but this does not allow for Unicode characters outside the BMP (Basic Multilingual Plane).
To deal with Unicode characters beyond the BMP, JavaScript must take into account "surrogate pairs", which it does not do natively.
I'm looking for how to split a js string by codepoint, whether the codepoints require one or two JavaScript "characters" (code units).
Depending on your needs, splitting by codepoint might not be enough, and you might want to split by "grapheme cluster", where a cluster is a base codepoint followed by all its non-spacing modifier codepoints, such as combining accents and diacritics.
For the purposes of this question I do not require splitting by grapheme cluster.
@bobince's answer has (luckily) become a bit dated; you can now simply use
var chars = Array.from( text )
to obtain a list of single-codepoint strings which does respect astral / 32bit / surrogate Unicode characters.
Along the lines of @John Frazer's answer, one can use this even more succinct form of string iteration:
const chars = [...text]
e.g., with:
const text = 'A\uD835\uDC68B\uD835\uDC69C\uD835\uDC6A'
const chars = [...text] // ["A", "𝑨", "B", "𝑩", "C", "𝑪"]
In ECMAScript 6 you'll be able to use a string as an iterator to get code points, or you could search a string for /./ug, or you could call codePointAt(i) repeatedly.
Unfortunately for..of syntax and regexp flags can't be polyfilled and calling a polyfilled codePointAt() would be super slow (O(n²)), so we can't realistically use this approach for a while yet.
So doing it the manual way:
String.prototype.toCodePoints= function() {
var chars = []; // declared locally to avoid leaking a global
for (var i= 0; i<this.length; i++) {
var c1= this.charCodeAt(i);
if (c1>=0xD800 && c1<0xDC00 && i+1<this.length) {
var c2= this.charCodeAt(i+1);
if (c2>=0xDC00 && c2<0xE000) {
chars.push(0x10000 + ((c1-0xD800)<<10) + (c2-0xDC00));
i++;
continue;
}
}
chars.push(c1);
}
return chars;
}
For the inverse to this see https://stackoverflow.com/a/3759300/18936
Another method using codePointAt:
String.prototype.toCodePoints = function () {
var arCP = [];
for (var i = 0; i < this.length; i += 1) {
var cP = this.codePointAt(i);
arCP.push(cP);
if (cP >= 0x10000) {
i += 1;
}
}
return arCP;
}
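In an ES2015+ environment the same result can be had without touching String.prototype. A minimal sketch, assuming a hypothetical standalone helper named toCodePoints:

```javascript
// Hypothetical standalone helper (does not modify String.prototype).
function toCodePoints(text) {
  // Array.from iterates the string by code point; the mapping callback
  // converts each single-code-point string to its numeric value.
  return Array.from(text, (ch) => ch.codePointAt(0));
}

const text = "A\uD835\uDC68B";
const points = toCodePoints(text);
console.log(points); // [65, 119912, 66]

// String.fromCodePoint is the inverse operation.
console.log(String.fromCodePoint(...points) === text); // true
```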

split string only on first instance of specified character

In my code I split a string based on _ and grab the second item in the array.
var element = $(this).attr('class');
var field = element.split('_')[1];
Takes good_luck and provides me with luck. Works great!
But, now I have a class that looks like good_luck_buddy. How do I get my javascript to ignore the second _ and give me luck_buddy?
I found this var field = element.split(new char [] {'_'}, 2); in a c# stackoverflow answer but it doesn't work. I tried it over at jsFiddle...
Use capturing parentheses:
'good_luck_buddy'.split(/_(.*)/s)
['good', 'luck_buddy', ''] // ignore the third element
They are defined as
If separator contains capturing parentheses, matched results are returned in the array.
So in this case we want to split at _.* (i.e. split separator being a sub string starting with _) but also let the result contain some part of our separator (i.e. everything after _).
In this example our separator (matching _(.*)) is _luck_buddy and the captured group (within the separator) is luck_buddy. Without the capturing parentheses, luck_buddy (matching .*) would not have been included in the result array, because a plain split does not include separators in the result.
We use the s regex flag to make . match on newline (\n) characters as well, otherwise it would only split to the first newline.
What do you need regular expressions and arrays for?
myString = myString.substring(myString.indexOf('_')+1)
var myString= "hello_there_how_are_you"
myString = myString.substring(myString.indexOf('_')+1)
console.log(myString)
I avoid RegExp at all costs. Here is another thing you can do:
"good_luck_buddy".split('_').slice(1).join('_')
With help of destructuring assignment it can be more readable:
let [first, ...rest] = "good_luck_buddy".split('_')
rest = rest.join('_')
A simple ES6 way to get both the first key and remaining parts in a string would be:
const [key, ...rest] = "good_luck_buddy".split('_')
const value = rest.join('_')
console.log(key, value) // good, luck_buddy
Nowadays String.prototype.split does indeed allow you to limit the number of splits.
str.split([separator[, limit]])
...
limit Optional
A non-negative integer limiting the number of splits. If provided, splits the string at each occurrence of the specified separator, but stops when limit entries have been placed in the array. Any leftover text is not included in the array at all.
The array may contain fewer entries than limit if the end of the string is reached before the limit is reached.
If limit is 0, no splitting is performed.
caveat
It might not work the way you expect. I was hoping it would just ignore the rest of the delimiters, but instead, when it reaches the limit, it splits the remaining string again, omitting the part after the split from the return results.
let str = 'A_B_C_D_E'
const limit_2 = str.split('_', 2)
limit_2
(2) ["A", "B"]
const limit_3 = str.split('_', 3)
limit_3
(3) ["A", "B", "C"]
I was hoping for:
let str = 'A_B_C_D_E'
const limit_2 = str.split('_', 2)
limit_2
(2) ["A", "B_C_D_E"]
const limit_3 = str.split('_', 3)
limit_3
(3) ["A", "B", "C_D_E"]
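The behaviour hoped for here (keep the leftover text in the last entry) is not what split's limit does, but it is easy to build. A minimal sketch, assuming a hypothetical helper named splitKeepRest, a plain string separator, and a limit of at least 1:

```javascript
// Hypothetical helper: like split(sep, limit), but the last entry keeps
// the unsplit remainder instead of discarding it.
function splitKeepRest(str, sep, limit) {
  const parts = str.split(sep);
  if (parts.length <= limit) {
    return parts; // fewer pieces than the limit: nothing to merge
  }
  const head = parts.slice(0, limit - 1);
  const rest = parts.slice(limit - 1).join(sep); // glue the remainder back together
  return [...head, rest];
}

console.log(splitKeepRest("A_B_C_D_E", "_", 2)); // ["A", "B_C_D_E"]
console.log(splitKeepRest("A_B_C_D_E", "_", 3)); // ["A", "B", "C_D_E"]
console.log(splitKeepRest("good_luck_buddy", "_", 2)[1]); // "luck_buddy"
```

Rejoining with the same separator only round-trips cleanly for string separators, which is why the sketch does not accept a RegExp.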
This solution worked for me
var str = "good_luck_buddy";
var index = str.indexOf('_');
var arr = [str.slice(0, index), str.slice(index + 1)];
//arr[0] = "good"
//arr[1] = "luck_buddy"
OR
var str = "good_luck_buddy";
var index = str.indexOf('_');
var [first, second] = [str.slice(0, index), str.slice(index + 1)];
//first = "good"
//second = "luck_buddy"
You can use the regular expression like:
var arr = element.split(/_(.*)/)
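You can use the second parameter, which specifies the limit of the split. Note that it only truncates the result, so it helps when you want the part before the first _:
i.e:
var first = element.split('_', 1)[0]; // index [1] would be undefined here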
Replace the first instance with a unique placeholder then split from there.
"good_luck_buddy".replace(/\_/,'&').split('&')
["good","luck_buddy"]
This is more useful when both sides of the split are needed.
I needed both parts of the string, so a regex lookbehind helped me with this.
const full_name = 'Maria do Bairro';
const [first_name, last_name] = full_name.split(/(?<=^[^ ]+) /);
console.log(first_name);
console.log(last_name);
Non-regex solution
I ran some benchmarks, and this solution won hugely:
str.slice(str.indexOf(delim) + delim.length)
// as function
function gobbleStart(str, delim) {
return str.slice(str.indexOf(delim) + delim.length);
}
// as polyfill
String.prototype.gobbleStart = function(delim) {
return this.slice(this.indexOf(delim) + delim.length);
};
Performance comparison with other solutions
The only close contender was the same line of code, except using substr instead of slice.
Other solutions I tried involving split or RegExps took a big performance hit and were about 2 orders of magnitude slower. Using join on the results of split, of course, adds an additional performance penalty.
Why are they slower? Any time a new object or array has to be created, the JS engine has to allocate memory for it (and later garbage-collect it). This is slow compared with slicing a string.
Here are some general guidelines, in case you are chasing benchmarks:
New dynamic memory allocations for objects {} or arrays [] (like the one that split creates) will cost a lot in performance.
RegExp searches are more complicated and therefore slower than string searches.
If you already have an array, destructuring arrays is about as fast as explicitly indexing them, and looks awesome.
Removing beyond the first instance
Here's a solution that will slice up to and including the nth instance. It's not quite as fast, but on the OP's question, gobble(element, '_', 1) is still >2x faster than a RegExp or split solution and can do more:
/*
`gobble`, given a positive, non-zero `limit`, deletes
characters from the beginning of `haystack` until `needle` has
been encountered and deleted `limit` times or no more instances
of `needle` exist; then it returns what remains. If `limit` is
zero or negative, delete from the beginning only until `-(limit)`
occurrences or less of `needle` remain.
*/
function gobble(haystack, needle, limit = 0) {
let remain = limit;
if (limit <= 0) { // set remain to count of delim - num to leave
let i = 0;
while (i < haystack.length) {
const found = haystack.indexOf(needle, i);
if (found === -1) {
break;
}
remain++;
i = found + needle.length;
}
}
let i = 0;
while (remain > 0) {
const found = haystack.indexOf(needle, i);
if (found === -1) {
break;
}
remain--;
i = found + needle.length;
}
return haystack.slice(i);
}
With the above definition, gobble('path/to/file.txt', '/') would give the name of the file, and gobble('prefix_category_item', '_', 1) would remove the prefix like the first solution in this answer.
Tests were run in Chrome 70.0.3538.110 on macOSX 10.14.
Use the string replace() method with a regex:
var result = "good_luck_buddy".replace(/.*?_/, "");
console.log(result);
This regex matches 0 or more characters before the first _, and the _ itself. The match is then replaced by an empty string.
JavaScript's String.split unfortunately has no way of limiting the actual number of splits. It has a second argument that specifies how many of the actual split items are returned, which isn't useful in your case. The solution would be to split the string, shift the first item off, then rejoin the remaining items:
var element = $(this).attr('class');
var parts = element.split('_');
parts.shift(); // removes the first item from the array
var field = parts.join('_');
Here's one RegExp that does the trick.
'good_luck_buddy'.split(/^.*?_/)[1]
First it forces the match to start from the start with the '^'. Then '.*?' matches as few characters as possible up to a '_', in other words all characters before the first '_'. The '?' means that only the minimal number of characters needed to make the whole pattern match are matched by the '.*?', because it is followed by '_', which is then included in the match as its last character.
Therefore this split() uses such a matching part as its 'splitter' and removes it from the results. So it removes everything up to and including the first '_' and gives you the rest as the 2nd element of the result. The first element is "" representing the part before the matched part. It is "" because the match starts from the beginning.
There are other RegExps that work as well, like /_(.*)/ given by Chandu in a previous answer.
The /^.*?_/ has the benefit that you can understand what it does without having to know about the special role capturing groups play with split().
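A short sketch of what this pattern returns, including the case where the string has no underscore at all (variable names are illustrative):

```javascript
const parts = "good_luck_buddy".split(/^.*?_/);
console.log(parts);    // ["", "luck_buddy"]
console.log(parts[1]); // "luck_buddy"

// With no "_" the pattern never matches, so the array has a single entry
// and index 1 is undefined. Callers may want to guard against that.
const noMatch = "goodluck".split(/^.*?_/);
console.log(noMatch);    // ["goodluck"]
console.log(noMatch[1]); // undefined
```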
if you are looking for a more modern way of doing this:
let raw = "good_luck_buddy"
raw.split("_")
.filter((part, index) => index !== 0)
.join("_")
Mark F's solution is awesome but it's not supported by old browsers. Kennebec's solution is awesome and supported by old browsers but doesn't support regex.
So, if you're looking for a solution that splits your string only once, that is supported by old browsers and supports regex, here's my solution:
String.prototype.splitOnce = function(regex)
{
var match = this.match(regex);
if(match)
{
var match_i = this.indexOf(match[0]);
return [this.substring(0, match_i),
this.substring(match_i + match[0].length)];
}
else
{ return [this, ""]; }
}
var str = "something/////another thing///again";
alert(str.splitOnce(/\/+/)[1]);
For beginners like me who are not used to regular expressions, this workaround worked:
var field = "Good_Luck_Buddy";
var newString = field.slice( field.indexOf("_")+1 );
slice() method extracts a part of a string and returns a new string and indexOf() method returns the position of the first found occurrence of a specified value in a string.
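One detail worth knowing about the slice/indexOf combination: when the delimiter is missing, indexOf returns -1, so slice(-1 + 1) returns the whole string. A minimal sketch that makes that case explicit, using a hypothetical helper named afterFirst which returns an empty string when the delimiter is absent (that choice is an assumption, not something the answer specifies):

```javascript
// Hypothetical helper: text after the first occurrence of `delim`,
// or "" when `delim` does not occur.
function afterFirst(str, delim) {
  const index = str.indexOf(delim);
  return index === -1 ? "" : str.slice(index + delim.length);
}

console.log(afterFirst("Good_Luck_Buddy", "_")); // "Luck_Buddy"
console.log(afterFirst("Good", "_"));            // ""

// The unguarded one-liner silently returns the whole string instead:
console.log("Good".slice("Good".indexOf("_") + 1)); // "Good"
```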
This should be quite fast
function splitOnFirst (str, sep) {
const index = str.indexOf(sep);
return index < 0 ? [str] : [str.slice(0, index), str.slice(index + sep.length)];
}
console.log(splitOnFirst('good_luck', '_')[1])
console.log(splitOnFirst('good_luck_buddy', '_')[1])
This worked for me on Chrome + FF:
"foo=bar=beer".split(/^[^=]+=/)[1] // "bar=beer"
"foo==".split(/^[^=]+=/)[1] // "="
"foo=".split(/^[^=]+=/)[1] // ""
"foo".split(/^[^=]+=/)[1] // undefined
If you also need the key try this:
"foo=bar=beer".split(/^([^=]+)=/) // Array [ "", "foo", "bar=beer" ]
"foo==".split(/^([^=]+)=/) // [ "", "foo", "=" ]
"foo=".split(/^([^=]+)=/) // [ "", "foo", "" ]
"foo".split(/^([^=]+)=/) // [ "foo" ]
//[0] = ignored (holds the string when there's no =, empty otherwise)
//[1] = hold the key (if any)
//[2] = hold the value (if any)
A simple ES6 one-statement solution to get the first key and the remaining parts:
let raw = 'good_luck_buddy'
raw.split('_')
.reduce((p, c, i) => i === 0 ? [c] : [p[0], [...p.slice(1), c].join('_')], [])
You could also use non-greedy match, it's just a single, simple line:
a = "good_luck_buddy"
const [,g,b] = a.match(/(.*?)_(.*)/)
console.log(g,"and also",b)

Javascript: How to remove characters from end of string? [duplicate]

I have a string, 12345.00, and I would like it to return 12345.0.
I have looked at trim, but it looks like it is only trimming whitespace and slice which I don't see how this would work. Any suggestions?
You can use the substring function:
let str = "12345.00";
str = str.substring(0, str.length - 1);
console.log(str);
This is the accepted answer, but as per the conversations below, the slice syntax is much clearer:
let str = "12345.00";
str = str.slice(0, -1);
console.log(str);
You can use slice! You just have to make sure you know how to use it. Positive #s are relative to the beginning, negative numbers are relative to the end.
js>"12345.00".slice(0,-1)
12345.0
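A few more slice calls to illustrate how positive and negative positions combine; the contrast with substring, which treats negative arguments as 0, is included as a sketch:

```javascript
const str = "12345.00";

console.log(str.slice(0, -1)); // "12345.0" (drop the last character)
console.log(str.slice(-2));    // "00"      (keep only the last two)
console.log(str.slice(2, -3)); // "345"     (from index 2 up to 3 before the end)

// substring does not understand negative positions: -1 is treated as 0.
console.log(str.substring(0, -1)); // ""
```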
You can use the substring method of JavaScript string objects:
s = s.substring(0, s.length - 4)
It unconditionally removes the last four characters from string s.
However, if you want to conditionally remove the last four characters, only if they are exactly _bar:
var re = /_bar$/;
s = s.replace(re, ""); // replace returns a new string, so assign the result
The easiest method is to use the slice method of the string, which allows negative positions (corresponding to offsets from the end of the string):
const s = "your string";
const withoutLastFourChars = s.slice(0, -4);
If you needed something more general to remove everything after (and including) the last underscore, you could do the following (so long as s is guaranteed to contain at least one underscore):
const s = "your_string";
const withoutLastChunk = s.slice(0, s.lastIndexOf("_"));
console.log(withoutLastChunk);
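If the underscore is not guaranteed, lastIndexOf returns -1 and slice(0, -1) quietly drops the last character instead. A minimal guarded sketch, assuming a hypothetical helper named dropLastChunk that returns the input unchanged when the delimiter is absent:

```javascript
// Hypothetical helper: removes everything from the last `delim` onward.
function dropLastChunk(s, delim) {
  const index = s.lastIndexOf(delim);
  return index === -1 ? s : s.slice(0, index);
}

console.log(dropLastChunk("your_string", "_")); // "your"
console.log(dropLastChunk("a_b_c", "_"));       // "a_b"
console.log(dropLastChunk("plain", "_"));       // "plain"

// The unguarded version on a string without "_":
console.log("plain".slice(0, "plain".lastIndexOf("_"))); // "plai"
```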
For a number like your example, I would recommend doing this over substring:
console.log(parseFloat('12345.00').toFixed(1));
Do note that this will actually round the number, though, which I would imagine is desired but maybe not:
console.log(parseFloat('12345.46').toFixed(1));
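Two things are easy to miss with toFixed: it returns a string, and converting that string back to a number drops the trailing zero again. A short sketch:

```javascript
const raw = "12345.00";
const fixed = parseFloat(raw).toFixed(1);

console.log(fixed);        // "12345.0"
console.log(typeof fixed); // "string"

// Rounding, not truncation:
console.log(parseFloat("12345.46").toFixed(1)); // "12345.5"

// Back to a number: the trailing zero is not preserved.
console.log(Number(fixed)); // 12345
```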
Be aware that String.prototype.{ split, slice, substr, substring } operate on UTF-16 encoded strings
None of the previous answers are Unicode-aware.
Strings are encoded as UTF-16 in most modern JavaScript engines, but higher Unicode code points require surrogate pairs, so older, pre-existing string methods operate on UTF-16 code units, not Unicode code points.
See: Do NOT use .split('').
const string = "ẞ🦊";
console.log(string.slice(0, -1)); // "ẞ\ud83e"
console.log(string.substr(0, string.length - 1)); // "ẞ\ud83e"
console.log(string.substring(0, string.length - 1)); // "ẞ\ud83e"
console.log(string.replace(/.$/, "")); // "ẞ\ud83e"
console.log(string.match(/(.*).$/)[1]); // "ẞ\ud83e"
const utf16Chars = string.split("");
utf16Chars.pop();
console.log(utf16Chars.join("")); // "ẞ\ud83e"
In addition, RegExp methods, as suggested in older answers, don’t match line breaks at the end:
const string = "Hello, world!\n";
console.log(string.replace(/.$/, "").endsWith("\n")); // true
console.log(string.match(/(.*).$/) === null); // true
Use the string iterator to iterate characters
Unicode-aware code utilizes the string’s iterator; see Array.from and ... spread.
string[Symbol.iterator] can be used (e.g. instead of string) as well.
Also see How to split Unicode string to characters in JavaScript.
Examples:
const string = "ẞ🦊";
console.log(Array.from(string).slice(0, -1).join("")); // "ẞ"
console.log([
...string
].slice(0, -1).join("")); // "ẞ"
Use the s and u flags on a RegExp
The dotAll or s flag makes . match line break characters, the unicode or u flag enables certain Unicode-related features.
Note that, when using the u flag, you eliminate unnecessary identity escapes, as these are invalid in a u regex, e.g. \[ is fine, as it would start a character class without the backslash, but \: isn’t, as it’s a : with or without the backslash, so you need to remove the backslash.
Examples:
const unicodeString = "ẞ🦊",
lineBreakString = "Hello, world!\n";
console.log(lineBreakString.replace(/.$/s, "").endsWith("\n")); // false
console.log(lineBreakString.match(/(.*).$/s) === null); // false
console.log(unicodeString.replace(/.$/su, "")); // ẞ
console.log(unicodeString.match(/(.*).$/su)[1]); // ẞ
// Now `split` can be made Unicode-aware:
const unicodeCharacterArray = unicodeString.split(/(?:)/su),
lineBreakCharacterArray = lineBreakString.split(/(?:)/su);
unicodeCharacterArray.pop();
lineBreakCharacterArray.pop();
console.log(unicodeCharacterArray.join("")); // "ẞ"
console.log(lineBreakCharacterArray.join("").endsWith("\n")); // false
Note that some graphemes consist of more than one code point, e.g. 🏳️‍🌈 which consists of the sequence 🏳 (U+1F3F3), VS16 (U+FE0F), ZWJ (U+200D), 🌈 (U+1F308).
Here, even Array.from will split this into four “characters”.
Matching those is made easier with the RegExp set notation and properties of strings proposal.
Using JavaScript's slice function:
let string = 'foo_bar';
string = string.slice(0, -4); // Slice off last four characters here
console.log(string);
This could be used to remove '_bar' at end of a string, of any length.
A regular expression is what you are looking for:
let str = "foo_bar";
console.log(str.replace(/_bar$/, ""));
Try this:
const myString = "Hello World!";
console.log(myString.slice(0, -1));
Performance
On 2020.05.13 I ran tests of the chosen solutions on Chrome v81.0, Safari v13.1 and Firefox v76.0 on macOS High Sierra v10.13.6.
Conclusions
slice(0,-1) (D) is fast or the fastest solution for both short and long strings, and it is recommended as a fast cross-browser solution
solutions based on substring (C) and substr (E) are fast
solutions based on regular expressions (A, B) are slow to medium fast
solutions B, F and G are slow for long strings
solution F is the slowest for short strings, G is the slowest for long strings
Details
I ran two tests for solutions A, B, C, D, E(ext), F, G(my)
for an 8-char short string (from the OP's question) - you can run it HERE
for a 1M-char long string - you can run it HERE
The solutions are presented in the snippet below
function A(str) {
return str.replace(/.$/, '');
}
function B(str) {
return str.match(/(.*).$/)[1];
}
function C(str) {
return str.substring(0, str.length - 1);
}
function D(str) {
return str.slice(0, -1);
}
function E(str) {
return str.substr(0, str.length - 1);
}
function F(str) {
let s= str.split("");
s.pop();
return s.join("");
}
function G(str) {
let s='';
for(let i=0; i<str.length-1; i++) s+=str[i];
return s;
}
// ---------
// TEST
// ---------
let log = (f)=>console.log(`${f.name}: ${f("12345.00")}`);
[A,B,C,D,E,F,G].map(f=>log(f));
This snippet only presents the solutions
Here are example results for Chrome for short string
Use regex:
let aStr = "12345.00";
aStr = aStr.replace(/.$/, '');
console.log(aStr);
How about:
let myString = "12345.00";
console.log(myString.substring(0, myString.length - 1));
1. (.*), captures any character multiple times:
console.log("a string".match(/(.*).$/)[1]);
2. ., matches last character, in this case:
console.log("a string".match(/(.*).$/));
3. $, matches the end of the string:
console.log("a string".match(/(.*).{2}$/)[1]);
https://stackoverflow.com/questions/34817546/javascript-how-to-delete-last-two-characters-in-a-string
Just use trim if you don't want spaces
"11.01 °C".slice(0,-2).trim()
Here is an alternative that i don't think i've seen in the other answers, just for fun.
var strArr = "hello i'm a string".split("");
strArr.pop();
document.write(strArr.join(""));
Not as legible or simple as slice or substring but does allow you to play with the string using some nice array methods, so worth knowing.
debris = string.split("_") //explode string into array of strings indexed by "_"
debris.pop(); //pop last element off the array (which you didn't want)
result = debris.join("_"); //fuse the remaining items together like the sun
If you want to do generic rounding of floats, instead of just trimming the last character:
var float1 = 12345.00,
float2 = 12345.4567,
float3 = 12345.982;
var MoreMath = {
/**
* Rounds a value to the specified number of decimals
* @param float value The value to be rounded
* @param int nrDecimals The number of decimals to round value to
* @return float value rounded to nrDecimals decimals
*/
round: function (value, nrDecimals) {
// Scale by 10^nrDecimals, so that e.g. 2 decimals means a factor of 100
var x = nrDecimals > 0 ? Math.pow(10, parseInt(nrDecimals, 10)) : 1;
return Math.round(value * x) / x;
}
}
MoreMath.round(float1, 1) => 12345.0
MoreMath.round(float2, 1) => 12345.5
MoreMath.round(float3, 1) => 12346.0
EDIT: Seems like there exists a built in function for this, as Paolo points out. That solution is obviously much cleaner than mine. Use parseFloat followed by toFixed
if(str.substring(str.length - 4) == "_bar")
{
str = str.substring(0, str.length - 4);
}
Via slice(indexStart, indexEnd) method - note, this does NOT CHANGE the existing string, it creates a copy and changes the copy.
console.clear();
let str = "12345.00";
let a = str.slice(0, str.length -1)
console.log(a, "<= a");
console.log(str, "<= str is NOT changed");
Via Regular Expression method - note, this does NOT CHANGE the existing string, it creates a copy and changes the copy.
console.clear();
let regExp = /.$/g
let b = str.replace(regExp,"")
console.log(b, "<= b");
console.log(str, "<= str is NOT changed");
Via array.splice() method -> this only works on arrays, and it CHANGES, the existing array (so careful with this one), you'll need to convert a string to an array first, then back.
console.clear();
let str = "12345.00";
let strToArray = str.split("")
console.log(strToArray, "<= strToArray");
let spliceMethod = strToArray.splice(str.length-1, 1)
str = strToArray.join("")
console.log(str, "<= str is changed now");
In cases where you want to remove something that is close to the end of a string (in case of variable sized strings) you can combine slice() and substr().
I had a string with markup, dynamically built, with a list of anchor tags separated by comma. The string was something like:
var str = "<a>text 1,</a><a>text 2,</a><a>text 2.3,</a><a>text abc,</a>";
To remove the last comma I did the following:
str = str.slice(0, -5) + str.substr(-4);
You can, in fact, remove the last arr.length - 2 items of an array using arr.length = 2, which if the array length was 5, would remove the last 3 items.
Sadly, this does not work for strings, but we can use split() to split the string, and then join() to join the string after we've made any modifications.
var str = 'string'
String.prototype.removeLast = function(n) {
var string = this.split('')
string.length = string.length - n
return string.join('')
}
console.log(str.removeLast(3))
Try to use toFixed
const str = "12345.00";
console.log((+str).toFixed(1)); // "12345.0"
Try this:
<script>
var x="foo_foo_foo_bar";
for (var i=0; i<x.length; i++) {
if (x[i]=="_" && x[i+1]=="b") {
break;
}
else {
document.write(x[i]);
}
}
</script>
You can also try the live working example on http://jsfiddle.net/informativejavascript/F7WTn/87/.
#Jason S:
You can use slice! You just have to
make sure you know how to use it.
Positive #s are relative to the
beginning, negative numbers are
relative to the end.
js>"12345.00".slice(0,-1)
12345.0
Sorry for being long-winded, but this post was tagged 'jquery' earlier. On a jQuery object, slice() is jQuery's own method for selecting a subset of the matched DOM elements, not for taking substrings.
In other words, @Jon Erickson's answer is the right solution.
However, your method works fine on a plain JavaScript string, outside of a jQuery object.
Following the recent discussion in the comments, it is worth adding that jQuery is updated far more often than ECMAScript, the standard underneath it.
There are also two other methods:
string.substring(from, to): if 'to' is omitted, it returns the rest of the string, so you can write string.substring(from). Note that substring() treats negative arguments as 0.
string.substr(start, length): returns 'length' characters beginning at 'start'; 'length' can only be positive.
Some sources also report that string.substr(start, length) does not work, or works incorrectly, in MSIE.
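To make the difference in how these methods treat negative indices concrete, here is a small sketch on a plain string. The variable names are illustrative only, and substr() is left out because it is considered a legacy method:

```javascript
const s = "12345.00";

const results = {
  // slice: a negative index counts back from the end of the string.
  sliceDropLast: s.slice(0, -1),         // "12345.0"
  sliceLastTwo: s.slice(-2),             // "00"
  // substring: a negative argument is treated as 0.
  substringNegative: s.substring(0, -1), // ""
  // substring with 'to' omitted returns the rest of the string.
  substringRest: s.substring(6),         // "00"
};

console.log(results);
```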
Use substring to get everything to the left of _bar. But first you have to get the index of _bar in the string (for example with indexOf):
str.substring(3, 7);
3 is the start index and 7 is the end index (exclusive), not a length.
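Putting the two steps together (find the index, then take the substring) might look like the sketch below. The helper name textBefore is hypothetical, and returning the whole string when the marker is missing is an assumption about the desired behaviour:

```javascript
// Hypothetical helper: returns the part of `str` before the first `marker`.
function textBefore(str, marker) {
  const end = str.indexOf(marker);
  // indexOf returns -1 when the marker is missing; return the whole string then.
  return end === -1 ? str : str.substring(0, end);
}

console.log(textBefore("foo_foo_foo_bar", "_bar")); // "foo_foo_foo"
```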

Regex using javascript to return just numbers

If I have a string like "something12" or "something102", how would I use a regex in javascript to return just the number parts?
Regular expressions:
var numberPattern = /\d+/g;
'something102asdfkj1948948'.match( numberPattern )
This would return an Array with two elements inside, '102' and '1948948'. Operate as you wish. If it doesn't match any it will return null.
To concatenate them:
'something102asdfkj1948948'.match( numberPattern ).join('')
Assuming you're not dealing with complex decimals, this should suffice I suppose.
You could also strip all the non-digit characters (\D or [^0-9]):
let word_With_Numbers = 'abc123c def4567hij89'
let numbers_Only = word_With_Numbers.replace(/\D/g, '');
console.log(numbers_Only) // "123456789"
For number with decimal fraction and minus sign, I use this snippet:
const NUMERIC_REGEXP = /[-]{0,1}[\d]*[.]{0,1}[\d]+/g;
const numbers = '2.2px 3.1px 4px -7.6px obj.key'.match(NUMERIC_REGEXP)
console.log(numbers); // ["2.2", "3.1", "4", "-7.6"]
Update (7/9/2018):
Found a tool that lets you edit regular expressions visually: JavaScript Regular Expression Parser & Visualizer.
Update:
Here's another one, with which you can even debug a regexp: Online regex tester and debugger.
Update:
Another one: RegExr.
Update:
Regexper and Regex Pal.
If you want only digits:
var value = '675-805-714';
var numberPattern = /\d+/g;
value = value.match( numberPattern ).join('');
alert(value);
//Show: 675805714
Now you get the digits joined
I guess you want to get number(s) from the string. In which case, you can use the following:
// Returns an array of the digit sequences (as strings) found in the string, or null if there are none
function get_numbers(input) {
return input.match(/[0-9]+/g);
}
var first_test = get_numbers('something102');
var second_test = get_numbers('something102or12');
var third_test = get_numbers('no numbers here!');
alert(first_test); // 102 (the array is ["102"])
alert(second_test); // 102,12 (the array is ["102", "12"])
alert(third_test); // null
IMO the #3 answer at this time by Chen Dachao is the right way to go if you want to capture any kind of number, but the regular expression can be shortened from:
/[-]{0,1}[\d]*[\.]{0,1}[\d]+/g
to:
/-?\d*\.?\d+/g
For example, this code:
"lin-grad.ient(217deg,rgba(255, 0, 0, -0.8), rgba(-255,0,0,0) 70.71%)".match(/-?\d*\.?\d+/g)
generates this array:
["217","255","0","0","-0.8","-255","0","0","0","70.71"]
I've butchered an MDN linear gradient example so that it fully tests the regexp and doesn't need to scroll here. I think I've included all the possibilities in terms of negative numbers, decimals, unit suffixes like deg and %, inconsistent comma and space usage, and the extra dot/period and hyphen/dash characters within the text "lin-grad.ient". Please let me know if I'm missing something. The only thing I can see that it does not handle is a badly formed decimal number like "0..8".
If you really want an array of numbers, you can convert the entire array in the same line of code:
array = whatever.match(/-?\d*\.?\d+/g).map(Number);
My particular code, which is parsing CSS functions, doesn't need to worry about the non-numeric use of the dot/period character, so the regular expression can be even simpler:
/-?[\d\.]+/g
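One caveat with the .map(Number) one-liner above: match() returns null when nothing matches, so calling .map on the result would throw. A minimal sketch that guards against that follows; the function name extractNumbers is hypothetical, and returning an empty array for no matches is an assumption:

```javascript
// Hypothetical wrapper around the shortened regex from this answer.
function extractNumbers(input) {
  // match() returns null when there are no matches; fall back to [].
  const matches = input.match(/-?\d*\.?\d+/g) || [];
  return matches.map(Number);
}

console.log(extractNumbers("2.2px 3.1px 4px -7.6px obj.key")); // [2.2, 3.1, 4, -7.6]
console.log(extractNumbers("no numbers here!"));               // []
```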
var result = input.match(/\d+/g).join('')
Using split and regex :
var str = "fooBar0123".split(/(\d+)/);
console.log(str[0]); // fooBar
console.log(str[1]); // 0123
The answers given don't actually match your question, which implied a trailing number. Also, remember that you're getting a string back; if you actually need a number, cast the result:
item=item.replace(/^.*\D(\d*)$/, '$1');
if (!/^\d+$/.test(item)) throw 'parse error: number not found';
item=Number(item);
If you're dealing with numeric item ids on a web page, your code could also usefully accept an Element, extracting the number from its id (or its first parent with an id); if you've an Event handy, you can likely get the Element from that, too.
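For the trailing-number case specifically, a shorter variant is to anchor the digits to the end of the string and read the capture group. This is a sketch only; the helper name trailingNumber is hypothetical, and it returns null instead of throwing when no trailing number is found:

```javascript
// Hypothetical helper: returns the number at the end of `item`, or null.
function trailingNumber(item) {
  // (\d+)$ only matches a run of digits that ends the string.
  const match = /(\d+)$/.exec(item);
  return match ? Number(match[1]) : null;
}

console.log(trailingNumber("something12"));  // 12
console.log(trailingNumber("something102")); // 102
console.log(trailingNumber("12something"));  // null
```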
As per @Syntle's answer, if the string contains only non-numeric characters you'll get an Uncaught TypeError: Cannot read property 'join' of null.
This will prevent errors if no matches are found and return an empty string:
('something'.match( /\d+/g )||[]).join('')
Here is a solution that converts the string to a valid plain or decimal number using Regex:
//something123.777.321something to 123.777321
const str = 'something123.777.321something';
//strip everything except digits and dots (note the g flag)
let initialValue = str.replace(/[^0-9.]+/g, '');
//initialValue = '123.777.321'
const splitValue = initialValue.split('.');
//splitValue = ['123','777','321']
if (splitValue.length > 2) {
//more than one dot: keep the first dot and drop the rest
initialValue = splitValue.shift() + '.' + splitValue.join('');
//result i.e. initialValue = '123.777321'
}
If you also want to match numbers with a decimal point, then (note that these patterns do not handle commas, and that they can match an empty string):
\d*\.?\d*
or
[0-9]*\.?[0-9]*
You can use https://regex101.com/ to test your regexes.
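Because every part of \d*\.?\d* is optional, it also matches the empty string. A slightly stricter pattern that requires at least one digit is sketched below; the pattern strictDecimal is a suggestion, not part of the original answer, and it does not handle signs or thousands separators:

```javascript
// The pattern from the answer: every part is optional, so it can match "".
const loosePattern = /\d*\.?\d*/;
const looseMatch = "abc".match(loosePattern)[0];
console.log(JSON.stringify(looseMatch)); // ""

// Suggested stricter pattern (an assumption, not from the original answer):
// one or more digits, optionally followed by a dot and more digits.
const strictDecimal = /\d+(?:\.\d+)?/g;
const found = "price 12.50, qty 3".match(strictDecimal);
console.log(found); // ["12.50", "3"]
```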
Everything that other solutions have, but with a little validation
// value = '675-805-714'
const validateNumberInput = (value) => {
let numberPattern = /\d+/g
let numbers = value.match(numberPattern)
if (numbers === null) {
return 0
}
return parseInt(numbers.join(''), 10)
}
// 675805714
One liner
If you do not care about decimal numbers and only need the digits, I think this one-liner is rather elegant:
/**
* @param {String} str
* @returns {String} - All digits from the given `str`
*/
const getDigitsInString = (str) => str.replace(/[^\d]*/g, '');
console.log([
'?,!_:/42\`"^',
'A 0 B 1 C 2 D 3 E',
' 4 twenty 20 ',
'1413/12/11',
'16:20:42:01'
].map((str) => getDigitsInString(str)));
Simple explanation:
\d matches any digit from 0 to 9
[^n] matches anything that is not n
* matches 0 times or more the predecessor
( It is an attempt to match a whole block of non-digits all at once )
g at the end, indicates that the regex is global to the entire string and that we will not stop at the first occurrence but match every occurrence within it
Together those rules match anything but digits, which we replace with an empty string, resulting in a string containing digits only.
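As a side note, the same rules can be written a little more compactly: \D is shorthand for [^\d], and + (one or more) avoids matching empty runs between digits. The sketch below uses a hypothetical name, getDigits, and should return the same strings as the one-liner above:

```javascript
// Hypothetical compact variant: \D is shorthand for [^\d],
// and + requires at least one non-digit per match.
const getDigits = (str) => str.replace(/\D+/g, '');

console.log(getDigits('16:20:42:01'));       // "16204201"
console.log(getDigits('1413/12/11'));        // "14131211"
console.log(getDigits('A 0 B 1 C 2 D 3 E')); // "0123"
```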
