How does String.length work in JavaScript?

I want to know how the length of a string is calculated in JS.
Is it a function call or a class data member?
I want to know what happens when we execute the following code:
a = 'this is a string';
console.log(a.length); // what actually happens at this point?
Also, if I do this:
a += ' added something';
console.log(a.length); // at what point is the new length calculated
//and/or updated for the object 'a';
And lastly, do I need to store the string length in a temporary variable when looping over the string, or can I use the following directly (which one is faster or more processor efficient)?
for(var i=0;i<a.length;i++){
// doing anything here
}
Summing up my question, I want to know the processing behind String.length and which practice is better while looping over strings?

A string is immutable in JavaScript.
a += "somestring" doesn't change the length of a string but makes a new string.
This means there is no "new length", but the length is just part of the definition of the string (more precisely it is stored in the same structure in implementations).
Regarding
for(i=0;i<a.length;i++){ // did you forget the 'var' keyword ?
A not-so-uncommon practice (if you don't change a) was to optimize it as
for (var i=0, l=a.length; i<l; i++)
in order to avoid reading the length on every iteration, but if you compare the performance on modern engines, you'll see this doesn't make the code any faster now.
What you must remember : querying the length of a string is fast because there is no computation. What's a little less fast is building strings (for example with concatenation).
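The points above can be checked with a short sketch. The variable names (before, direct, cached) are only illustrative, and nothing here depends on a particular engine:

```javascript
// length is read like a data property of the string value, not called as a method.
var a = 'this is a string';
var before = a;                 // second reference to the original value
console.log(a.length);          // 16

// += builds a new string and rebinds `a`; the original value is untouched.
a += ' added something';
console.log(before.length);     // 16
console.log(a.length);          // 32

// Both loop styles visit the same indices; caching the length is optional.
var direct = 0, cached = 0;
for (var i = 0; i < a.length; i++) { direct++; }
for (var j = 0, l = a.length; j < l; j++) { cached++; }
console.log(direct === cached); // true
```

Which loop form is faster is something to measure on the engine you care about; the sketch only shows that they are equivalent in behaviour.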

Strings are a primitive type. At least that's what the documentation says. But we can access the length of a string as if we were accessing a property of an object (with dot notation), which suggests it's an object, right?
It turns out that whenever we access a property on a string primitive using dot notation (length, for example), the JS engine conceptually takes this primitive string and wraps it in an equivalent wrapper object, which is a String object. The .length on that String object then returns the length.
The interesting thing to note here is that our string stays the same primitive string during all of this; a temporary object is created to make the property access work. Once the required property is fetched, this temporary object is discarded.
Hope this gives some high level understanding.
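As a rough illustration of the difference between the primitive and its wrapper, here is a small sketch. The names prim and wrapped are made up for the example, and the explicit new String(...) only stands in for the temporary wrapper, which normal code never sees:

```javascript
var prim = 'hello';
var wrapped = new String('hello');   // explicit wrapper object, for comparison only

console.log(typeof prim);            // "string"
console.log(typeof wrapped);         // "object"

// Property access works on the primitive, and it is still a primitive afterwards.
console.log(prim.length);            // 5
console.log(typeof prim);            // "string"

// The wrapper is a different value from the primitive, but it wraps the same text.
console.log(prim === wrapped);            // false
console.log(prim === wrapped.valueOf());  // true
```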

I'm answering your first question.
I was also curious about this puzzle, so I did some searching myself and ended up finding the following.
Based on String documentation from Mozilla:
String literals (denoted by double or single quotes) and strings
returned from String calls in a non-constructor context (i.e., without
using the new keyword) are primitive strings. JavaScript automatically
converts primitives to String objects, so that it's possible to use
String object methods for primitive strings. In contexts where a
method is to be invoked on a primitive string or a property lookup
occurs, JavaScript will automatically wrap the string primitive and
call the method or perform the property lookup.
So as I understand it, when you use somestring.length, the primitive string is first wrapped as a String object, and since that object has a length property, it's just an internal lookup to access and return it.

Related

In what way are datatypes immutable? [duplicate]

If a string is immutable, does that mean that....
(let's assume JavaScript)
var str = 'foo';
alert(str.substr(1)); // oo
alert(str); // foo
Does it mean, when calling methods on a string, it will return the modified string, but it won't change the initial string?
If the string was mutable, does that mean the 2nd alert() would return oo as well?

It means that once you instantiate the object, you can't change its properties. In your first alert you aren't changing foo. You're creating a new string. This is why in your second alert it will show "foo" instead of oo.
Does it mean, when calling methods on a string, it will return the modified string, but it won't change the initial string?
Yes. Nothing can change the string once it is created. Now this doesn't mean that you can't assign a new string object to the str variable. You just can't change the current object that str references.
If the string was mutable, does that mean the 2nd alert() would return oo as well?
Technically, no, because the substring method returns a new string. Making an object mutable, wouldn't change the method. Making it mutable means that technically, you could make it so that substring would change the original string instead of creating a new one.

On a lower level, immutability means that the memory the string is stored in will not be modified. Once you create a string "foo", some memory is allocated to store the value "foo". This memory will not be altered. If you modify the string with, say, substr(1), a new string is created and a different part of memory is allocated which will store "oo". Now you have two strings in memory, "foo" and "oo". Even if you're not going to use "foo" anymore, it'll stick around until it's garbage collected.
This is one reason why string operations are comparatively expensive.

Immutable means that which cannot be changed or modified.
So when you assign a value to a string variable, that value is created from scratch rather than overwriting the old one. Every time a new value is assigned to the same variable, a new string is created, so in reality you are never changing the original value.

I'm not certain about JavaScript, but in Java, strings take an additional step to immutability, with the "String Constant Pool". Strings can be constructed with string literals ("foo") or with a String class constructor. Strings constructed with string literals are a part of the String Constant Pool, and the same string literal will always be the same memory address from the pool.
Example:
String lit1 = "foo";
String lit2 = "foo";
String cons = new String("foo");
System.out.println(lit1 == lit2); // true
System.out.println(lit1 == cons); // false
System.out.println(lit1.equals(cons)); // true
In the above, both lit1 and lit2 are constructed using the same string literal, so they're pointing at the same memory address; lit1 == lit2 results in true, because they are exactly the same object.
However, cons is constructed using the class constructor. Although the parameter is the same string constant, the constructor allocates new memory for cons, meaning cons is not the same object as lit1 and lit2, despite containing the same data.
Of course, since the three strings all contain the same character data, using the equals method will return true.
(Both types of string construction are immutable, of course)

The text-book definition of mutability is liable or subject to change or alteration.
In programming, we use the word to mean objects whose state is allowed to change over time. An immutable value is the exact opposite – after it has been created, it can never change.
If this seems strange, allow me to remind you that many of the values we use all the time are in fact immutable.
var statement = "I am an immutable value";
var otherStr = statement.slice(8, 17);
I think no one will be surprised to learn that the second line in no way changes the string in statement.
In fact, no string methods change the string they operate on, they all return new strings. The reason is that strings are immutable – they cannot change, we can only ever make new strings.
Strings are not the only immutable values built into JavaScript. Numbers are immutable too. Can you even imagine an environment where evaluating the expression 2 + 3 changes the meaning of the number 2? It sounds absurd, yet we do this with our objects and arrays all the time.
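To make the contrast concrete, here is a small sketch. The names list and alias are introduced only for this example:

```javascript
// Strings: slice returns a new string and leaves the original alone.
var statement = 'I am an immutable value';
var otherStr = statement.slice(8, 17);
console.log(otherStr);    // "immutable"
console.log(statement);   // "I am an immutable value"

// Arrays: a change made through one reference is visible through the other.
var list = [1, 2, 3];
var alias = list;
alias.push(4);
console.log(list.length); // 4
```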

Immutable means the value cannot be changed. Once created, a string object cannot be modified, as it's immutable. If you request a substring of a string, a new String with the requested part is created.
In Java, using StringBuffer while manipulating strings makes the operation more efficient, as StringBuffer stores the string in a character array, with variables to hold the capacity of the array and the length of the string it currently contains.

From strings to stacks... a simple to understand example taken from Eric Lippert's blog:
Path Finding Using A* in C# 3.0, Part Two...
A mutable stack like System.Collections.Generic.Stack
is clearly not suitable. We want to be
able to take an existing path and
create new paths from it for all of
the neighbours of its last element,
but pushing a new node onto the
standard stack modifies the stack.
We’d have to make copies of the stack
before pushing it, which is silly
because then we’d be duplicating all
of its contents unnecessarily.
Immutable stacks do not have this problem. Pushing onto an immutable
stack merely creates a brand-new stack
which links to the old one as its
tail. Since the stack is immutable,
there is no danger of some other code
coming along and messing with the tail
contents. You can keep on using the
old stack to your heart’s content.
To go deeper into understanding immutability, read Eric's posts starting with this one:
Immutability in C# Part One: Kinds of Immutability
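The idea in the quoted passage can be sketched in JavaScript as well (rather than C#). This is not Eric Lippert's implementation, only a minimal illustration; EMPTY, push, and toArray are hypothetical names invented here:

```javascript
// A tiny persistent stack: every node is frozen and links to the old stack as its tail.
var EMPTY = Object.freeze({ size: 0 });

function push(stack, value) {
  // Nothing is copied: the new node simply points at the existing stack.
  return Object.freeze({ head: value, tail: stack, size: stack.size + 1 });
}

function toArray(stack) {
  var out = [];
  for (var node = stack; node.size > 0; node = node.tail) {
    out.push(node.head);
  }
  return out;
}

var path = push(push(EMPTY, 'A'), 'B');
var viaC = push(path, 'C');   // two new paths built from the same existing path
var viaD = push(path, 'D');

console.log(toArray(path));            // [ 'B', 'A' ]
console.log(toArray(viaC));            // [ 'C', 'B', 'A' ]
console.log(toArray(viaD));            // [ 'D', 'B', 'A' ]
console.log(viaC.tail === viaD.tail);  // true, the tail is shared rather than copied
```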

One way to get a grasp of this concept is to look at how JavaScript treats objects, which is by reference. Ordinary objects are mutable after being instantiated, which means that you can add new methods and properties to an object at any time. This matters because, for an object to be immutable, it must not be able to change after being instantiated.

Try This :
let string = "name";
string[0] = "N";
console.log(string); // name not Name
string = "Name";
console.log(string); // Name
So what that means is that a string is immutable, but the variable holding it is not constant. In simple words, re-assignment can take place, but you cannot mutate part of the string.
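In sloppy mode the failed write above is silently ignored. In strict mode the same assignment throws a TypeError, which the following sketch shows. The function name tryToMutate is made up for the example:

```javascript
function tryToMutate() {
  'use strict';
  var s = 'name';
  try {
    s[0] = 'N';   // index properties of a string are read-only
    return 'no error, ' + s;
  } catch (e) {
    return (e instanceof TypeError ? 'TypeError' : 'other error') + ', still ' + s;
  }
}

console.log(tryToMutate()); // "TypeError, still name"
```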

Closures and functions properties [duplicate]

As per this documentation,
The string representations of each of these objects are appended
together in the order listed and output.
Also as per answer
The + x coerces the object x into a string, which is just [object
Object]:
So, my question is
If I do
str = new String("hello")
console.log(str) //prints the string object but not 'hello'
console.log(""+str) //prints "hello"
So, in the first case, it simply prints the object (it doesn't invoke the toString() method).
But in the second case, it doesn't print the coerced object form but simply the primitive value. Why is that so?
Which method does console.log invoke to print the object?
Please note that - this is not a duplicate of this question.

The Console API was not, when this was written, defined in any formal specification; it was simply something implemented across all browsers, so vendors were at liberty to implement it in their own fashion. (A WHATWG Console Standard now exists, but it still leaves the exact formatting of logged objects up to the implementation.)
Unless you check the actual implementation of the Console API for a particular browser, you can never be sure. There's a tracker on GitHub listing the differences between implementations in the major browsers.
If you look at the implementation in FF (available here - search for log), it has a comment below
A multi line stringification of an object, designed for use by humans
The actual implementation checks the type of the argument that is passed to log() and, based on its type, generates a different representation.
Coming to your case, log() prints two different values for strings created using literal notation and strings created using String constructor because they are two different types. As explained here, Strings created using literal notation are called String Primitives and strings created using String constructor are called String Objects.
var str1 = 'test';
var str2 = new String('hello');
typeof str1 // prints "string"
typeof str2 // prints "object"
As the types differ, their string representation differs in the Console API. If you go through the code for FF's Console implementation, the last statement is
return " " + aThing.toString() + "\n";
So to answer your question, Console API in FF calls toString() on the argument only if the argument type is not one of {undefined,null,object,set,map} types. It doesn't always call toString() or valueOf() methods. I didn't check the implementation of Chrome, so I won't comment on that.
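The language-level part of this can be checked without relying on any console implementation. In the sketch below, how console.log would display str itself is deliberately not shown, since that is up to the browser or runtime:

```javascript
var str = new String('hello');

console.log(typeof str);                 // "object"
console.log(typeof ('' + str));          // "string"

// Each of these converts the String object to the primitive "hello".
console.log('' + str === 'hello');       // true
console.log(String(str) === 'hello');    // true
console.log(`${str}` === 'hello');       // true
console.log(str.valueOf() === 'hello');  // true
console.log(str.toString() === 'hello'); // true
```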

It does not utilize toString, you can do something like this
clog = function(msg){console.log(msg.toString());}
clog(myObj);

This is more typing but will invoke obj.toString() as well:
console.log(`${obj}`);

console.log(str) calls str.valueOf(), I guess.
From JavaScript: The Definitive Guide:
Its job is to convert an object to a primitive value. The valueOf() method is invoked automatically when an object is used in a numeric context, with arithmetic operators (other than +) and with the relational operators, for example. Most objects do not have a reasonable primitive representation and do not define this method.
Edit: Sorry, I copied the wrong line. I meant the ""+str case, since a type conversion takes place there.
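To see which conversion method each form actually reaches for, one can use an object that defines both methods with distinguishable results. The object obj below is purely hypothetical:

```javascript
var obj = {
  toString: function () { return 'from toString'; },
  valueOf: function () { return 'from valueOf'; }
};

console.log(`${obj}`);     // "from toString"
console.log(String(obj));  // "from toString"
console.log('' + obj);     // "from valueOf"
```

For a String object both methods return the same primitive, which is why the difference is invisible in the question's example.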

Calling toUpperCase() on a string variable

I'm a beginner and have just successfully troubleshot my code. I'm glad that I found the problem, but it took me a long time, and I'm hoping to learn why it happened.
Here's the buggy original code. Assume that the variable [nextAlpha] has already been assigned a string value:
nextAlpha.toUpperCase();
Through some creative testing I was able to determine it was the line causing issues. I thought perhaps it's not actually updating the value of variable [nextAlpha]. I tried this instead, and it worked:
nextAlpha = nextAlpha.toUpperCase();
I've left the rest of my code out, but assume that [var = nextAlpha] has already been declared at the top of my script, which I think means "globally." With that information, I thought it was enough to simply call the method on the variable. Why doesn't this "update" the string to upper case like it does when I go the extra step to (re)assign it to the original [nextAlpha] string?

toUpperCase returns the converted string as a new string; it does not perform the conversion on nextAlpha in place.
From the Mozilla reference:
The toUpperCase method returns the value of the string converted to uppercase. toUpperCase does not affect the value of the string itself.

In JavaScript, Strings are immutable:
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Data_structures
Unlike in languages like C, JavaScript strings are immutable. This means that once a string is created, it is not possible to modify it. However, it is still possible to create another string based on an operation on the original string

toUpperCase() is a method that returns a value; it does not modify the variable it is called on.
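A minimal sketch of the behaviour described in these answers, assuming nextAlpha starts out as the arbitrary sample value 'abc':

```javascript
var nextAlpha = 'abc';

nextAlpha.toUpperCase();              // result is computed and then discarded
console.log(nextAlpha);               // "abc"

nextAlpha = nextAlpha.toUpperCase();  // rebind the variable to the new string
console.log(nextAlpha);               // "ABC"
```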

Should I convert primitive to object if I want to access its (object's) properties lots of times?

For example, suppose I have a long string primitive called str. I know I can use str.substr because the browser, behind the scenes,
1. Converts str into a string wrapper object, equivalent to using new String(str)
2. Calls the substr method with the correct parameters on the String object returned by step 1
3. Disposes of the String object
4. Returns the string (primitive) from step 2.
(https://stackoverflow.com/a/9110389/1529630)
But what happens if I want to use str.substr lots of times? For example
var positions = [1, 5, 8, 9, 15, ...], //long array
substrings = new Array(positions.length);
for ( var i = positions.length - 1; i >= 0; --i ) {
substrings[i] = str.substr(positions[i], 2);
}
Should I convert str to an object before that loop in order to avoid creating and destroying an object at every step?

There's a blog entry dedicated to that question here.
In short, working with primitive values is faster most of the time because of optimizations made by most JS engines. However, that might not be the case when accessing non-native members on primitives.
For instance, we can read in the article that:
In the SpiderMonkey JS engine, for example, the pseudo-code that deals
with the “get property” operation looks something like the following:
// direct check for the "length" property
if (typeof(value) == "string" && property == "length") {
return StringLength(value);
}
// generalized code form for properties
object = ToObject(value);
return InternalGetProperty(object, property);
Thus, when you request a property on a string primitive, and the
property name is “length”, the engine immediately just returns its
length, avoiding the full property lookup as well as the temporary
wrapper object creation.
Here's the conclusion:
Based on the benchmarks described above, we have seen a number of ways
about how subtle differences in our string declarations can produce a
series of different performance results. It is recommended that you
continue to declare your string variables as you normally do, unless
there is a very specific reason for you to create instances of the
String Object. Also, note that a browser’s overall performance,
particularly when dealing with the DOM, is not only based on the
page’s JS performance; there is a lot more in a browser than its JS
engine.
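For reference, the two variants from the question can be written side by side as below. This sketch only shows that they produce the same substrings; it makes no claim about timing, which has to be measured on the engine in question. The sample text, the short positions array, and the helper name collect are all assumptions made for the example:

```javascript
var str = 'the quick brown fox jumps';   // sample text, stands in for the long string
var positions = [1, 5, 8, 9, 15];        // short stand-in for the long array

function collect(source) {
  var substrings = new Array(positions.length);
  for (var i = positions.length - 1; i >= 0; --i) {
    substrings[i] = source.substr(positions[i], 2);
  }
  return substrings;
}

var fromPrimitive = collect(str);             // str stays a primitive
var fromObject = collect(new String(str));    // wrapper created once, up front

console.log(fromPrimitive.join('|') === fromObject.join('|')); // true
console.log(typeof fromObject[0]);                             // "string"
```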

How is the variable sized JavaScript string a primitive type?

From what I understand, the (basic) string type in JavaScript is a primitive type, meaning its variables are allocated on the stack.
I would have thought that for a type to be allocatable on the stack, it needed to have a fixed size -- something which presumably holds true for the other primitive types like boolean, number, etc.
Am I somehow wrong to assume that, or is some other internal magic used to make strings in JavaScript primitive types?
EDIT:
This gets more complicated when one considers that JavaScript is loosely typed. Which makes me wonder how any local variable can be allocated on the stack.... given that the size of what might be assigned to it during the course of a function is not fixed.
But I guess (a perhaps simplified) answer to this might be that all local variables could be assigned a fixed maximum size on the stack. Say this is 8 bytes which I think is the size of the number type, and should be large enough to accommodate all the other primitive types (except the string) as well as memory addresses (for when a local variable is assigned a reference type). But, surely strings cannot be limited to 8 bytes (or any size for that matter). Which makes me conclude that strings (even the primitive type ones) are not (cannot be) assigned on the stack. And hence the term "Primitive type" in JavaScript is used to mean a "basic/building block" type, rather than one which is necessarily allocated on the stack (contradicting what I have read in numerous sources including the book "Professional JavaScript..." by Nicholas Zakas).
Anyone have any other take or a pointer to a good source talking about this?

A string is both an object and a primitive.
When doing:
var s = "this is a string";
you get a primitive, but whenever you access a property or method on it, the engine behaves as if it had done:
new String("this is a string")
behind the curtains.
The first is a primitive sequence of characters, which the second one wraps.
Strings are immutable, meaning they can't be changed. If you try to change one (e.g. reverse it), you create a new string primitive, which the variable will then point to.

The storage used to represent variables in the Javascript interpreter need not look anything like a stack - it's implementation dependent.

Strings aren't allocated on the stack
where a variable is allocated doesn't distinguish primitives from objects
Strings aren't primitives, they are of the class "string"
The difference between a primitive and a type is that types have methods and you can assign new properties to them:
var a = 1, b = {}, s = '';
a.foo = 1; // doesn't work, but no error either
b.foo = 1; // works
s.foo = 1; // doesn't work, but no error either
console.log(a.foo);
console.log(b.foo);
console.log(s.foo);

gives
undefined
1
undefined
So all in all, I'm not sure that using "primitive" makes sense in JavaScript since the line is blurred.
A string is a "value object" which means you can't change any of the properties. For example, when you replace characters in a string, you get a new string; the old string doesn't change.
