Cast vs ToXXX for value handles in v8

I'm embedding V8 as an auxiliary language in a C++ program.
I retrieve a Handle<Value> from V8 when I call something like
Handle<Value> value_handle = context->Global()->Get(key_handle);
I can then find out that it is (say) a string with value_handle->IsString(). And if so I can convert it to a Handle<String> to access its string-specific methods.
But there seem to be two ways of doing that, either:
Handle<String> string = value_handle->ToString();
or
Handle<String> string = Handle<String>::Cast(value_handle);
However, for Arrays and Functions, there are no ToArray() or ToFunction() methods, just the casting.
So my question is:
a) are the ToXXX just syntactic sugar for casting?
and, if not b) what is the ToXXX method doing?

The ToXXX functions perform the type coercions described in the subsections of section 9 of ECMA-262, 5th edition. For example, ToString is described in section 9.8: given a non-string value, it returns an appropriate string representation of it; if you pass an object, it calls the object's toString method (or valueOf if toString is not present). Relevant code for ToString: Value::ToString in api.cc, which calls into ToString in runtime.js.
On the other hand Handle<XXX>::Cast(...) does no coercions. It's just a type cast for handles. Essentially it is just a static_cast<XXX*>. In debug mode Handle<T>::Cast(...) is checked and aborts execution when types do not match. It would be a fatal error if you are given a Handle<Value> containing an Object and you are trying to cast it to a Handle<String>. In release mode casting to an incompatible type will just later lead to weird results and possibly crashes, when you try to use the result of the cast. Relevant code in v8.h Handle<T>::Cast which delegates to (for example) String::Cast which checks the cast (if checks are enabled) via String::CheckCast.
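The coercion half of this can also be observed from the JavaScript side. The following is only a sketch, assuming that String(value) called as a function applies the same ToString conversion from section 9.8; the object names are made up for illustration.

```javascript
// String(value), called as a function, performs the ToString conversion.
const withToString = { toString() { return "custom"; } };

// This object has no toString at all (null prototype),
// so the conversion falls back to valueOf.
const onlyValueOf = Object.create(null);
onlyValueOf.valueOf = function () { return 42; };

const results = [
  String(123),          // number  -> "123"
  String(true),         // boolean -> "true"
  String(null),         // null    -> "null"
  String(withToString), // object  -> result of its toString()
  String(onlyValueOf),  // object  -> result of valueOf(), then stringified
];
console.log(results);
```

A Handle<T>::Cast has no script-level counterpart: it runs none of this and only reinterprets the handle.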

The declaration
V8EXPORT Local<String> ToString() const;
is at line 971 of v8.h, where V8EXPORT is a platform-dependent export macro.
The implementation, Value::ToString, is at line 2362 of api.cc:
Local<String> Value::ToString() const {
  i::Handle<i::Object> obj = Utils::OpenHandle(this);
  i::Handle<i::Object> str;
  if (obj->IsString()) {
    str = obj;
  } else {
    i::Isolate* isolate = i::Isolate::Current();
    if (IsDeadCheck(isolate, "v8::Value::ToString()")) {
      return Local<String>();
    }
    LOG_API(isolate, "ToString");
    ENTER_V8(isolate);
    EXCEPTION_PREAMBLE(isolate);
    str = i::Execution::ToString(obj, &has_pending_exception);
    EXCEPTION_BAILOUT_CHECK(isolate, Local<String>());
  }
  return Local<String>(ToApi<String>(str));
}
For consistency, and to benefit from future V8 upgrades, I strongly recommend using ToString() instead of the raw cast.

Related

implementation of JS passing an argument of unknown type [duplicate]

I don't know why I never asked myself this question in all these years, but suddenly I could not find an answer, either on my own or with Google.
Javascript is known to have no types for variables. A really nice thing.
But somehow it must determine the type and work with it.
var a = 1;
var b = 2.0;
var c = 'c';
var d = "Hello World!";
So what we have is an Integer, a Double/Float, a Character, and a String (which may be broken down into a char*).
I know that JS runs in a runtime interpreter, but even so, the logic and the "type" must be implemented somehow.
So how does a JavaScript interpreter recognize and internally handle the variables?
In my imagination, assuming I were writing C++, I would think of some sort of template and container, plus a bit of logic that overloads operators and tries to check what the value really is. But I haven't thought that through.
Share your knowledge with me please :-)

JavaScript sets the variable type based on the value assignment. For example when JavaScript encounters the following code it knows that myVariable should be of type number:
var myVariable = 10;
Similarly, JavaScript will detect in the following example that the variable type is string:
var myVariable = "Hello World!";
JavaScript is also much more flexible than many other programming languages. With languages such as Java a variable must be declared to be a particular type when it is created and once created, the type cannot be changed. This is referred to as strong typing. JavaScript, on the other hand, allows the type of a variable to be changed at any time simply by assigning a value of a different type (better known as loose typing).
The following example is perfectly valid use of a variable in JavaScript. At creation time, the variable is clearly of type number. A later assignment of a string to this variable changes the type from number to string.
var myVariable = 10;
myVariable = "This is now a string type variable";
The variable’s data type is the JavaScript scripting engine’s interpretation of the type of data that variable is currently holding. A string variable holds a string; a number variable holds a number value, and so on. However, unlike many other languages, in JavaScript, the same variable can hold different types of data, all within the same application. This is a concept known by the terms loose typing and dynamic typing, both of which mean that a JavaScript variable can hold different data types at different times depending on context.
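A small sketch of the point above, using the typeof operator to show that the type belongs to the value currently held and not to the variable itself:

```javascript
// typeof reports the type of the value currently held, not of the variable.
let myVariable = 10;
const before = typeof myVariable;

myVariable = "This is now a string type variable";
const after = typeof myVariable;

console.log(before, after);
```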
Complete article here: http://www.techotopia.com/index.php/JavaScript_Variable_Types
Another Article Which may help you: http://oreilly.com/javascript/excerpts/learning-javascript/javascript-datatypes-variables.html
Useful links:
ECMAScript Language Specification
ECMAScript BNF Grammar
JavaScript BNF Grammar

The only useful line I can find in the ES5 spec is this:
Within this specification, the notation “Type(x)” is used as shorthand for “the type of x” where “type” refers to the ECMAScript language and specification types defined in this clause.
I assume that when the runtime needs to perform an operation that needs to know the type of some value, it will check that value against the grammar defined in the spec for each type, until it finds a match.
For example, the grammar for a boolean literal is as follows:
BooleanLiteral ::
true
false
If the value is exactly true or exactly false (e.g. with no quotes) then that value is of type boolean.

JavaScript itself does have types, and internally, each assignment receives an appropriate type. In your example var foo = 2.0; the type will be float. The programmer needn't worry about that too much (at first) because JS is loosely typed (!== type-free). That means that if I were to compare a numeric string to a float, the engine will coerce the string to a number so that the comparison can be performed.
The main difference between loosely typed and strongly typed languages is not type coercion, though. It's quite common in C(++) to cast to the type you need, and in some cases values are automatically converted to the correct type (2/2.0 == 2.0/2.0 == 1.0 ==> int is converted to float, implicitly). The main difference between loosely typed and strongly typed languages is that you declare a variable with a distinct type:
int i = 0;//ok
//Later:
i = "a";//<-- cannot assign a string to int
whereas JS allows you to do:
var i = 1;//int
i = 1.123;//float
i = 'c';//char, or even strings
i = new Date();//objects
But, as functions/keywords like typeof, instanceof, parseFloat, parseInt,toString... suggest: there are types, they're just a tad more flexible. And variables aren't restricted to a single type.
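To make the coercion and the runtime types concrete, here is a short sketch. One thing worth noting is that typeof reports "number" for both the integer-looking and the fractional value.

```javascript
// Loose equality coerces the string to a number before comparing.
const looseEqual = ("2" == 2.0);

// Strict equality compares the types first, so no coercion takes place.
const strictEqual = ("2" === 2.0);

// Every value still carries a type that can be inspected at runtime.
const types = [typeof 1, typeof 1.123, typeof "c", typeof new Date()];

console.log(looseEqual, strictEqual, types);
```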

One simple way to imagine an implementation is that all values are kept in objects and that all variables are pointers... in C++:
#include <map>
#include <string>

// Type tags stored in every value
enum { STRING_TYPE, NUMBER_TYPE, OBJECT_TYPE };

struct Value
{
    int type;
    Value(int type) : type(type) { }
    virtual ~Value() { }
    virtual std::string toString() = 0;
};
struct String : Value
{
    std::string x;
    String(const std::string& x) : Value(STRING_TYPE), x(x) { }
    virtual std::string toString()
    {
        return x;
    }
};
struct Number : Value
{
    double x;
    Number(double x) : Value(NUMBER_TYPE), x(x) { }
    ...
};
struct Object : Value
{
    // NOTE: A javascript object is stored as a map from property
    // names to Value*, not Value. The key is a string but
    // the value can be anything
    std::map<std::string, Value *> x;
    Object() : Value(OBJECT_TYPE) { } // the map starts out empty
    ...
};
For example whenever an operation has to be performed (e.g. a+b) you need to check the types of the objects pointed by the variables to decide what to do.
Note that this is an ultra-simplistic explanation... javascript today is much more sophisticated and optimized than this, but you should be able to get a rough picture.
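The runtime type check for a+b described above can be sketched as follows. This is a toy model written in JavaScript itself, not how any real engine is implemented; the tag constants and the helpers makeNumber, makeString and add are hypothetical names chosen to mirror the C++ structs.

```javascript
// Hypothetical type tags, mirroring NUMBER_TYPE / STRING_TYPE in the C++ sketch.
const NUMBER_TYPE = 0;
const STRING_TYPE = 1;

// Every value carries its tag next to its payload.
function makeNumber(x) { return { type: NUMBER_TYPE, x: x }; }
function makeString(x) { return { type: STRING_TYPE, x: x }; }

// a + b: inspect both tags at runtime, then choose between
// numeric addition and string concatenation.
function add(a, b) {
  if (a.type === NUMBER_TYPE && b.type === NUMBER_TYPE) {
    return makeNumber(a.x + b.x);
  }
  return makeString(String(a.x) + String(b.x));
}

console.log(add(makeNumber(1), makeNumber(2)));   // tagged number 3
console.log(add(makeString("a"), makeNumber(1))); // tagged string "a1"
```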

Why can I define an array literal as a parameter and get a Type Error in Javascript?

The internet, including Stack Overflow, states that JavaScript does not accept type-specific parameters (one such article here). So why does ES6 accept an array literal as the parameter of a function, and why does it throw a TypeError when I pass a primitive?
I am having a hard time wrapping my head around what Javascript is doing in the background. I thought Javascript typically takes a variable name as the parameter in a function declaration and allocates memory for that name and assigns the value of whatever argument I pass to the parameter. I am not sure if this is exclusively in the Arguments Object or elsewhere also. In the example below, however, I do not have a variable name for the array literal. I just don't know how Javascript is interpreting this parameter.
In the code below I define a function using an array literal as the parameter and when I try to pass a primitive as an argument it produces a TypeError.
function box([width,height]) {
    return `I have a box that is ${width} x ${height}`;
}
console.log(box([6,6])); //NO error
console.log(box(6)); //produces error, Webstorm says, "TypeError: undefined is not a function"

This behavior is documented in the ES6 specification Destructuring assignments.
In the Runtime Semantics section, the definition of the array destructuring assignment is
ArrayAssignmentPattern : [ ]
Let iterator be GetIterator(value).
ReturnIfAbrupt(iterator).
Return IteratorClose(iterator, NormalCompletion(empty)).
It's pretty interesting to dig into it. The array destructuring assignment (so it is an assignment) expects an iterable, that is: an object for which obj[Symbol.iterator] is defined as a function returning an iterator. Try testing this in a console of a browser that supports it (tests done on Firefox 57). In fact when you do:
let [a,b] = 5 //smaller reproduction of the error
you will get TypeError: 5 is not iterable. Contrast with below:
let number = new Number(5);
number[Symbol.iterator] = function*() {
yield 5;
}
let [f,g] = number; // works, f === 5, g === undefined
Probably under the hood JS destructuring maps the named array indexes to a slot which represents the next iteration of the iterator. Numbers do not have a built-in [Symbol.iterator] property, hence the error
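If you want to avoid the TypeError, one option is to test for the iterator before destructuring. This is a sketch only; isIterable and safeBox are hypothetical helper names, not built-ins.

```javascript
// An iterable is any non-null value whose Symbol.iterator property is a function.
function isIterable(value) {
  return value != null && typeof value[Symbol.iterator] === "function";
}

function box([width, height]) {
  return `I have a box that is ${width} x ${height}`;
}

// Only hand the argument to the destructuring parameter if it can be iterated.
function safeBox(arg) {
  return isIterable(arg) ? box(arg) : "not iterable: " + String(arg);
}

console.log(safeBox([6, 6])); // "I have a box that is 6 x 6"
console.log(safeBox(6));      // "not iterable: 6"
console.log(safeBox("ab"));   // strings are iterable: "I have a box that is a x b"
```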

What is the difference between String and new String? [duplicate]

Taken from MDN
String literals (denoted by double or single quotes) and strings
returned from String calls in a non-constructor context (i.e., without
using the new keyword) are primitive strings. JavaScript automatically
converts primitives to String objects, so that it's possible to use
String object methods for primitive strings. In contexts where a
method is to be invoked on a primitive string or a property lookup
occurs, JavaScript will automatically wrap the string primitive and
call the method or perform the property lookup.
So, I thought (logically) operations (method calls) on string primitives should be slower than operations on string Objects because any string primitive is converted to string Object (extra work) before the method being applied on the string.
But in this test case, the result is opposite. The code block-1 runs faster than the code block-2, both code blocks are given below:
code block-1 :
var s = '0123456789';
for (var i = 0; i < s.length; i++) {
s.charAt(i);
}
code block-2 :
var s = new String('0123456789');
for (var i = 0; i < s.length; i++) {
s.charAt(i);
}
The results vary between browsers, but code block-1 is always faster. Can anyone please explain why code block-1 is faster than code block-2?

JavaScript has two main type categories, primitives and objects.
var s = 'test';
var ss = new String('test');
The single quote/double quote patterns are identical in terms of functionality. That aside, the behaviour you are trying to name is called auto-boxing. So what actually happens is that a primitive is converted to its wrapper type when a method of the wrapper type is invoked. Put simply:
var s = 'test';
Is a primitive data type. It has no methods, it is nothing more than a pointer to a raw data memory reference, which explains the much faster random access speed.
So what happens when you do s.charAt(i) for instance?
Since s is not an instance of String, JavaScript will auto-box s, which has typeof string, to its wrapper type, String, which has typeof object; more precisely, Object.prototype.toString.call(new String(s)) gives [object String].
The auto-boxing behaviour casts s back and forth to its wrapper type as needed, but the standard operations are incredibly fast since you are dealing with a simpler data type. However auto-boxing and Object.prototype.valueOf have different effects.
If you want to force the auto-boxing or to cast a primitive to its wrapper type, you can use Object.prototype.valueOf, but the behaviour is different. Based on a wide variety of test scenarios auto-boxing only applies the 'required' methods, without altering the primitive nature of the variable. Which is why you get better speed.
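The distinction between the primitive and its wrapper can be checked directly. In this sketch, Object(s) is used as one explicit way to obtain the wrapper object for a primitive:

```javascript
const s = "test";
const boxed = Object(s); // explicit wrapper around the primitive

const facts = {
  primitiveType: typeof s,                  // "string"
  boxedType: typeof boxed,                  // "object"
  primitiveIsInstance: s instanceof String, // false: a primitive is not an instance
  boxedIsInstance: boxed instanceof String, // true
  roundTrip: boxed.valueOf() === s,         // true: valueOf unwraps again
  sameMethod: s.charAt === boxed.charAt,    // true: both resolve to String.prototype.charAt
};
console.log(facts);
```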

This is rather implementation-dependent, but I'll take a shot. I'll exemplify with V8 but I assume other engines use similar approaches.
A string primitive is parsed to a v8::String object. Hence, methods can be invoked directly on it as mentioned by jfriend00.
A String object, on the other hand, is parsed to a v8::StringObject which extends Object and, apart from being a full-fledged object, serves as a wrapper for v8::String.
Now it is only logical, a call to new String('').method() has to unbox this v8::StringObject's v8::String before executing the method, hence it is slower.
In many other languages, primitive values do not have methods.
The way MDN puts it seems to be the simplest way to explain how primitives' auto-boxing works (as also mentioned in flav's answer), that is, how JavaScript's primitive-y values can invoke methods.
However, a smart engine will not convert a string primitive-y to String object every time you need to call a method. This is also informatively mentioned in the Annotated ES5 spec. with regard to resolving properties (and "methods"¹) of primitive values:
NOTE The object that may be created in step 1 is not accessible outside of the above method. An implementation might choose to avoid the actual creation of the object. [...]
At very low level, Strings are most often implemented as immutable scalar values. Example wrapper structure:
StringObject > String (> ...) > char[]
The farther you are from the primitive, the longer it takes to get to it. In practice, String primitives are much more frequent than StringObjects, so it is no surprise that engines add methods to the String primitives' corresponding (interpreted) objects' Class instead of converting back and forth between String and StringObject as MDN's explanation suggests.
¹ In JavaScript, "method" is just a naming convention for a property which resolves to a value of type function.

In case of string literal we cannot assign properties
var x = "hello" ;
x.y = "world";
console.log(x.y); // this will print undefined
Whereas in case of String Object we can assign properties
var x = new String("hello");
x.y = "world";
console.log(x.y); // this will print world

String Literal:
String literals are immutable, which means, once they are created, their state can't be changed, which also makes them thread safe.
var a = 's';
var b = 's';
a == b will be true: primitive strings with the same content compare equal.
String Object:
Here, two different objects are created, and they have different references:
var a = new String("s");
var b = new String("s");
a==b result will be false, because they have different references.

If you use new, you're explicitly stating that you want to create an instance of an Object. Therefore, new String is producing an Object wrapping the String primitive, which means any action on it involves an extra layer of work.
typeof new String(); // "object"
typeof ''; // "string"
As they are of different types, your JavaScript interpreter may also optimise them differently, as mentioned in comments.

When you declare:
var s = '0123456789';
you create a string primitive. That string primitive has methods that let you call methods on it without converting the primitive to a first class object. So your supposition that this would be slower because the string has to be converted to an object is not correct. It does not have to be converted to an object. The primitive itself can invoke the methods.
Converting it to a full-blown object (which allows you to add new properties to it) is an extra step and does not make the string operations faster (in fact your test shows that it makes them slower).

I can see that this question was resolved long ago, but there is another subtle distinction between string literals and string objects. As nobody seems to have touched on it, I thought I'd write it up for completeness.
Basically another distinction between the two is when using eval. eval('1 + 1') gives 2, whereas eval(new String('1 + 1')) gives '1 + 1', so if certain block of code can be executed both 'normally' or with eval, it could lead to weird results
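A short sketch of the eval difference. When eval is given something that is not a primitive string, it returns the argument unchanged, so the String object comes back as is; unwrapping it with valueOf() first restores the usual behaviour.

```javascript
const fromPrimitive = eval("1 + 1");       // the source text is evaluated

const wrapped = new String("1 + 1");
const fromObject = eval(wrapped);          // not a primitive string: returned unchanged

const unwrapped = eval(wrapped.valueOf()); // primitive again, so it is evaluated

console.log(fromPrimitive, fromObject === wrapped, unwrapped);
```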

The existence of an object has little to do with the actual behaviour of a String in ECMAScript/JavaScript engines as the root scope will simply contain function objects for this. So the charAt(int) function in case of a string literal will be searched and executed.
With a real object you add one more layer, where the charAt(int) method is also searched for on the object itself before the standard behaviour kicks in (same as above). Apparently there is a surprisingly large amount of work done in this case.
BTW I don't think that primitives are actually converted into Objects but the script engine will simply mark this variable as string type and therefore it can find all provided functions for it so it looks like you invoke an object. Don't forget this is a script runtime which works on different principles than an OO runtime.

The biggest difference between a string primitive and a string object is that objects must follow this rule for the == operator:
An expression comparing Objects is only true if the operands reference
the same Object.
So, whereas string primitives have a convenient == that compares the value, you're out of luck when it comes to making any other immutable object type (including a string object) behave like a value type.
"hello" == "hello"
-> true
new String("hello") == new String("hello") // beware!
-> false
(Others have noted that a string object is technically mutable because you can add properties to it. But it's not clear what that's useful for; the string value itself is not mutable.)
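If you do end up with string objects and need value comparison, unwrapping them first is one way to get it. A minimal sketch:

```javascript
const a = new String("hello");
const b = new String("hello");

const comparisons = {
  objects: a == b,                        // false: two distinct objects
  unwrapped: a.valueOf() === b.valueOf(), // true: the primitives are compared by value
  mixedLoose: a == "hello",               // true: the object is converted to a primitive
  mixedStrict: a === "hello",             // false: different types
};
console.log(comparisons);
```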

The code is optimized before running by the javascript engine.
In general, micro benchmarks can be misleading because compilers and interpreters rearrange, modify, remove and perform other tricks on parts of your code to make it run faster.
In other words, the written code tells what is the goal but the compiler and/or runtime will decide how to achieve that goal.
Block 1 is faster mainly because of:
var s = '0123456789'; is always faster than
var s = new String('0123456789');
because of the overhead of object creation.
The loop portion is not the one causing the slowdown because the charAt() can be inlined by the interpreter.
Try removing the loop and rerun the test, you will see the speed ratio will be the same as if the loop were not removed. In other words, for these tests, the loop blocks at execution time have exactly the same bytecode/machine code.
For these types of micro benchmarks, looking at the bytecode or machine code will provide a clearer picture.
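One way to check this claim yourself is to time construction and the loop separately. The following is only a rough sketch: timeIt and loopOver are hypothetical helpers, the iteration count is arbitrary, and the numbers will vary by engine and run, so no particular result is implied.

```javascript
// Use a high-resolution clock when one is available.
const now = (typeof performance !== "undefined")
  ? () => performance.now()
  : () => Date.now();

// Run fn repeatedly and return the elapsed time in milliseconds.
function timeIt(fn, iterations) {
  const start = now();
  for (let i = 0; i < iterations; i++) fn();
  return now() - start;
}

// The loop from the question, returning a value so it is not trivially dead code.
function loopOver(s) {
  let n = 0;
  for (let i = 0; i < s.length; i++) n += s.charAt(i).length;
  return n;
}

const primitive = "0123456789";
const object = new String("0123456789");

const timings = {
  createPrimitive: timeIt(() => "0123456789", 100000),
  createObject: timeIt(() => new String("0123456789"), 100000),
  loopPrimitive: timeIt(() => loopOver(primitive), 100000),
  loopObject: timeIt(() => loopOver(object), 100000),
};
console.log(timings);
```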

We can define a string in three ways:
var a = "first way";
var b = String("second way");
var c = new String("third way");
// a fourth option is concatenation with an empty string
var d = a + '';
Check the type of the strings created using typeof operator
typeof a // "string"
typeof b // "string"
typeof c // "object"
When you compare two primitives with the same content, for example a and d:
a == d // true
When you compare String objects:
var StringObj = new String("third way")
var StringObj2 = new String("third way")
StringObj == StringObj2 // false, because they have different references

In JavaScript, a primitive data type such as string is a non-composite building block. This means that primitives are just values, nothing more:
let a = "string value";
By default, such a value has no built-in methods like toUpperCase, toLowerCase, etc.
But, if you try to write:
console.log( a.toUpperCase() ); or console.log( a.toLowerCase() );
This will not throw any error, instead they will work as they should.
What happened ?
Well, when you try to access a property of a string, JavaScript coerces the string to an object via new String(a), known as a wrapper object.
This process is linked to a concept called function constructors in JavaScript, where functions are used to create new objects.
When you type new String('String value'), String is a function constructor: it takes an argument and creates an empty object inside the function scope, and this empty object is assigned to this. In this case, String supplies all the well-known built-in functions mentioned before. As soon as the operation is completed, for example the uppercase operation, the wrapper object is discarded.
To prove that, let's do this:
let justString = 'Hello From String Value';
justString.addNewProperty = 'Added New Property';
console.log( justString.addNewProperty );
Here the output will be undefined. Why?
In this case JavaScript creates a wrapper String object, sets the new property addNewProperty on it, and discards the wrapper object immediately. This is why you get undefined. Pseudocode would look like this:
let justString = 'Hello From String Value';
let wrapperObject = new String( justString );
wrapperObject.addNewProperty = 'Added New Property'; //Do operation and discard
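As a related sketch: in strict mode the same assignment on a primitive is not silently dropped but throws a TypeError, while an explicitly created wrapper object keeps the property. The function names here are made up for illustration.

```javascript
// In strict mode, setting a property on a primitive string throws.
function assignInStrictMode() {
  "use strict";
  const justString = "Hello From String Value";
  try {
    justString.addNewProperty = "Added New Property";
    return "no error";
  } catch (e) {
    return e instanceof TypeError ? "TypeError" : "other error";
  }
}

// An explicit wrapper object is a real object, so the property survives.
function assignOnWrapper() {
  const wrapperObject = new String("Hello From String Value");
  wrapperObject.addNewProperty = "Added New Property";
  return wrapperObject.addNewProperty;
}

console.log(assignInStrictMode()); // "TypeError"
console.log(assignOnWrapper());    // "Added New Property"
```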

JavaScript Make a non-object variable with properties [duplicate]

I've been messing around with the ECMA-262 standard (ECMAScript Language Specification, 3rd edition, if it matters for this - I have not found any difference between the 3rd and 5th edition on String Type / String Object).
There's one thing that baffles me: the difference between the String Type and the String Object. Yes I know the difference in the sense that the String Type is a sequence of 16-bit UTF-16 units and the String Object is a built-in object with its internal Class property set to "String" and its internal Value property set to a value of the String Type.
But reading the specification, the string type does not seem to expose any methods; that is, it's just a value without any additional properties. Take this code, everything is exactly as expected:
document.writeln(typeof "foo"); // 'string'
document.writeln(typeof new String("foo")); // 'object'
The first type is the actual String Type and the second is the Object Type (it's an object of class String, but its data type is object). However, looking at this:
"foo".charAt(0);
fooStrObj = new String("Foo");
fooStrObj.charAt(0);
They both seem to expose the same functions, but there are no functions on the String Type defined in the ECMA-262 standard; all the functions it exposes are from the String.prototype object (and I can see no reference to the fact that the String Type magically exposes all the properties and functions of the String.prototype object in the ECMA-262 standard). So are the values of type String Type automatically promoted to a String Object with the original String Type value as its internal Value property?
And if they are treated exactly the same (which for all intents and purposes they seem to be), why have two different ways to represent a String?

Strings are a value type in JS, so they can't have any properties attached to them, no prototype, etc. Any attempt to access a property on them is technically performing the JS [[ToObject]] conversion (in essence new String).
Easy way of distinguishing the difference is (in a browser)
a = "foo"
a.b = "bar"
alert("a.b = " + a.b); //Undefined
A = new String("foo");
A.b = "bar";
alert("A.b = " + A.b); // bar
Additionally while
"foo" == new String("foo")
is true, it is only true due to the implicit type conversions of the == operator
"foo" === new String("foo")
is false.

It's analogous to the difference between int and Integer in Java.
According to the standard, strings are automatically converted to String objects when you try to call a method. See ECMA 262-3 section 11.2.1; step 5 calls ToObject (which is defined in section 9.9).
11.2.1 Property Accessors
[...]
The production MemberExpression : MemberExpression [ Expression ] is evaluated as follows:
Evaluate MemberExpression.
Call GetValue(Result(1)).
Evaluate Expression.
Call GetValue(Result(3)).
Call ToObject(Result(2)).
Call ToString(Result(4)).
Return a value of type Reference whose base object is Result(5) and whose property name is Result(6).
9.9 ToObject
The operator ToObject converts its argument to a value of type Object according to the following table:
[...]
Create a new String object whose [[value]] property is set to the value of the
string. See 15.5 for a description of String objects.
As a specification technique, this is a hack to explain how strings can appear to have methods even though they're not really objects.
Apart from that, the wrapper objects are not very useful. I don't know why they're in the language. I rather wish they weren't. :)

Converting an int64 value to a Number object in JavaScript

I have a COM object which has a method that returns an unsigned int64 (VT_UI8) value. We have an HTML page which contains some JavaScript which can load the COM object and make the call to that method, to retrieve the value as such:
var foo = MyCOMObject.GetInt64Value();
This value can easily be displayed to the user in a message dialog using:
alert(foo);
or displayed on the page by:
document.getElementById('displayToUser').innerHTML = foo;
However, we cannot use this value as a Number (e.g. if we try to multiply it by 2) without the page throwing "Number expected" errors. If we check "typeof(foo)" it returns "unknown".
I've found a workaround for this by doing the following:
document.getElementById('displayToUser').innerHTML = foo;
var bar = parseInt(document.getElementById('displayToUser').innerHTML);
alert(bar*2);
What I need to know is how to make that process more efficient. Specifically, is there a way to cast foo to a String explicitly, rather than having to set some document element's innerHTML to foo and then retrieve it from that. I wouldn't mind calling something like:
alert(parseInt((string)foo) * 2);
Even better would be if there is a way to directly convert the int64 to a Number, without going through the String conversion, but I hold out less hope for that.

This:
alert(Number(String(foo)) * 2);
should do it (but see below), if your COM object implements toString (or valueOf with the "string" hint) correctly (and apparently it does, if your innerHTML trick works -- because when you assign foo to innerHTML, the same process of converting the COM object to a string occurs as with String(foo)).
From Section 15.5.1 of the 5th Edition ECMAScript spec:
When String is called as a function rather than as a constructor, it performs a type conversion.
And Section 15.7.1
When Number is called as a function rather than as a constructor, it performs a type conversion
It may be worth trying just Number(foo) * 2 to make sure, but I don't think it'll work (it seems like your COM object only handles conversion to String, not Number, which isn't surprising or unreasonable).
Edit If String(foo) is failing, try:
alert(Number("" + foo) * 2);
I'm very surprised that your innerHTML trick is working but String(foo) is throwing an error. Hopefully "" + foo will trigger the same implicit conversion as your innerHTML trick.
Edit Okay, this COM object is being very strange indeed. My next two salvos:
alert(("" + foo) * 2);
That uses all implicit conversions (adding an object to a string converts the object to a string; applying the * operator to a string converts it to a number).
Alternately, we can make the string->number conversion explicit but indirect:
alert(parseInt("" + foo) * 2);

Eek. Well if none of the explicit conversions are working because of the strange behaviour of the host object, let's try the implicit ones:
var n= +(''+foo);
I'm assuming you don't mind that the target type Number doesn't cover the full range of values of an int64 (it's a double, so you only get 52 bits of mantissa).
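To illustrate the range caveat, here is a sketch that converts the string form of a 64-bit value only when it fits exactly into a Number. It assumes an engine with BigInt support, which the host environment in the question may well not have; toNumberIfSafe is a hypothetical helper name.

```javascript
// Return the exact Number for a decimal string, or null if it would lose precision.
function toNumberIfSafe(text) {
  const big = BigInt(text);
  const max = BigInt(Number.MAX_SAFE_INTEGER); // 2^53 - 1
  if (big > max || big < -max) return null;
  return Number(big);
}

// 2^53 + 1 cannot be represented as a double, so it is rounded.
const lossy = Number("9007199254740993");

console.log(lossy);                                  // 9007199254740992
console.log(toNumberIfSafe("42"));                   // 42
console.log(toNumberIfSafe("18446744073709551615")); // null (largest unsigned 64-bit value)
```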

Matt, from the comments to other answers, I suspect you're running this code in some sort of loop. If so, make sure that you check the returned value for null before trying your conversions.
var foo = MyCOMObject.GetInt64Value();
if (foo == null) {
foo = 0; // Or something else
}
