JavaScript numbers, all the same size in memory?

I'm reading the Number Type section of the book Professional JavaScript for Web Developers. It seems to say that all ECMAScript numbers are binary64 floating point, which is corroborated by this MDN article. But the book author also says:
Because storing floating-point values uses twice as much memory as storing integer values, ECMAScript always looks for ways to convert values into integers.
I expected numbers to each occupy the same amount of memory: 64 bits. And the MDN article says, "There is no specific type for integers". Anyone know what the book author meant? How can integers take up less memory when they're stored as 64-bit floats (if I have that right)? You'll find the whole section at the link above (free sample of the book).

JavaScript has no number type other than double-precision floating point (except in ECMAScript 6 typed arrays), but the underlying implementation may choose to store numbers in any way it likes, as long as the JavaScript code behaves the same.
JavaScript is compiled nowadays, which means that it can be optimised in many ways that are not obvious in the language.
If a local variable in a function only ever takes on an integer value and isn't exposed outside the function in any way, then it could actually be implemented using an integer type when the code is compiled.
The implementation varies in different browsers. Currently it seems to make a huge difference in MS Edge, a big difference in Firefox, and no difference at all in Chrome: http://jsperf.com/int-vs-double-implementation (Note: jsperf thinks that MS Edge is Chrome 42.)
Further research:
The JS engines Spidermonkey (Firefox), V8 (Chrome, Opera), JavaScriptCore (Safari), Chakra (IE) and Rhino (and possibly others, but those are harder to find implementation details about) use different ways of using integer types or storing numbers as integers when possible. Some quotes:
"To have an efficient representation of numbers and JavaScript
objects, V8 represents both of us with a 32 bits value. It uses a bit
to know if it is an object (flag = 1) or an integer (flag = 0) called
here SMall Integer or SMI because of its 31 bits."
http://thibaultlaurens.github.io/javascript/2013/04/29/how-the-v8-engine-works/
"JavaScript does not have a built-in notion of an integer value, but
for efficiency JavaScriptCore will represent most integers as int32
rather than as double."
http://trac.webkit.org/wiki/JavaScriptCore
"[...] non-double values are a 32-bit type tag and a 32-bit payload,
which is normally either a pointer or a signed 32-bit integer."
https://developer.mozilla.org/en-US/docs/Mozilla/Projects/SpiderMonkey/Internals
"In Windows 10 and Microsoft Edge, we’ve started optimizing Chakra’s
parser and the JIT compiler to identify non const variable
declarations of integers that are defined globally and are never
changed during the course of the execution time of the program."
https://blogs.windows.com/msedgedev/2015/05/20/delivering-fast-javascript-performance-in-microsoft-edge/

Because storing floating-point values uses twice as much memory as storing integer values, ECMAScript always looks for ways to convert values into integers.
This paragraph is complete nonsense. Ignore it!
Numbers are numbers. ECMAScript makes no distinction whatsoever between floating-point and integer numeric values.
Even within most JS runtimes, all numeric values are stored as double-precision floating point.

Not sure I fully understood your question, but "There is no specific type for integers" means that JavaScript doesn't recognize separate types for integers and floats, but they are both typed as Numbers. The int/float separation happens "behind the curtains", and that's what they meant by "ECMAScript always looks for ways to convert values into integers".
The bottom line is that you don't have to worry about it, unless you specifically need your variables to mimic integers or floats for use in other languages. In that case it's probably (did I say probably?) best to pass them as strings, because you'd have trouble passing, say, 5.0 as a float: JS would immediately convert it to 5, exactly because of the "ECMAScript always looks for ways to convert values into integers" part.
alert(5.0); // don't expect a float from this
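If the other side really needs to see "5.0", note that you can also format the number explicitly instead of relying on how it's passed; a small illustration:
console.log(String(5.0));       // "5" - the literals 5.0 and 5 are the same Number value
console.log((5.0).toFixed(1));  // "5.0" - format explicitly when a decimal point is required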

Related

Is an array of ints actually implemented as an array of ints in JavaScript / V8?

There is a claim in this article that an array of ints in JavaScript is implemented by a C++ array of ints.
However, according to MDN, unless you specifically use BigInts, all numbers in JavaScript are represented as doubles.
If I do:
const arr = [0, 1, 2, 3];
What is the actual representation in the V8 engine?
The code for V8 is here on GitHub, but I don't know where to look.
(V8 developer here.)
"C++ array of ints" is a bit of a simplification, but the key idea described in that article is correct, and an array [0, 1, 2, 3] will be stored as an array of "Smis".
What's a "Smi"? While every Number in JavaScript must behave like an IEEE754 double, V8 internally represents numbers as "small integer" (31 bits signed integer value + 1 bit tag) when it can, i.e. when the number has an integral value in the range -2**30 to 2**30-1, to improve efficiency. Engines can generally do whatever they want under the hood, as long as things behave as if the implementation followed the spec to the letter. So when the spec (or MDN documentation) says "all Numbers are doubles", what it really means from the engine's (or an engine developer's) point of view is "all Numbers must behave as if they were doubles".
When an array contains only Smis, then the array itself keeps track of that fact, so that values loaded from such arrays know their type without having to check. This matters e.g. for a[i] + 1, where the implementation of + doesn't have to check whether a[i] is a Smi when it's already known that a is a Smi array.
When the first number that doesn't fit the Smi range is stored in the array, it'll be transitioned to an array of doubles (strictly speaking still not a "C++ array", rather a custom array on the garbage-collected heap, but it's similar to a C++ array, so that's a good way to explain it).
When the first non-Number is stored in an array, what happens depends on what state the array was in before: if it was a "Smi array", then it only needs to forget the fact that it contains only Smis. No rewriting is needed, as Smis are valid object pointers thanks to their tag bit. If the array was a "double array" before, then it does have to be rewritten, so that each element is a valid object pointer. All the doubles will be "boxed" as so-called "heap numbers" (objects on the managed heap that only wrap a double value) at this point.
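You can watch these transitions yourself using V8-internal test intrinsics such as %HasSmiElements and %HasDoubleElements, available behind the --allow-natives-syntax flag in d8 or Node. These are unstable engine internals, not part of JavaScript, and their names and availability can vary by V8 version, but currently a sketch like this works:
// Run with: node --allow-natives-syntax elements.js
const arr = [0, 1, 2, 3];
console.log(%HasSmiElements(arr));     // true: stored as an array of Smis
arr.push(0.5);                         // fractional value forces a transition
console.log(%HasDoubleElements(arr));  // true: rewritten as unboxed doubles
arr.push('x');                         // non-Number forces another transition
console.log(%HasObjectElements(arr));  // true: the doubles are now boxed heap numbers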
In summary, I'd like to point out that in the vast majority of cases, there's no need to worry about any of these internal implementation tricks, or even be aware of them. I certainly understand your curiosity though! Also, array representations are one of the more common reasons why microbenchmarks that don't account for implementation details can easily be misleading by suggesting results that won't carry over to a larger app.
Addressing comments:
V8 does sometimes even use int16 or lower.
Nope, it does not. It may or may not start doing so in the future; though if anything does change, I'd guess that untagged int32 is more likely to be introduced than int16; also if anything does change about the implementation then of course the observable behavior would not change.
If you believe that your application would benefit from int16 storage, you can use an Int16Array to enforce that, but be sure to measure whether that actually benefits you, because quite likely it won't, and may even decrease performance depending on what your app does with its arrays.
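For illustration, Int16Array really does truncate storage to 16 bits, which is also how it can bite you:
const samples = new Int16Array(4);
samples[0] = 40000;        // doesn't fit in a signed 16-bit slot...
console.log(samples[0]);   // -25536: the value was truncated mod 2**16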
It may start to be a double when you make it a decimal
Slightly more accurately: there are several reasons why an array of Smis needs to be converted to an array of doubles, such as:
storing a fractional value in it, e.g. 0.5
storing a large value in it, e.g. 2**34
storing NaN or Infinity or -0 in it

How does V8 store integers like 5?

How does V8 store integers in memory?
For example the integer 5?
I know it stores it on the heap, but how exactly does it store it?
Things like metadata and the actual value itself.
Is there a constant added to the int before storing it?
V8 uses a pointer tagging scheme to distinguish small integers and heap object pointers. 5 would be stored as a Smi type, which is not heap allocated in V8.
You can check out the source code for the Smi class to learn more.
On 32-bit platforms, Smis are a 31 bit signed int with a 0 set for the bottom bit.
On 64-bit platforms, Smis are a 32 bit signed int, 31 bits of 0 padding and a 0 for the bottom bit.
Pointers to heap objects have a 1 set for the bottom bit so that V8 can tell the difference between pointers and Smis without extra metadata.
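The arithmetic of the 32-bit scheme can be sketched in plain JavaScript (this only illustrates the tagging idea; it is not V8's actual code):
// Tag: shift the value left one bit, leaving the bottom bit 0.
const tagSmi = (n) => (n << 1) | 0;
// Untag: arithmetic shift right restores the signed value.
const untagSmi = (t) => t >> 1;
// A bottom bit of 0 means Smi; 1 would mean heap-object pointer.
const isSmi = (t) => (t & 1) === 0;

console.log(untagSmi(tagSmi(5)));  // 5
console.log(isSmi(tagSmi(5)));     // true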
In Javascript, all numbers are stored as 64-bit floating point values; C and C++ call this type double. There is no distinct "integer" type.
To some degree, you can use integer values naively and get the result you expect, without having to fear rounding errors. These integers are so-called "safe" integers.
All integers in the range [-(2^53 - 1), +(2^53 - 1)] are "safe" integers, as described here. This means that if you add, subtract or multiply integers in that range, and the result is within that range too, then the calculation is without rounding errors.
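You can check this range directly; Number.MAX_SAFE_INTEGER and Number.isSafeInteger exist for exactly this purpose:
console.log(Number.MAX_SAFE_INTEGER);        // 9007199254740991, i.e. 2**53 - 1
console.log(2 ** 53 === 2 ** 53 + 1);        // true: the +1 is lost above the safe range
console.log(Number.isSafeInteger(2 ** 53));  // false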
Of course, all values in Javascript/V8 are somehow "boxed", because a variable doesn't have a type (except small integers which use tagged pointers). If you have a variable x that is 5.25, it has to know that it is a "number" and that that number is 5.25. So it will take more than 8 bytes of space. You will have to look up the source code of v8 to find out more.

Can precision of floating point numbers in Javascript be a source of non determinism?

Can the same mathematical operation return different results in different architectures or browsers ?
The other answers are incorrect. According to the ECMAScript 5.1 specs (section 15.8.2):
NOTE The behaviour of the functions acos, asin, atan, atan2, cos, exp, log, pow, sin, sqrt, and tan is not precisely specified here except to require specific results for certain argument values that represent boundary cases of interest.
...
Although the choice of algorithms is left to the implementation, it is recommended (but not specified by this standard) that implementations use the approximation algorithms for IEEE 754 arithmetic contained in fdlibm, the freely distributable mathematical library from Sun Microsystems.
However, even if the implementations were specified, the exact results of all floating-point operations would still be dependent on browser/architecture. That includes simple operations like multiplication and division!!
The reason is that IEEE-754 allows systems to do 64-bit floating-point calculations at a higher-precision than the result, leading to different rounding results than systems which use the same precision as the result. This is exactly what the x86 (Intel) architecture does, which is why in C (and javascript) we can sometimes have cos(x) != cos(y) even though x == y, even on the same machine!
This is a big issue for networked peer-to-peer games, since this means, if the higher-precision calculations can't be disabled (as is the case for C#), those games pretty much can't use floating-point calculations at all. However, this is typically not an issue for Javascript games, since they are usually client-server.
If we assume that every browser vendor follows the IEEE standards + ECMA specs and there is no human error in the implementation, then no, there can't be any difference.
The ECMAScript language specification 5.1 edition states that numbers are primitive values corresponding to IEEE 754 floats, which implies calculations should be consistent:
http://www.ecma-international.org/publications/files/ecma-st/ECMA-262.pdf
4.3.19 Number value: primitive value corresponding to a double-precision 64-bit binary format IEEE 754 value.
NOTE A Number value is a member of the Number type and is a direct representation of a number.
As BlueRaja points out, there is a sort of caveat in section 15.8.2:
The behaviour of the functions acos, asin, atan, atan2, cos, exp, log, pow, sin, sqrt, and tan is not precisely specified here...
Meaning, these are at least some cases where the outcome of operations on numbers is implementation dependent and may therefore be inconsistent.
My two cents - @goldilocks notes, and others allude to, that you shouldn't use == or != on floating point numbers. So what do you mean by "deterministic"? That the behavior is always the same on different machines? Obviously this depends on what you mean by "the same behavior."
Well, at one silly literal level of "the same," of course not, physical bits will be different on e.g. 32 bit versus 64 bit machines. So that interpretation is out.
Ok, so will any program run with the same output on two different machines? In languages in general, no, because a C program can do something with undefined behavior, like read from uninitialized memory.
Ok, so will any valid program do the same thing on different machines? Well, I would say a program that uses == and != on floating point numbers is as invalid as a program that reads uninitialized memory. I personally don't know if the Javascript standard hammers out the behavior of == and != on floats to the point that it's well-defined if kooky, so if that is your precise question you'll have to see the other answers. Can you write Javascript code that has undefined output with respect to the standard? I've never read the standard (other answers cover this somewhat), but for my purposes this is moot, because the programs that would produce what you call nondeterministic behavior are invalid to begin with.
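For completeness, the classic one-liner that shows why == on floats is fragile, along with the usual tolerance-based alternative:
console.log(0.1 + 0.2 === 0.3);                 // false: neither side is exactly 0.3
console.log(Math.abs(0.1 + 0.2 - 0.3) < 1e-9);  // true: compare within a tolerance instead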

JavaScript 64 bit numeric precision

Is there a way to represent a number with higher than 53-bit precision in JavaScript? In other words, is there a way to represent 64-bit precision number?
I am trying to implement some logic in which each bit of a 64-bit number represents something. I lose the lower significant bits when I try to set bits higher than 2^53.
Math.pow(2,53) + Math.pow(2,0) == Math.pow(2,53)
Is there a way to implement a custom library or something to achieve this?
Google's Closure library has goog.math.Long for this purpose.
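As mentioned in an earlier question on this page, BigInt also provides arbitrary-precision integers natively in modern engines, which covers the 64-bit bit-flag use case directly:
const x = (1n << 53n) + 1n;  // BigInt: no 53-bit limit
console.log(x);              // 9007199254740993n: the low bit survives
console.log(2 ** 53 + 1);    // 9007199254740992: Number drops the low bit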
The GWT team has added long emulation support, so Java longs really hold 64 bits. Do you want 64-bit floats or whole numbers?
I'd just use either an array of integers or a string.
The numbers in JavaScript are doubles; I think there is a rounding error involved in your equation.
Perhaps I should have added some technical detail. Basically the GWT long emulation uses a tuple of two numbers, the first holding the high 32 bits and the second the low 32 bits of the 64 bit long.
The library of course contains methods to add stuff like adding two "longs" and getting a "long" result. Within your GWT Java code it just looks like two regular longs - one doesn't need to fiddle or be aware of the tuple. By using this approach GWT avoids the problem you're probably alluding to, namely "longs" dropping the lower bits of precision which isn't acceptable in many cases.
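A minimal sketch of that tuple idea in plain JavaScript (illustrative names, not GWT's actual API):
// Hold a 64-bit integer as two unsigned 32-bit halves.
function makeLong(hi, lo) { return { hi: hi >>> 0, lo: lo >>> 0 }; }

function addLong(a, b) {
  const lo = (a.lo + b.lo) >>> 0;          // low 32 bits, wrapped
  const carry = lo < a.lo ? 1 : 0;         // unsigned overflow in the low half?
  const hi = (a.hi + b.hi + carry) >>> 0;  // high 32 bits, wrapped (overflow discarded)
  return { hi, lo };
}

console.log(addLong(makeLong(0, 0xFFFFFFFF), makeLong(0, 1)));  // { hi: 1, lo: 0 }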
Whilst floats are by definition imprecise approximations of a value, a whole number like a long isn't. GWT always holds a 64-bit long; maths using such longs never loses precision. The exception to this is overflow, but that accurately matches what occurs in Java etc. when you add two very large long values which require more than 64 bits, e.g. 2^63-1 + 2^63-1.
To do the same for floating point numbers will require a similar approach. You will need to have a library that uses a tuple.
The following code might work for you; I haven't tested it yet, however:
BigDecimal for JavaScript
Yes, 11 bits are reserved for the exponent; only 52 bits contain the value, also called the fraction.
Javascript allows bitwise operations on numbers but only first 32 bits are used in those operations according to Javascript standard specification.
I do not understand the misleading GWT/Java/long answers to a Javascript/double question, though. Javascript is not Java.
Why would anyone need 64-bit precision in javascript?
Longs sometimes hold the ID of something in a DB, so it's important not to lose some of the lower bits... but floating point numbers are most of the time used for calculations. Using floats to hold monetary or similar exacting values is plain wrong. If you truly need 64-bit precision, do the maths on the server, where it's faster, and so on.

JavaScript Endian Encoding?

A response on SO got me thinking, does JavaScript guarantee a certain endian encoding across OSs and browsers?
Or put another way are bitwise shifts on integers "safe" in JavaScript?
Shifting is safe, but your question is flawed because endianness doesn't affect bit-shift operations anyway. Shifting left is the same on big-endian and little-endian systems in all languages. (Shifting right can differ, but only due to interpretation of the sign bit, not the relative positions of any bits.)
Endianness only comes into play when you have the option of interpreting some block of memory as bytes or as larger integer values. In general, Javascript doesn't give you that option since you don't get access to arbitrary blocks of memory, especially not the blocks of memory occupied by variables. Typed arrays offer views of data in an endian-sensitive way, but the ordering depends on the host system; it's not necessarily the same for all possible Javascript host environments.
Endianness describes physical storage order, not logical storage order. Logically, the rightmost bit is always the least significant bit. Whether that bit's byte is the one that resides at the lowest memory address is a completely separate issue, and it only matters when your language exposes such a concept as "lowest memory address," which Javascript does not. Typed arrays do, but then only within the context of typed arrays; they still don't offer access to the storage of arbitrary data.
Some of these answers are dated, because endianness can be relevant when using typed arrays! Consider:
var arr32 = new Uint32Array(1);
var arr8 = new Uint8Array(arr32.buffer);
arr32[0] = 255;
console.log(arr8[0], arr8[1], arr8[2], arr8[3]);
When I run this in Chrome's console, it yields 255 0 0 0, indicating that my machine is little-endian. However, typed arrays use the system endianness by default, so you might see 0 0 0 255 instead if your machine is big-endian.
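If you need a fixed byte order regardless of the host, DataView takes an explicit endianness flag, unlike the plain typed-array views:
const buf = new ArrayBuffer(4);
new DataView(buf).setUint32(0, 255, true);  // true = little-endian on every host
console.log(new Uint8Array(buf));           // Uint8Array(4) [ 255, 0, 0, 0 ]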
Yes, they are safe. Although you're not getting the speed benefits you might hope for since JS bit operations are "a hack".
ECMAScript does actually have a concept of an integer type, but values are implicitly coerced to or from a double-precision floating-point value as necessary (if the number represented is too large or if it has a fractional component).
Many mainstream Javascript interpreters (SpiderMonkey is an example) take a shortcut in implementation and interpret all numeric values as doubles to avoid checking the actual native type of the value for each instruction. As a result of the implementation hack, bit operations are implemented as a cast to an integral type followed by a cast back to a double representation. It is therefore not a good idea to use bit-level operations in Javascript and you won't get a performance boost anyway.
are bitwise shifts on integers "safe" in JavaScript?
Only for integers that fit within 32 bits (31+sign). Unlike, say, Python, you can't get 1<<40.
This is how the bitwise operators are defined to work by ECMA-262, even though JavaScript Numbers are actually floats. (Technically, double-precision floats, giving you 52 bits of mantissa, easily enough to cover the range of a 32-bit int.)
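A quick demonstration of the 32-bit limit (the shift count itself is also reduced mod 32):
console.log(1 << 40);  // 256: the count 40 becomes 40 % 32 = 8, so this is 1 << 8
console.log(2 ** 40);  // 1099511627776: ordinary arithmetic works fine past 32 bits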
There is no issue of 'endianness' involved in bitwise arithmetic, and no byte-storage format where endianness could be involved is built into JavaScript.
JavaScript doesn't have an integer type, only a floating point type. You can never get close enough to the implementation details to worry about this.
