Probably a silly question, but I was wondering what implementation is behind arrays in JavaScript? Are they SLL (Singly Linked List) or DLL (Doubly Linked List) or something different?
You can't really tell; it varies between implementations (V8, Rhino, etc.) and depends on what each engine's developers preferred.
Depends on the browser. Why does this matter?
Could be just a hash table.
And yes - it was a silly question
Taken from V8 object.h:
// The JSArray describes JavaScript Arrays
// Such an array can be in one of two modes:
// - fast, backing storage is a FixedArray and length <= elements.length();
// Please note: push and pop can be used to grow and shrink the array.
// - slow, backing storage is a HashTable with numbers as keys.
class JSArray: public JSObject {
So in V8, an array is in the best case a C++ array allocated on the heap, and otherwise a HashTable; which one you get depends on the engine and its optimization processes.
A JavaScript array is simply an Object.
Arrays are list-like objects whose prototype has methods to perform traversal and mutation operations.
From MDN - array
Edit
As @Alnitak's comment points out, the above says nothing about implementation. How an Object is implemented in JS depends on the JavaScript engine.
Related
JavaScript provides a variety of data structures to be used ranging from simple objects over arrays, sets, maps, the weak variants as well as ArrayBuffers.
Over the past half year I found myself needing to recreate some of the more common structures, such as deques, count maps and, mostly, different variants of trees.
While looking at the Ecma specification I could not find a description of how arrays are implemented at the memory level. Presumably this is up to the underlying engine?
Contrary to languages I am used to, arrays in JavaScript have a variable length, similar to lists. Does that mean that elements are not necessarily stored next to each other in memory? Do splice, push and pop actually result in a new allocation once a certain threshold is reached, similar to, for example, ArrayLists in Java? I am wondering whether arrays are the way to go for queues and stacks, or whether actual list implementations with references to the next element might be better suited in JavaScript in some cases (e.g. regarding their overhead compared to the native implementation of arrays).
If someone has some more in-depth literature, please feel encouraged to link them here.
While looking at the Ecma specification I could not find a description of how arrays are implemented at the memory level. Presumably this is up to the underlying engine?
The ECMAScript specification does not specify or require a specific implementation. That is up to the engine that implements the array to decide how best to store the data.
Arrays in the V8 engine have multiple forms based on how the array is being used. A sequential array with no holes that contains only one data type is highly optimized into something similar to an array in C++. But, if it contains mixed types or if it contains holes (blocks of the array with no value - often called a sparse array), it would have an entirely different implementation structure. And, as you can imagine it may be dynamically changed from one implementation type to another if the data in the array changes to make it incompatible with its current optimized form.
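To make the transitions described above concrete, here is a small sketch. The kind names in the comments (PACKED_SMI_ELEMENTS and so on) come from V8's published description of its elements kinds; they are specific to that engine and cannot be observed from plain JavaScript, so treat the comments as assumptions about what V8 is likely to do rather than guaranteed behavior.

```javascript
// Dense array of small integers: V8 can store this compactly (PACKED_SMI_ELEMENTS).
const packed = [1, 2, 3];

// Adding a fractional number generalizes the storage to doubles (PACKED_DOUBLE_ELEMENTS).
packed.push(4.5);

// Adding a string generalizes it again to arbitrary values (PACKED_ELEMENTS).
packed.push("five");

// Writing far past the end leaves holes, so the array becomes "holey".
const holey = [1, 2, 3];
holey[9] = 10;

// The language-level behavior is the same either way; only the storage differs.
console.log(packed.length); // 5
console.log(holey.length);  // 10
console.log(3 in holey);    // false: index 3 is a hole, not a stored undefined
console.log(holey[3]);      // undefined
```

Nothing in the output reveals which representation was chosen, which is why these effects only show up in performance measurements.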
Since arrays offer indexed, random access, they are not implemented internally as linked lists, which have no efficient way to do random, indexed access.
Growing an array may require reallocating a larger block of memory and copying the existing array into it. Calling something like .splice() to remove items will have to copy portions of the array down to the lower position.
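The reallocate-and-copy behavior can be sketched in plain JavaScript. GrowableArray below is a hypothetical class written for illustration, not how any engine actually stores arrays; the doubling growth policy and the initial capacity of 4 are assumptions, and a plain Array of fixed length stands in for a raw block of memory.

```javascript
// Hypothetical GrowableArray: a fixed-capacity backing store that doubles when full.
class GrowableArray {
  constructor(initialCapacity = 4) {
    this.store = new Array(initialCapacity); // stands in for a fixed block of memory
    this.length = 0;
    this.reallocations = 0;
  }

  push(value) {
    if (this.length === this.store.length) {
      // Full: allocate a block twice as large and copy every element over.
      const bigger = new Array(this.store.length * 2);
      for (let i = 0; i < this.length; i++) bigger[i] = this.store[i];
      this.store = bigger;
      this.reallocations++;
    }
    this.store[this.length++] = value;
  }

  // Remove the element at index by shifting everything after it down one slot.
  removeAt(index) {
    if (index < 0 || index >= this.length) return undefined;
    const removed = this.store[index];
    for (let i = index; i < this.length - 1; i++) this.store[i] = this.store[i + 1];
    this.store[--this.length] = undefined;
    return removed;
  }
}

const list = new GrowableArray();
for (let i = 0; i < 100; i++) list.push(i);
console.log(list.reallocations); // 5 (capacity 4 -> 8 -> 16 -> 32 -> 64 -> 128)
console.log(list.removeAt(0));   // 0
console.log(list.store[0]);      // 1
```

With doubling, 100 pushes trigger only five copies, whereas removing from the front touches every remaining element each time.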
Whether or not it makes more sense to use your own linked list implementation for a queue instead of an array depends upon a bunch of things. If the queue gets very large, then it may be faster to deal with the individual allocations of a list, to avoid having to copy large portions of the queue around in order to manipulate it. If the queue never gets very large, then the overhead of moving data in an array is small, and the extra complication of a linked list and the extra allocations involved in it may not be worth it.
As an extreme example, a very large FIFO queue would not be particularly optimal as an array. You'd be adding items at one end and removing them from the other, and inserting or removing an item at the bottom end requires copying the entire array down. If the length changed regularly, the engine would probably have to reallocate regularly too. Whether that copying overhead is relevant in your app would need to be checked with an actual performance test to see if it is worth doing something about.
But if your queue always holds the same data type and never has any holes in it, V8 can optimize it to a C++-style block of memory, and calling .splice() on that to remove an item can be highly optimized (using CPU block move instructions), which can be very, very fast. So you'd really have to test to decide whether it is worth trying to optimize beyond an array.
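For comparison when running such a test, here is a minimal sketch of the linked-list alternative next to the array version. LinkedQueue is a hypothetical class written for this illustration; it trades the copying done by Array.prototype.shift for one small object allocation per item.

```javascript
// Hypothetical LinkedQueue: O(1) enqueue and dequeue, one small allocation per item.
class LinkedQueue {
  constructor() {
    this.head = null; // next node to dequeue
    this.tail = null; // most recently enqueued node
    this.size = 0;
  }

  enqueue(value) {
    const node = { value, next: null };
    if (this.tail) this.tail.next = node;
    else this.head = node;
    this.tail = node;
    this.size++;
  }

  dequeue() {
    if (!this.head) return undefined;
    const { value } = this.head;
    this.head = this.head.next;
    if (!this.head) this.tail = null;
    this.size--;
    return value;
  }
}

// The same FIFO behavior using a plain array.
const arrayQueue = [];
arrayQueue.push("a", "b", "c");
const firstFromArray = arrayQueue.shift(); // may move the remaining elements down

const linkedQueue = new LinkedQueue();
["a", "b", "c"].forEach((item) => linkedQueue.enqueue(item));
const firstFromList = linkedQueue.dequeue(); // just re-points head

console.log(firstFromArray, firstFromList); // a a
```

Which of the two is faster for a given queue size is exactly the kind of thing that has to be measured rather than assumed.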
Here's a very good talk on how V8 stores and optimizes arrays:
Elements Kinds in V8
Here are some other reference articles on the topic:
How do JavaScript arrays work under the hood
V8 array source code
Performance tips in V8
How does V8 optimize large arrays
I am new to JavaScript, and lately I found out that arrays in JavaScript are like lists in Java and that they can contain values of different types.
My question is whether arrays in JavaScript are made of pointers. How is it possible to have different types in the same array, given that we must define the array size before we assign the variables?
I have tried to find some information on Google, but all I have found are examples of arrays.
You do not have to define the array size before you assign the variables. You can go like:
let array = [];
array.push(12);
array.push("asd");
array.push({data:5});
array.forEach(element => {
    console.log(element);
});
Also, I don't think you should think in terms of pointers in such a high-level language. It is better to think of values as either 'primitives' or 'objects'. Here is a good read about it:
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Data_structures
High-level languages, and scripting languages in particular, tend to reference most things with pointers and make pointer access transparent. JavaScript does this too. Almost everything can be used like an object: even primitives such as numbers and strings, which are not objects themselves, are wrapped in a temporary object when you access a property on them. Objects in JavaScript have properties that store things. Those properties are essentially pointers, in that they are references to other values. Arrays are implemented the same way: they are objects with numeric properties, plus a length property and a few utility methods that a standard object doesn't have, such as .push() and .map(). Arrays don't have a fixed size any more than objects do. So everything in JavaScript is stored in these object "buckets" that can hold anything in their properties (although you can seal or freeze an object so that it doesn't change accidentally).
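A few lines are enough to see that arrays behave as objects with index-like property keys. This sketch only uses standard language features; the variable names are arbitrary.

```javascript
const arr = ["a", "b"];

console.log(typeof arr);          // "object"
console.log(Array.isArray(arr));  // true

// Indices are property keys, and they are strings like any other key.
const keysBefore = Object.keys(arr);
console.log(keysBefore);          // [ '0', '1' ]
console.log(arr["1"] === arr[1]); // true: same property either way

// Arrays accept ordinary properties too; only index-like keys affect length.
arr.label = "letters";
const lengthAfterLabel = arr.length;
console.log(lengthAfterLabel);    // 2

// Assigning past the end grows the array; no size was declared up front.
arr[4] = "e";
const lengthAfterGrow = arr.length;
console.log(lengthAfterGrow);     // 5

// Shrinking length discards the elements beyond it.
arr.length = 1;
console.log(arr[4]);              // undefined
```

This describes how arrays behave at the language level; an engine remains free to store the indexed elements in a contiguous block internally.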
Languages with fixed data types (C like languages for instance) implement things with fixed data structures, and the exact size is easily calculable and known. When you declare a variable, the compiler uses the type of that variable to reserve some space in memory. Javascript handles all that for you and doesn't assume anything is a fixed size, because it can't. The size of javascript objects can change at any time.
In C-like languages, when you ask for an array, you are asking for a block of a specific size. The compiler needs to know how big that is so that it can determine where in memory to put everything, and it can use the type of the objects in the array to calculate that easily. Interpreted languages use pointers behind the scenes to keep track of where everything is stored, because they can't assume it will always be in the same place, as a compiled program can. (This is somewhat of a simplification, and there are caveats to it, of course.)
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array
JavaScript is a loosely typed language, so there is nothing stopping you from having different types in a JavaScript array. However, I would strongly avoid structuring your data that way without static type-checking (TypeScript).
const test = ['test', {test:'test'}, 1, true]
While developing a compiler that translates a language very similar to JavaScript into C++, I need a way to represent data structures. JavaScript's main data structures are arrays and hash tables. Arrays are more straightforward: I can use a vector of untyped pointers. It needs to be a vector because JS arrays are dynamic, and of pointers because JS arrays can hold any kind of object, for example:
var array = [1,2,[3,4],"test"];
I can't see a way to represent this other than that (is there?). For the hashes, I could use something similar, except including the string hashing step on access.
The problem is: JavaScript hashes are JIT-compiled into actual C++ objects which probably are much faster than hashes. This way, I'm afraid my attempt to generate C++ like that will actually result in slower code than the JavaScript version!
Does that make sense?
What would be the best approach to my compiler?
If this is an AOT compiler, you can obviously only process the hash keys that you see at compile time. In that case you can turn hash accesses with known keys into array accesses by giving each known key a small integer index.
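The key-to-index idea can be sketched as follows. buildSlotTable and the sample keys are hypothetical names invented for this illustration, and the example is written in JavaScript rather than generated C++ purely to keep it short.

```javascript
// Hypothetical compile-time step: assign each key seen in the source a slot number.
function buildSlotTable(knownKeys) {
  const slots = new Map();
  for (const key of knownKeys) {
    if (!slots.has(key)) slots.set(key, slots.size);
  }
  return slots;
}

// Keys the (imaginary) compiler found while scanning the program.
const slots = buildSlotTable(["name", "age", "name", "email"]);

// At run time an "object" is then just an array indexed by slot number.
const person = new Array(slots.size);
person[slots.get("name")] = "Ada"; // compiled form of person.name = "Ada"
person[slots.get("age")] = 36;     // compiled form of person.age = 36

console.log(slots.get("email"));        // 2
console.log(person[slots.get("name")]); // Ada
```

In a real compiler the slots.get(...) lookups would be resolved during compilation, so the emitted code would contain only literal integer indices. Keys computed at run time would still need a fallback hash table.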
I am using node.js as my server platform and I need to process a non sparse array of 65,000 items.
JavaScript arrays are not true arrays, but actually hashes. Index access is accompanied by a conversion of the index to a string followed by a hash lookup (see the Arrays section in http://www.crockford.com/javascript/survey.html).
So, my question is this: does node.js implement a real array? One that is costly to resize or delete items from, but offers true random access without any index-to-string-then-hash-lookup?
Thanks.
EDIT
I may be asking for too much, but my array stores JavaScript objects, not numbers. And I cannot break it into many typed arrays, each holding number primitives or strings, because the objects have nested subobjects. Trying to use typed arrays would result in unmaintainable code.
EDIT2
I must be missing something. Why does it have to be all or nothing? Either true JavaScript with no true arrays, or a C-style extension with no JavaScript benefits. Does having a true array of (untyped) JavaScript objects contradict the nature of JavaScript in any way? Java and C# have List<Object>, which is essentially what I am looking for. C# comes even closer with List<DynamicObject>.
Node.js has the JavaScript typed arrays: Int8Array, Uint8Array, Int16Array, Uint16Array, Int32Array, Uint32Array, Float32Array and Float64Array.
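A short sketch of how a typed array differs from a regular array, using Int32Array as the example. Note that typed arrays hold only numbers, so they do not directly address the requirement in the question to store objects.

```javascript
const ints = new Int32Array(4); // four 32-bit integers, all initialized to 0

ints[0] = 42;
ints[1] = 1.7;                  // stored as 1: the fraction is discarded
ints[10] = 99;                  // out of range: silently ignored

console.log(ints.length);       // 4
console.log(ints[1]);           // 1
console.log(ints[10]);          // undefined
console.log(Array.from(ints));  // [ 42, 1, 0, 0 ]
```

The length is fixed at construction, which is what allows the engine to back the array with a single contiguous buffer.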
I think they are what you are asking for.
Node.js does offer a Buffer class that is probably what you're looking for:
A Buffer is similar to an array of integers but corresponds to a raw memory allocation outside the V8 heap. A Buffer cannot be resized.
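A minimal sketch of Buffer used as a fixed-size array of bytes. It assumes a Node.js runtime, where Buffer is available as a global, and uses Buffer.alloc, which exists in current Node.js versions.

```javascript
const buf = Buffer.alloc(4); // four zero-filled bytes

buf[0] = 255;
buf[1] = 256;                // wraps to 0: each element holds one byte

console.log(buf.length);                // 4
console.log(buf[0], buf[1]);            // 255 0
console.log(buf instanceof Uint8Array); // true in current Node.js versions
```

Like typed arrays, a Buffer stores raw numbers only, so arbitrary JavaScript objects would have to be serialized before they could be placed in one.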
Not intrinsically, no.
However depending on your level of expertise, you could write a "true" array extension using Node's C/C++ extension facility. See http://nodejs.org/api/addons.html
You want to use Low Level JavaScript (LLJS) to manipulate everything directly in C-style.
http://mbebenita.github.com/LLJS/
Note that, according to the link above, an LLJS array is more like the array you are looking for (a true C-like array) than a JavaScript array.
There is an implementation of LLJS available for Node.js, so maybe you do not have to write your own node.js C extension. Perhaps this implementation will do the trick: https://github.com/mbebenita/LLJS
Take the following code example:
var myObject = {};
var i = 100;
while (i--) {
    myObject["foo"+i] = new Foo(i);
}
console.log(myObject["foo42"].bar());
I have a few questions.
What kind of data structure do the major engines (IE, Mozilla, Chrome, Safari) use for storing key-value pairs? I'd hope it's some kind of binary search tree, but I think they may use linked lists (because iteration is done in insertion order).
If they do use a search tree, is it self-balancing? The above code with a conventional search tree will create an unbalanced tree, causing a worst case of O(n) for searching, rather than O(log n) for a balanced tree.
I'm only asking this because I will be writing a library which will require efficient retrieval of keys from a data structure, and while I could implement my own or an existing red-black tree I would rather use native object properties if they're efficient enough.
The question is hard to answer for a couple of reasons. First, modern browsers all heavily and dynamically optimize code while it is executing, so the algorithms chosen to access the properties might be different for the same code. Second, each engine uses different algorithms and heuristics to determine which access algorithm to use. Third, the ECMA specification dictates what the result must be, not how the result is achieved, so the engines have a lot of freedom to innovate in this area.
That said, given your example, all the engines I am familiar with will use some form of hash table to retrieve the value associated with foo42 from myObject. If you use an object like an associative array, JavaScript engines will tend to favor a hash table. None that I am aware of use a tree for string properties. Hash tables are worst case O(N) and best case O(1), and tend to be closer to O(1) than O(N) if the hash function is any good. Each engine will have some pattern of keys that makes it perform at O(N), but that pattern will be different for each engine. A balanced tree would guarantee worst case O(log N), but every modification then pays the extra cost of keeping the tree balanced, and hash tables usually do better than O(log N) for string keys. A hash table update is O(1) (once you determine you need to update, which has the same big-O as a read) if there is space in the table. Rebuilding a full table is O(N), but tables usually double in size, which means you pay that O(N) cost only a handful of times, perhaps 7 or 8, over the life of the table.
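The doubling argument can be made concrete with a toy hash table. ChainedHashTable is a hypothetical class written for this illustration; the hash function, the initial size of 8 buckets and the grow-when-full policy are all assumptions, and real engines use considerably more sophisticated designs.

```javascript
// Hypothetical ChainedHashTable: string keys, separate chaining, doubling on growth.
class ChainedHashTable {
  constructor(bucketCount = 8) {
    this.buckets = Array.from({ length: bucketCount }, () => []);
    this.size = 0;
    this.rebuilds = 0;
  }

  // Simple illustrative string hash; real engines use stronger functions.
  hash(key, bucketCount = this.buckets.length) {
    let h = 0;
    for (let i = 0; i < key.length; i++) h = (h * 31 + key.charCodeAt(i)) >>> 0;
    return h % bucketCount;
  }

  set(key, value) {
    const bucket = this.buckets[this.hash(key)];
    const entry = bucket.find((e) => e.key === key);
    if (entry) {
      entry.value = value;
      return;
    }
    bucket.push({ key, value });
    this.size++;
    if (this.size > this.buckets.length) this.rebuild(); // keep chains short
  }

  get(key) {
    const entry = this.buckets[this.hash(key)].find((e) => e.key === key);
    return entry ? entry.value : undefined;
  }

  // O(N): double the bucket count and reinsert every entry.
  rebuild() {
    const bigger = Array.from({ length: this.buckets.length * 2 }, () => []);
    for (const bucket of this.buckets) {
      for (const e of bucket) bigger[this.hash(e.key, bigger.length)].push(e);
    }
    this.buckets = bigger;
    this.rebuilds++;
  }
}

const table = new ChainedHashTable();
for (let i = 0; i < 100; i++) table.set("foo" + i, i * 2);
console.log(table.get("foo42")); // 84
console.log(table.rebuilds);     // 4 (8 -> 16 -> 32 -> 64 -> 128 buckets)
```

One hundred inserts cause only four O(N) rebuilds here, and each further doubling of the entry count adds just one more.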
Numeric properties are special, however. If you access an object using integer numeric properties that have few or no gaps in range, that is, use the object like it is an array, the values will tend to be stored in a linear block of memory with O(1) access. Even if your access has gaps the engines will probably shift to a sparse array access which will probably be, at worst, O(log N).
Accessing a property by identifier is also special. If you access the property like,
myObject.foo42
and execute this code often (that is, its speed matters) with the same or similar objects, the access is likely to be optimized into one or two machine instructions. What makes objects similar also differs between engines, but objects constructed by the same literal or function are more likely to be treated as similar.
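As an illustration of "similar" objects, the sketch below contrasts objects built by one constructor with objects assembled in different ways. Point and sumX are hypothetical names, and whether a given engine really optimizes the first case better is an assumption to be confirmed by measurement; both calls produce the same result.

```javascript
// Objects built by the same constructor get the same properties in the same order.
function Point(x, y) {
  this.x = x;
  this.y = y;
}

// Called repeatedly with same-shaped objects, this access is a good candidate
// for the engine's property-access optimizations.
function sumX(points) {
  let total = 0;
  for (const p of points) total += p.x;
  return total;
}

const similar = [new Point(1, 2), new Point(3, 4), new Point(5, 6)];

// Same data, but each object was assembled differently, so an engine may
// treat them as different kinds of object.
const dissimilar = [{ x: 1, y: 2 }, { y: 4, x: 3 }, { x: 5, y: 6, z: 0 }];

console.log(sumX(similar));    // 9
console.log(sumX(dissimilar)); // 9
```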
No engine that does at all well on the JavaScript benchmarks will use the same algorithm for every object. They all must dynamically determine how the object is being used and try to adjust the access algorithm accordingly.