Are all Blockchain array implementations incorrect?

I found many blockchain implementations on the web, but are they true blockchains that can scale?
Here we can see that the blockchain is started as an array:
var blockchain = [getGenesisBlock()];
Here we can see the same implementation:
constructor() {
  this.chain = [this.createGenesis()];
}
This article also recommends it:
constructor(genesisNode) {
  this.chain = [this.createGenesisBlock()];
}
However, are any of these implementations ready to scale?
Technically, according to maerics,
the maximum length of an array according to the ECMA-262 5th Edition
specification is bound by an unsigned 32-bit integer due to the
ToUint32 abstract operation, so the longest possible array could have
2^32 - 1 = 4,294,967,295 = 4.29 billion elements.
The size is not a problem. Ethereum has used 'only' 7 million blocks and Bitcoin 'only' 500k, so there is enough space for the future. The real problem I'm thinking of is: how long would it take to read the last element of the array, and would this be scalable?
In a blockchain, each new block always needs to read the hash of the last block, so I assume that as the chain grows it takes longer and longer to do so.
What would Bitcoin and/or Ethereum do if their blockchain array of blocks ran out of space to store blocks? Would the blockchain just end there?

The scalability problem comes from the cost of validating transactions and reaching consensus between the nodes, not from the cost of accessing a particular block. Reading the last element of an array takes the same time however long the array is.
Blockchains are not arrays. Conceptually, think of a blockchain more like a linked list, where each block refers to the previous one by its hash.
There is no limit to the number of blocks (though there is one for the number of coins). The space to store those blocks is also not limited.
To answer the question
Yes, all the implementations given in the question are incorrect/insufficient for a blockchain to work. For real implementations you can refer to Bitcoin's repository or Ethereum's.
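The linked-list view of a blockchain can be sketched roughly as follows. This is only an illustration under stated assumptions, not how Bitcoin or Ethereum store blocks: the names hashBlock, createBlock, blocks, tip and chainLength are hypothetical, and Node's built-in crypto module is assumed for SHA-256.

```javascript
// Hypothetical sketch: blocks linked by hash rather than by array position.
const { createHash } = require("crypto");

// Hash a block's data together with the hash of the block before it.
function hashBlock(previousHash, data) {
  return createHash("sha256")
    .update(previousHash + JSON.stringify(data))
    .digest("hex");
}

// The genesis block has no predecessor, so it points at an all-zero hash.
function createBlock(previousBlock, data) {
  const previousHash = previousBlock ? previousBlock.hash : "0".repeat(64);
  return { previousHash, data, hash: hashBlock(previousHash, data) };
}

// Blocks are looked up by hash; only the tip has to be remembered.
const blocks = new Map();
let tip = createBlock(null, "genesis");
blocks.set(tip.hash, tip);

for (const data of ["tx-a", "tx-b"]) {
  tip = createBlock(tip, data);
  blocks.set(tip.hash, tip);
}

// Walk backwards from the tip to the genesis block, as in a linked list.
function chainLength(tipBlock) {
  let length = 0;
  for (let b = tipBlock; b; b = blocks.get(b.previousHash)) length++;
  return length;
}

console.log(chainLength(tip));
```

Appending a block only needs the current tip, so the cost of adding to the chain does not depend on how many blocks already exist.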

Related

How are arrays implemented in JavaScript? What happened to the good old lists?

JavaScript provides a variety of data structures, ranging from simple objects to arrays, sets, maps, their weak variants, and ArrayBuffers.
Over the past half year I have found myself recreating some of the more common structures, like deques, count maps and, mostly, different variants of trees.
While looking at the Ecma specification I could not find a description of how arrays are implemented at the memory level. Supposedly this is up to the underlying engine?
Contrary to the languages I am used to, arrays in JavaScript have a variable length, similar to a list. Does that mean that elements are not necessarily aligned next to each other in memory? Do splice, push and pop actually result in a new allocation once a certain threshold is reached, similar to, for example, ArrayLists in Java? I am wondering whether arrays are the way to go for queues and stacks, or whether actual list implementations with references to the next element might be better suited in JavaScript in some cases (e.g. regarding overhead compared to the native implementation of arrays).
If someone has some more in-depth literature, please feel encouraged to link them here.

While looking at the Ecma specification I could not find a description of how arrays are implemented at the memory level. Supposedly this is up to the underlying engine?
The ECMAScript specification does not specify or require a specific implementation. That is up to the engine that implements the array to decide how best to store the data.
Arrays in the V8 engine have multiple forms based on how the array is being used. A sequential array with no holes that contains only one data type is highly optimized into something similar to an array in C++. But, if it contains mixed types or if it contains holes (blocks of the array with no value - often called a sparse array), it would have an entirely different implementation structure. And, as you can imagine it may be dynamically changed from one implementation type to another if the data in the array changes to make it incompatible with its current optimized form.
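As a rough illustration of the cases described above, the following sketch builds a dense single-type array, an array with holes, and a mixed-type array. The internal representation cannot be inspected from plain JavaScript, so the comments about V8 are assumptions based on V8's published description of elements kinds; only the presence or absence of a hole is directly observable.

```javascript
// Dense array of small integers: a candidate for the most compact representation.
const packed = [1, 2, 3];

// Writing far past the end leaves holes at indexes 3 through 8.
const holey = [1, 2, 3];
holey[9] = 10;

// Mixed element types call for a more generic representation.
const mixed = [1, "two", 3.5];

// Holes are observable: a missing index is simply not a property of the array.
console.log(2 in packed);  // true
console.log(5 in holey);   // false
console.log(holey.length); // 10
```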
Since arrays have indexed, random access, they are not implemented as linked lists internally, because linked lists don't have an efficient way to do random, indexed access.
Growing an array may require reallocating a larger block of memory and copying the existing array into it. Calling something like .splice() to remove items will have to copy portions of the array down to the lower position.
Whether or not it makes more sense to use your own linked list implementation for a queue instead of an array depends upon a bunch of things. If the queue gets very large, then it may be faster to deal with the individual allocations of a list to avoid having to copy large portions of the queue around in order to manipulate it. If the queue never gets very large, then the overhead of moving data in an array is small, and the extra complication of a linked list and the extra allocations involved in it may not be worth it.
As an extreme example, a very large FIFO queue would not be particularly optimal as an array: you'd be adding items at one end and removing them from the other, which requires copying the entire array down each time an item is inserted or removed at the bottom end, and if the length changed regularly, the engine would probably have to reallocate regularly too. Whether that copying overhead is relevant in your app would need to be measured with an actual performance test to see if it is worth doing something about.
But if your queue always holds a single data type and never has any holes in it, then V8 can optimize it to a C++ style block of memory, and calling .splice() on that to remove an item can be highly optimized (using CPU block move instructions), which can be very, very fast. So you'd really have to test to decide if it is worth trying to optimize beyond an array.
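For comparison, a linked-list FIFO queue of the kind discussed above might look like the following minimal sketch. The class name LinkedQueue and its methods are hypothetical; whether it actually beats a plain array with push() and shift() would have to be measured for the workload in question.

```javascript
// Hypothetical linked-list FIFO queue: enqueue and dequeue never move other items.
class LinkedQueue {
  constructor() {
    this.head = null; // next node to dequeue
    this.tail = null; // most recently enqueued node
    this.size = 0;
  }

  enqueue(value) {
    const node = { value, next: null };
    if (this.tail) this.tail.next = node;
    else this.head = node;
    this.tail = node;
    this.size++;
  }

  dequeue() {
    if (!this.head) return undefined;
    const { value } = this.head;
    this.head = this.head.next;
    if (!this.head) this.tail = null;
    this.size--;
    return value;
  }
}

const queue = new LinkedQueue();
queue.enqueue("a");
queue.enqueue("b");
queue.enqueue("c");
const first = queue.dequeue();

// The array-based equivalent, where shift() may have to move the remaining items.
const arrayQueue = ["a", "b", "c"];
const arrayFirst = arrayQueue.shift();

console.log(first, arrayFirst);
```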
Here's a very good talk on how V8 stores and optimizes arrays:
Elements Kinds in V8
Here are some other reference articles on the topic:
How do JavaScript arrays work under the hood
V8 array source code
Performance tips in V8
How does V8 optimize large arrays

What is the size (in memory) of a number?

What is the size of a number in JavaScript?
For example, I know a single char in C is 1 byte. The size of an int is sizeof(int). The size of an int64_t is 64 bits, and so on.
What is the size of a number (decimal, float) in JavaScript, and how can I find it?

You can't determine the in-memory size of a number value from JS. It is engine-specific and can differ between values within the same engine. The ECMAScript standard (ECMA-262) only defines the observable behavior of numbers; as long as the behavior matches the specification, different JS VMs use all kinds of number representations under the hood for optimization purposes.
The standard sets no limits on what engines can use and defines no method to retrieve those implementation details, and no other part of the spec relies on anything except observable behavior. You can check engine-specific details in the engine's documentation or try engine-specific debugging tools, but you can't get this size data from JS code itself.
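What can be checked from code is the observable behavior, plus the element size of typed arrays, which is fixed by the specification. The sketch below is only an illustration and says nothing about how a particular engine stores an ordinary number; the variable names are made up for the example.

```javascript
// Observable behavior: Number values act like IEEE 754 double-precision floats.
const maxSafe = Number.MAX_SAFE_INTEGER;        // 2^53 - 1
const losesPrecision = 2 ** 53 + 1 === 2 ** 53; // integers above 2^53 are not all exact

// Typed arrays are different: their element size is fixed and exposed.
const bytesPerFloat64 = Float64Array.BYTES_PER_ELEMENT;
const bytesPerInt16 = Int16Array.BYTES_PER_ELEMENT;
const totalBytes = new Float64Array(4).byteLength;

console.log(maxSafe, losesPrecision, bytesPerFloat64, bytesPerInt16, totalBytes);
```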

Aside from what's mentioned in other answers, the reality is that modern engines use various optimizations, including storing numbers in different representations (types) depending on usage. This is one of the main ideas behind things like asm.js. To give a simple example:
var i = 0;
while (i < 5) {
  console.log('hello');
  i++;
}
The engine can infer that i is an integer and optimize its usage.

How would you explain Javascript Typed Arrays to someone with no programming experience outside of Javascript?

I have been messing with Canvas a lot lately, developing some ideas I have for a web-based game. As such I've recently run into Javascript Typed Arrays. I've done some reading for example at MDN and I just can't understand anything I'm finding. It seems most often, when someone is explaining Typed Arrays, they use analogies to other languages that are a little beyond my understanding.
My experience with "programming," if you can call it that (and not just front-end scripting), is pretty much limited to Javascript. I do feel as though I understand Javascript pretty well outside of this instance, however. I have deeply investigated and used the Object.prototype structure of Javascript, and more subtle factors such as variable referencing and the value of this, but when I look at any information I've found about Typed Arrays, I'm just lost.
With this frame-of-reference in mind, can you describe Typed Arrays in a simple, usable way? The most effective depicted use-case, for me, would be something to do with Canvas image data. Also, a well-commented Fiddle would be most appreciated.

In typed programming languages (to which JavaScript kinda belongs) we usually have variables of fixed declared type that can be dynamically assigned values.
With Typed Arrays it's quite the opposite.
You have a fixed chunk of data (represented by an ArrayBuffer) that you do not access directly. Instead, this data is accessed through views. Views are created at run time, and they effectively declare some portion of the buffer to be of a certain type. These views are sub-classes of ArrayBufferView. A view defines a certain contiguous portion of this chunk of data as elements of an array of a certain type. Once the type is declared, the browser knows the length and content of each element, as well as the number of such elements. With this knowledge, browsers can access individual elements much more efficiently.
So we are dynamically assigning a type to a portion of what is actually just a buffer. We can assign multiple views to the same buffer.
From the Specs:
Multiple typed array views can refer to the same ArrayBuffer, of different types,
lengths, and offsets.
This allows for complex data structures to be built up in the ArrayBuffer.
As an example, given the following code:
// create an 8-byte ArrayBuffer
var b = new ArrayBuffer(8);
// create a view v1 referring to b, of type Int32, starting at
// the default byte index (0) and extending until the end of the buffer
var v1 = new Int32Array(b);
// create a view v2 referring to b, of type Uint8, starting at
// byte index 2 and extending until the end of the buffer
var v2 = new Uint8Array(b, 2);
// create a view v3 referring to b, of type Int16, starting at
// byte index 2 and having a length of 2
var v3 = new Int16Array(b, 2, 2);
The following buffer and view layout is created:
This defines an 8-byte buffer b, and three views of that buffer, v1,
v2, and v3. Each of the views refers to the same buffer -- so v1[0]
refers to bytes 0..3 as a signed 32-bit integer, v2[0] refers to byte
2 as an unsigned 8-bit integer, and v3[0] refers to bytes 2..3 as a
signed 16-bit integer. Any modification to one view is immediately
visible in the other: for example, after v2[0] = 0xff; v2[1] = 0xff;
then v3[0] == -1 (where -1 is represented as 0xffff).
So instead of declaring data structures and filling them with data, we take data and overlay it with different data types.
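Since the question asks about Canvas image data, here is a minimal sketch of the same idea applied to pixels. It assumes a tiny 2x1 image laid out the way canvas pixel data is, four bytes per pixel in RGBA order, and builds the buffer by hand so that it runs without a canvas; the invert helper is hypothetical.

```javascript
// Assumption: a 2x1 image stored as 4 bytes per pixel in RGBA order.
const width = 2;
const height = 1;
const buffer = new ArrayBuffer(width * height * 4);

// One array element per colour channel; values are clamped to 0..255.
const pixels = new Uint8ClampedArray(buffer);

// First pixel: opaque red. Second pixel: opaque blue.
pixels.set([255, 0, 0, 255, 0, 0, 255, 255]);

// Hypothetical helper: invert the colour channels and leave alpha untouched.
function invert(data) {
  for (let i = 0; i < data.length; i += 4) {
    data[i] = 255 - data[i];         // red
    data[i + 1] = 255 - data[i + 1]; // green
    data[i + 2] = 255 - data[i + 2]; // blue
  }
}
invert(pixels);

// A second view over the same buffer sees the change immediately.
const bytes = new Uint8Array(buffer);
console.log(Array.from(bytes)); // [0, 255, 255, 255, 255, 255, 0, 255]

// Out-of-range writes through the clamped view are clamped rather than wrapped.
pixels[0] = 300;
console.log(pixels[0]); // 255
```

In a browser, ctx.getImageData(0, 0, width, height).data is a Uint8ClampedArray with this layout, and ctx.putImageData() writes the modified pixels back to the canvas.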

I spend all my time in javascript these days, but I'll take a stab at quick summary, since I've used typed arrays in other languages, like Java.
The closest thing I think you'll find in the way of comparison, when it comes to typed arrays, is a performance comparison. In my head, Typed Arrays enable compilers to make assumptions they can't normally make. If someone is optimizing things at the low level of a javascript engine like V8, those assumptions become valuable. If you can say, "Data will always be of size X," (or something similar), then you can, for instance, allocate memory more efficiently, which lets you (getting more jargon-y, now) reduce how many times you go to access memory and it's not in a CPU cache. Accessing CPU cache is much faster than having to go to RAM, I believe. When doing things at a large scale, those time savings add up quick.
If I were to do up a jsfiddle (no time, sorry), I'd be comparing the time it takes to perform certain operations on typed arrays vs non-typed arrays. For example, I imagine "adding 100,000 items" being a performance benchmark I'd try, to compare how the structures handle things.
What I can do is link you to: http://jsperf.com/typed-arrays-vs-arrays/7
All I did to get that was google "typed arrays javascript performance" and clicked the first item (I'm familiar with jsperf, too, so that helped me decide).
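The kind of comparison described above could be sketched roughly like this. It is a crude timing loop, not a proper benchmark: results vary by engine and machine, and the names N, now, fillAndSum and time are made up for the example.

```javascript
const N = 100000;

// Use the high-resolution timer where available, otherwise fall back to Date.now().
const now = () =>
  typeof performance !== "undefined" ? performance.now() : Date.now();

// Write N numbers into the array, then read them all back.
function fillAndSum(array) {
  for (let i = 0; i < N; i++) array[i] = i;
  let sum = 0;
  for (let i = 0; i < N; i++) sum += array[i];
  return sum;
}

// Time one run and return the sum so the work cannot be skipped.
function time(label, makeArray) {
  const start = now();
  const sum = fillAndSum(makeArray());
  const elapsed = now() - start;
  console.log(label + ": " + elapsed.toFixed(2) + " ms");
  return sum;
}

const plainSum = time("plain array", () => []);
const typedSum = time("Float64Array", () => new Float64Array(N));
```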

Information heap size

What information can I obtain from the performance.memory object in Chrome?
What do these numbers mean? (are they in kb's or characters)
What can I learn from these numbers?
Example values of performance.memory
MemoryInfo {
  jsHeapSizeLimit: 793000000,
  usedJSHeapSize: 10000000,
  totalJSHeapSize: 31200000
}

What information can I obtain from the performance.memory object in Chrome?
The property names should be pretty descriptive.
What do these numbers mean? (are they in kb's or characters)
The docs state:
The values are quantized as to not expose private information to
attackers.
See the WebKit Patch for how the quantized values are exposed. The
tests in particular help explain how it works.
What can I learn from these numbers?
You can identify problems with memory management. See http://www.html5rocks.com/en/tutorials/memory/effectivemanagement/ for how the performance.memory API was used in gmail.

The related API documentation does not say, but my read judging by the numbers you shared and what I see on my machine is that the values are in bytes.
A quick review of the code to which Bergi linked - regarding the values being quantized - seems to support this - e.g. float sizeOfNextBucket = 10000000.0; // First bucket size is roughly 10M..
The quantized MemoryInfo properties are mostly useful for monitoring vs. determining the precise impact of operations on memory. A comment in the aforementioned linked code explains this well I think:
// We quantize the sizes to make it more difficult for an attacker to see precise
// impact of operations on memory. The values are used for performance tuning,
// and hence don't need to be as refined when the value is large, so we threshold
// at a list of exponentially separated buckets.
Basically the values get less precise as they get bigger but are still sufficiently precise for monitoring memory usage.
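Assuming the values are indeed bytes, a small helper for monitoring could look like the sketch below. performance.memory is non-standard and Chrome-specific, so the code feature-detects it and otherwise falls back to the sample values from the question; the name describeHeap is hypothetical.

```javascript
// Convert a MemoryInfo-like object to whole megabytes, assuming the values are bytes.
function describeHeap(memory) {
  const toMB = (bytes) => Math.round(bytes / (1024 * 1024));
  return {
    limitMB: toMB(memory.jsHeapSizeLimit),
    totalMB: toMB(memory.totalJSHeapSize),
    usedMB: toMB(memory.usedJSHeapSize),
  };
}

// Sample values taken from the question.
const sample = {
  jsHeapSizeLimit: 793000000,
  usedJSHeapSize: 10000000,
  totalJSHeapSize: 31200000,
};

// Use the real object only where it exists.
const memory =
  typeof performance !== "undefined" && performance.memory
    ? performance.memory
    : sample;

console.log(describeHeap(memory));
```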

How to pre-allocate a dense array in Javascript?

When using the new Array(size) ctor, if size is not a constant, JS seems to create a sparse array in some browsers (at least in Chrome), causing access to be much slower than when using the default ctor, as shown here.
That is exactly the opposite of what I want: I pre-allocate an array of given size to avoid dynamic re-allocation and thereby improving performance. Is there any way to achieve that goal?
Please note that this question is not about the ambiguity of the new Array(size) ctor. I posted a recommendation on that here.

100000 is 1 past the pre-allocation threshold; 99999 still pre-allocates, and as you can see it's much faster:
http://jsperf.com/big-array-initialize/5

Pre-allocated vs. dynamically grown is only part of the story. For
preallocated arrays, 100,000 just so happens to be the threshold where
V8 decides to give you a slow (a.k.a. "dictionary mode") array.
Also, growing arrays on demand doesn't allocate a new array every time
an element is added. Instead, the backing store is grown in chunks
(currently it's grown by roughly 50% each time it needs to be grown,
but that's a heuristic that might change over time).
You can find more information here.
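The difference between reserving a length and actually populating the array can be seen from plain JavaScript, as in this sketch. Which internal mode an engine picks, and at what size, is an engine detail that may change between versions, so the code only demonstrates the observable part: whether the indexes exist.

```javascript
const size = 1000;

// new Array(size) only sets the length; every index is a hole until assigned.
const reserved = new Array(size);
console.log(0 in reserved); // false

// Growing with push() never creates holes.
const pushed = [];
for (let i = 0; i < size; i++) pushed.push(0);
console.log(0 in pushed); // true

// fill() assigns every index, although an engine may still remember
// that the array started out with holes.
const filled = new Array(size).fill(0);
console.log(0 in filled); // true
```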
