My project keeps a history of changes.
The history is an array of objects, where each object holds two arrays.
So when I add a history snapshot it looks like this (in reality I'm not pushing empty arrays):
history.push({ // new history moment
  firstP: [],
  secondP: [],
})
And the array firstP, for example, consists of objects like this:
{
  color: "red",
  move: 1,
  // ...and some other fields (max 14 fields, if it matters)
}
firstP and secondP usually hold thousands of objects each, so each history snapshot is fairly heavy in memory. So I added a limit:
const limitOfSteps = 50;
Now after every push I check whether the length of the history is greater than 50. If it is, I do history.shift();
But what I see in the memory profile is that even when shifting (removing the first element of the array), used memory keeps increasing. An element is added to the history whenever the user does something in the React app, so they can make as many changes as they want.
I know there is a garbage collector, but how does it work with arrays? Shifting the array should mean the element is gone (and gone from memory too?). But it's not gone immediately (if the user makes changes quickly, the whole app will run out of memory).
Would setting the removed element to undefined or null (just before shifting the array) free the memory sooner?
The main goal is to use less memory... does anyone know how?
Edit: The array may be shifted even a thousand times.
Edit 2: Maybe my question was all wrong? Maybe I should ask when the whole array is actually removed from memory?
It is all in the state of a React app. Slicing the history (making a copy) is probably more memory-consuming, but it is inevitable because state is immutable. My update method looks something like this:
updateHistory = (newElement) => {
  const history = this.state.history.slice();
  history.push(newElement);
  if (history.length > limitOfSteps) history.shift();
  this.setState({history: history});
}
Does any of this make sense?
Garbage collection is done automatically in JavaScript; it's not something you have to manage yourself. This is also why you see memory increasing when shifting items from your array even though you limited its size to 50: the memory is reclaimed on the engine's schedule, not at the moment the reference is dropped.
You can't force or prevent garbage collection, but when it runs, it takes care of removing values that have no references pointing to them.
An object is said to be "garbage", or collectible if there are zero references pointing to it.
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Memory_Management
In line with that approach, you can try removing the references to a value before shifting. Example:
var a = {some: 'object'};
var array = [a, 2, 3, 4];
a = null;        // drops the variable's reference (the array still holds one)
array.shift();   // now nothing references the object

// or
var array = [{some: 'object'}, 2, 3, 4];
array[0] = null; // drop the array's reference before shifting
array.shift();
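As a practical tweak to the updateHistory method from the question, you can trim and copy in one pass, so that at most limitOfSteps snapshots are ever referenced by the new array. A minimal sketch (my own, assuming the same component shape as in the question):

updateHistory = (newElement) => {
  // slice(-n) copies only the newest n snapshots; older ones lose
  // their last reference here and become collectible
  const history = this.state.history.slice(-(limitOfSteps - 1));
  history.push(newElement);
  this.setState({history: history});
}

The memory still won't drop the instant a snapshot falls off the window, but nothing in your code keeps it alive any longer.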
I have a data structure that is essentially a linked list stored in state. It represents a stream of changes (patches) to a base object. It is linked by key, rather than by object reference, to allow me to trivially serialise and deserialise the state.
It looks like this:
const latest = 'id4' // They're actually UUIDs, so I can't sort on them (text here for clarity)
const changes = {
  id4: {patch: {}, previous: 'id3'},
  id3: {patch: {}, previous: 'id2'},
  id2: {patch: {}, previous: 'id1'},
  id1: {patch: {}, previous: undefined},
}
At some times, a user chooses to run an expensive calculation and results get returned into state. We do not have results corresponding to every change but only some. So results might look like:
const results = {
  id3: {performance: 83.6},
  id1: {performance: 49.6},
}
Given the changes list, I need to get the results closest to its tip, in this case results.id3.
I've written a while loop to do this, and it's perfectly robust at present:
let id = latest
let referenceId = undefined
while (!!id) {
  if (!!results[id]) {
    referenceId = id
    id = undefined
  } else {
    id = changes[id].previous
  }
}
The approach is O(N) but that's the pathological case: I expect a long changelist but with fairly frequent results updates, such that you'd only have to walk back a few steps to find a matching result.
While loops can be vulnerable
Following the great work of Gene Kranz (read his book "Failure Is Not an Option" to understand why NASA never uses recursion!) I try to avoid while loops in code bases: they tend to be susceptible to inadvertent mistakes.
For example, a cycle in the previous pointers would make this loop run forever, and something as simple as delete changes.id1 would make it throw instead of terminating cleanly.
So, I'd like to avoid that vulnerability and instead fail to retrieve any result, because not returning a performance value can be handled; but the user's app hanging is REALLY bad!
Other approaches I tried
Sorted array O(N)
To avoid the while loop, I thought about sorting the changes object into an array ordered per the linked list, then simply looping through it.
The problem is that I have to traverse the whole changes list first to get the array in a sorted order, because I don't store an ordering key (it would violate the point of a linked list, because you could no longer do O(1) insert).
Pushing an id onto an array is not a heavy operation, but it is still O(N).
The question
Is there a way of traversing this linked list without using a while loop, and without an O(N) approach to convert the linked list into a normal array?
Since you only need to append at the end and possibly remove from the end, the required structure is a stack. In JavaScript the best data structure to implement a stack is an array -- using its push and pop features.
So then you could do things like this:
const changes = [];

function addChange(id, patch) {
  changes.push({id, patch});
}

function findRecentMatch(changes, constraints) {
  for (let i = changes.length - 1; i >= 0; i--) {
    const {id} = changes[i];
    if (constraints[id]) return id;
  }
}
// Demo
addChange("id1", { data: 10 });
addChange("id2", { data: 20 });
addChange("id3", { data: 30 });
addChange("id4", { data: 40 });
const results = {
  id3: {performance: 83.6},
  id1: {performance: 49.6},
}
const referenceId = findRecentMatch(changes, results);
console.log(referenceId); // id3
Depending on what you want to do with that referenceId you might want findRecentMatch to return the index in changes instead of the change-id itself. This gives you the possibility to still retrieve the id, but also to clip the changes list to end at that "version" (i.e. as if you popped all the entries up to that point, but then in one operation).
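For illustration, a hedged sketch of that index-returning variant (my own code, same names as above):

function findRecentMatchIndex(changes, constraints) {
  for (let i = changes.length - 1; i >= 0; i--) {
    if (constraints[changes[i].id]) return i;
  }
  return -1;
}

const index = findRecentMatchIndex(changes, results);
console.log(changes[index].id); // id3
// Clip the history back to that "version" in one operation:
if (index !== -1) changes.length = index + 1;

Setting length truncates the array in place, so everything after the match becomes collectible at once.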
While writing out the question, I realised that rather than avoiding a while-loop entirely, I can add an execution count and an escape hatch which should be sufficient for the purpose.
This solution uses Object.keys() which is strictly O(N) so not technically a correct answer to the question - but it is very fast.
If I needed it faster, I could restructure changes as a map instead of a general object and access changes.size as per this answer
let id = latest
let referenceId = undefined
const maxLoops = Object.keys(changes).length
let loop = 0
while (!!id && loop < maxLoops) {
  loop++
  if (!!results[id]) {
    referenceId = id
    id = undefined
  } else {
    id = changes[id].previous
  }
}
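The same escape hatch can also be expressed as a for loop, so the bound lives in the loop header rather than in a separate counter. A sketch of that variant (my own, using the same latest, changes and results; the optional chaining means a deleted entry ends the walk instead of throwing):

let referenceId = undefined
for (let id = latest, i = Object.keys(changes).length; id && i > 0; i--) {
  if (results[id]) {
    referenceId = id
    break
  }
  id = changes[id]?.previous // undefined if the entry was deleted
}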
I've spent a couple of hours (and lots of experiments) wondering why a simple optimization to use less memory caused a slowdown in a block of code that was not touched. Basically, I've gone from working with long strings to short strings. That particular block of code is about 33% slower, according to my profiling. (It is very noticeable in the unit test, but won't matter in production; my effort is just to try and understand why.)
The code boils down to:
const cnts = new FreqCounts()
for (const id of myArray) cnts.inc(id)
Using this class:
class FreqCounts {
  constructor () {
    this.d = {}
  }

  inc (v) {
    if (this.d[v]) this.d[v]++
    else this.d[v] = 1
  }
}
So, before, myArray looked something like this:
const myArray = [
  '黄砂(こうさ、おうさ、黄沙とも)とは、特に中国を中心とした東アジア内陸部の砂漠または乾燥地域の砂塵が、強風を伴う砂塵嵐(砂嵐)などによって上空に巻き上げられ、春を中心に東アジアなどの広範囲に飛散し、地上に降り注ぐ気象現象。あるいは、この現象で飛散した砂自体のことである。',
  // ...usually 5 to 50 more entries here
  '気象現象としての黄砂は、砂塵の元になる土壌の状態、砂塵を運ぶ気流など、大地や大気の条件が整うと、発生すると考えられている。',
]
And now it either looks like this:
const myArray = [
  1234,
  // ...
  77,
]
Or like this:
const myArray = [
  '1234',
  // ...
  '77',
]
The second form should save the implicit number-to-string conversion, yet it turned out to be even slower.
Having run out of logical explanations, the only idea I have left is that, because those strings end up as keys in an object, they all hash to the same hashcode and I get lots of collisions. Does that sound plausible?
(By the way, only evaluated with Node 8.16; the slowdown was not visible when working with 2,000 sentences, but was with 32,000, so getting a fully reproducible example might be tricky.)
BACKGROUND
Previously each string was repeated maybe 100 times; that became very noticeable when serializing the object to disk. So I'm simply putting them in a lookup table (which also gets saved):
const lookup = []
...
const id = lookup.length        // if storing a number
// const id = '' + lookup.length  // if storing a string
lookup.push(s)
...
myArray.push(id)
This has achieved the memory goal: the disk file with realistic data reduced from 500KB to 50KB. But, as I said above, I didn't expect it to slow down my unit tests.
Is there a way to continue to store lookup table indices, but not get hash collisions? (If that is what the problem is.)
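One thing worth trying (my own suggestion, not from the post above) is to key the counter with a Map instead of a plain object. Map keys are compared by value and numbers are used directly, so the lookup-table indices never go through number-to-string conversion or object property hashing at all. A minimal sketch:

class FreqCounts {
  constructor () {
    this.d = new Map()
  }

  inc (v) {
    // numeric ids stay numbers; no property-key stringification
    this.d.set(v, (this.d.get(v) || 0) + 1)
  }
}

const cnts = new FreqCounts()
for (const id of myArray) cnts.inc(id)

Whether this avoids the slowdown depends on V8 internals, so it would need profiling under the same Node 8.16 setup.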
Is there a way to return the rest of an array in JavaScript, i.e. the portion of the array consisting of all elements but the first?
Note: I do not ask for returning a new array e.g. with arr.slice(1) etc. and I do not want to chop off the first element of the array e.g. with arr.shift().
For example, given the array [3, 5, 8], the rest of the array is [5, 8], and if the rest is changed, e.g. by an assignment (a destructive operation), the original array changes too. That, as a test, would prove the rest really is the rest of the array and not a new array consisting of the remaining elements.
Note: The following code example describes what I want, but not specifically what I want to do (i.e. not the operations I want to perform). What I want to do is shown in the every algorithm at the bottom.
var arr = [3, 5, 8];
var rest = rest(arr); // rest is [5, 8]
rest.push(13);        // rest is [5, 8, 13] and hence arr is [3, 5, 8, 13]
An example of where I need this and would want to have it is the following algorithm (and many others I am writing in that GitHub organization), in which I always use arr.slice(1):
function every(lst, f) {
  if (lst.length === 0) {
    return false;
  } else {
    if (f(lst[0]) === true) {
      return every(lst.slice(1), f);
    } else {
      return false;
    }
  }
}
I think having what I ask for, instead of arr.slice(1), would keep the memory usage of such algorithms low and retain the recursive-functional style I want to employ.
No, this is generally not possible. There are no "views on" or "pointers to" normal arrays¹.
You might use a Proxy to fake it, but I doubt this is a good idea.
¹ It's trivial to do this on typed arrays (which are views on a backing buffer), but notice that you cannot push to them.
I possibly need this and I would want to have it for recursive-functional style algorithms where I currently use arr.slice(1) but would prefer to keep memory usage low
Actually, all of these implementations do have low memory usage - they don't allocate more memory than the input. Repeatedly calling slice(1) does lead to high pressure on the garbage collector, though.
If you were looking for better efficiency, I would recommend to
avoid recursion. JS engines still haven't implemented proper tail calls, so recursion isn't cheap.
not to pass around (new copies of) arrays. Simply pass around an index at which to start, e.g. by using an inner recursive function that closes over the array parameter and accesses array[i] instead of array[0]. See #Pointy's updated answer for an example.
If you were looking for a more functional style, I would recommend to use folds. (Also known as reduce in JavaScript, although you might need to roll your own if you want laziness). Implement your algorithms in terms of fold, then it's easy to swap out the fold implementation for a more efficient (e.g. iterative) one.
Last but not least, for higher efficiency while keeping a recursive style you can use iterators. Their interface might not look especially functional, but if you insist you could easily create an immutable wrapper that lazily produces a linked list.
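To illustrate the folds suggestion, a small sketch (mine, not from the original answer) of the every algorithm from the question expressed with reduce; note that a plain reduce is not lazy, so unlike the recursive version it will not short-circuit on the first false:

function every(lst, f) {
  // Fold the list down to one boolean; the initial value reproduces
  // the original's (unusual) choice of returning false for an empty list
  return lst.reduce((acc, x) => acc && f(x) === true, lst.length > 0);
}

Swapping that for a lazy or iterative fold later wouldn't change the call sites.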
Please test this function:
function rest(arr) {
  var a = arr.slice(1);
  a.push = function() {
    for (var i = 0, l = arguments.length; i < l; i++) {
      this[this.length] = arguments[i];
      arr[this.length] = arguments[i];
    }
    return this.length;
  };
  return a;
}
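For what it's worth, a quick check of how this behaves (the indices line up because a is one element shorter than arr):

var arr = [3, 5, 8];
var r = rest(arr); // r is [5, 8]
r.push(13);
console.log(r);    // [5, 8, 13]
console.log(arr);  // [3, 5, 8, 13] -- the push was mirrored into arr

Note that only push is intercepted; a direct index assignment like r[0] = 1 will not update arr.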
Based on the code posted in the update to the question, it's clear why you might want to be able to "alias" a portion of an array. Here is an alternative that is more typical of how I would solve the (correctly) perceived efficiency problem with your implementation:
function every(lst, f) {
  function r(index) {
    if (index >= lst.length)
      return true; // different from OP, but I think correct
    return f(lst[index]) && r(index + 1);
  }
  return r(0);
}
That is still a recursive solution to the problem, but no array copy is made; the array is not changed at all. The general pattern is common even in more characteristically functional programming languages (Erlang comes to mind personally): the "public" API for some recursive code is augmented by an "internal" or "private" API that provides some extra tools for keeping track of the progress of the recursion.
original answer
You're looking for Array.prototype.shift.
var arr = [1, 2, 3];
var first = arr.shift();
console.log(first); // 1
console.log(arr); // [2, 3]
This is a linear time operation: the execution cost is relative to the length of the original array. For most small arrays that does not really matter much, but if you're doing lots of such work on large arrays you may want to explore a better data structure.
Note that with ordinary arrays it is not possible to create a new "shadow" array that overlaps another array. You can do something like that with typed arrays, but for general purpose use in most code typed arrays are somewhat awkward.
The first limitation of typed arrays is that they are, of course, typed, which means that the array "view" onto the backing storage buffer gives you values of only one consistent type. The second limitation is that the only available types are numeric types: integers and floating-point numbers of various "physical" (storage) sizes. The third limitation is that the size of a typed array is fixed; you can't extend the array without creating a new backing buffer and copying.
Such limitations would be quite familiar to a FORTRAN programmer of course.
So to create an array for holding 5 32-bit integers, you'd write
var ints = new Int32Array(5);
You can put values into the array just like you put values into an ordinary array, so long as you get the type right (well close enough):
for (let i = 0; i < 5; i++)
  ints[i] = i;
console.log(ints); // [0, 1, 2, 3, 4]
Now: to do what the OP asked, you'd grab the buffer from the array we just created, and then make a new typed array on top of the same buffer at an offset from the start. The offsets when doing this are always in bytes, regardless of the type used to create the original array. That's super useful for things like looking at the individual parts of a floating point value, and other "bit-banging" sorts of jobs, though of course that doesn't come up much in normal JavaScript coding. Anyway, to get something like the rest array from the original question:
var rest = new Int32Array(ints.buffer, 4);
In that statement, the "4" means that the new array will be a view into the buffer starting 4 bytes from the beginning; 32-bit integers being 4 bytes long, that means that the new view will skip the first element of the original array.
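A quick check (my own addition) that this really is a view over the same memory and not a copy:

rest[0] = 99;             // writes into the shared buffer
console.log(ints[1]);     // 99 -- the original array sees the change
console.log(rest.length); // 4 -- one element shorter than ints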
Since JavaScript can't do this, the only real solution to your problem is WebAssembly. Otherwise use Proxy.
I'm reading this article about V8 on HTML5Rocks. The article is old but I understand almost none of it and it bothers me. I'm taking this 1 step at a time but could someone help me with the Arrays section?
The article states:
Arrays
In order to handle large and sparse arrays, there are two types of
array storage internally:
Fast Elements: linear storage for compact key sets
Dictionary Elements: hash table storage otherwise
It's best not to cause the array storage to flip from one type to
another.
Question:
1) What would a Fast Elements linear storage array look like?
2) What would a Dictionary Elements hash table array look like?
3) For prevention purposes, how would I "flip from one type to another"?
I will go the other way round.
2) What would Dictionary Elements hash table Array look like?
A JavaScript object is a map from strings to values, e.g.:
var obj = {
  "name": "Sherlock Holmes",
  "address": "221B Baker Street"
}
V8 uses hash tables to represent objects unless it is using an optimized representation for special cases. This is much like a dictionary's (word, meaning) pairs.
Now, this hash table access is slow. On inserting a new pair, a hash of the key is computed to pick an insertion index; if there's already a key at that index, the table attempts to insert at the next one, and so on. Lookups have to probe the same way.
1) What would Fast Elements Linear storage Array look like?
In V8, an element is a property whose key is a non-negative integer (0, 1, 2, ...) i.e. a simple linear array whose properties can be accessed via a numerical index.
Fast elements are stored in a contiguous array. e.g.
var arr = [1, 2, 3];
They are a special case optimised for faster access, as the index is known directly and does not need to be computed by hashing.
3) For prevention purposes, How would I flip from one type to another?
For fast element, if you assign an index that's way past the end of the elements array, V8 may downgrade the elements to dictionary mode.
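A hedged illustration of the kind of assignment that can trigger that downgrade (the exact thresholds are V8 internals and vary between versions):

var arr = [1, 2, 3]; // fast elements: compact keys 0..2
arr[3] = 4;          // still fast: appending keeps the key set compact
arr[100000] = 5;     // sparse: V8 may now flip to dictionary elements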
Reference: http://jayconrod.com/posts/52/a-tour-of-v8-object-representation
Sir, I may be wrong, but according to your question, what I have observed is this: when we initialise an array, its elements internally get the keys 0, 1, 2, ... etc. If I push an element under a non-numeric key, it goes into the array object but the array does not count it. For example:

var arr = new Array();
arr[0] = 1;
arr[1] = 2;
arr[2] = "myname";
arr['myname'] = 'nick';

When I do arr.length I get 3, so it does not consider keys other than numeric ones. But if I write arr[3] = {myname: 'nick'} then it is considered an element. Internally, I think, the numeric-keyed linear part is kept separate from the string-keyed properties.
I'm calling a JavaScript function that wants an array of things to display. It displays a count, and displays the items one by one. Everything works when I pass it a normal JavaScript array.
But I have too many items to hold in memory at once. What I'd like to do, is pass it an object with the same interface as an array, and have my method(s) be called when the function tries to access the data. And in fact, if I pass the following:
var featureArray = {length: count, 0: func(0)};
then the count is displayed, and the first item is correctly displayed. But I don't want to assign all the entries, or I'll run out of memory. And the function currently crashes when the user tries to display the second item. I want to know when item 1 is accessed, and return func(1) for item 1, and func(2) for item 2, etc. (i.e., delaying the creation of the item until it is requested).
Is this possible in JavaScript?
If I understand correctly, this would help:
var object = {length: count, data: function (whatever) {
  // create your item
}};
Then, instead of doing array[1], array[2], et cetera, you'd do object.data(1), object.data(2), and so on.
Since there seems to be a constraint that the data must be accessed via normal array indexing (arr[index]) and that can't be changed, the answer is that NO, you can't override array indexing in JavaScript to change how it works and make some sort of virtual array that only fetches data on demand. It was proposed for ECMAScript 4 and rejected as a feature.
See these two other posts for other discussion/confirmation:
How would you overload the [] operator in Javascript
In javascript, can I override the brackets to access characters in a string?
The usual way to solve this problem would be to switch to using a method such as .get(n) to request the data and then the implementor of .get() can virtualize however much they want.
P.S. Others indicate that you could use a Proxy object for this in Firefox (not supported in other browsers as far as I know), but I'm not personally familiar with Proxy objects, as their use seems rather limited to code that only targets Firefox right now.
Yes, generating items on the go is possible. You will want to have a look at Lazy.js, a library for producing lazily computed/loaded sequences.
However, you will need to change your function that accepts this sequence, it will need to be consumed differently than a plain array.
If you really need to fake an array interface, you'd use Proxies. Unfortunately, it is only a harmony draft and currently only supported in Firefox's JavaScript 1.8.5.
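For context, the Proxy API later became standard in ES2015; with it, a virtual array could look something like this sketch (my own, not part of the original answers):

function makeLazyArray(count, func) {
  return new Proxy({}, {
    get(target, prop) {
      if (prop === 'length') return count;
      // Array indices arrive at the trap as strings like "0", "1", ...
      if (typeof prop === 'string' && /^\d+$/.test(prop)) return func(Number(prop));
      return target[prop];
    }
  });
}

const featureArray = makeLazyArray(1000000, i => 'item ' + i);
console.log(featureArray.length); // 1000000
console.log(featureArray[1]);     // 'item 1' -- computed only on access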
Assuming that the array is only accessed in an iteration, i.e. starting with index 0, you might be able to do some crazy things with getters:
var featureArray = (function(func) {
  var arr = {length: 0};
  function makeGetter(i) {
    arr.length = i + 1;
    Object.defineProperty(arr, i, {
      get: function() {
        var val = func(i);
        Object.defineProperty(arr, i, {value: val});
        makeGetter(i + 1);
        return val;
      },
      configurable: true,
      enumerable: true
    });
  }
  makeGetter(0);
  return arr;
}(func));
However, I'd recommend avoiding that and rather switching the library that is expecting the array. This solution is very error-prone if anything else is done with the "array" but accessing its indices in order.
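To make the behaviour concrete, a short demo (assuming func is something like function (i) { return 'item ' + i; }): the length runs one step ahead of what has been consumed, because each access installs the getter for the next index.

console.log(featureArray.length); // 1
console.log(featureArray[0]);     // 'item 0' (computed on this access)
console.log(featureArray.length); // 2 -- grew as a side effect of the read

This is also why a plain for loop bounded by featureArray.length would never terminate: the length grows by one on every access.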
Thank you to everyone who has commented and answered my original question - it seems that this is not (currently) supported by JavaScript.
I was able to get around this limitation and still do what I wanted. It uses an aspect of the program that I did not mention in my original question (I was trying to simplify the question), so it is understandable that others couldn't recommend this. That is, it doesn't technically answer my original question, but I'm sharing it in case others find it useful.
It turns out that one member of the object in each array element is a callback function. That is (using the terminology from my original question), func(n) is returning an object, which contains a function in one member, which is called by the method being passed the data. Since this callback function knows the index it is associated with (at least, when being created by func(n)), it can add the next item in the array (or at least ensure that it is already there) when it is called. A more complicated solution might go a few ahead, and/or behind, and/or could cleanup items not near the current index to free memory. This all assumes that the items will be accessed consecutively (which is the case in my program).
E.g.,
1) Create a variable that will stay in scope (e.g., a global variable).
2) Call the function with an object like I gave as an example in my original question:
var featureArray = {length: count, 0: func(0)};
3) func() can be something like:
function func(r) {
  return {
    f: function() {
      featureArray[r + 1] = func(r + 1);
      DoOtherStuff(r);
    }
  }
}
Assuming that f() is the member with the function that will be called by the external function.