Why do the following two pieces of code run so differently?

Look at these two pieces of code; the second differs only in its third line (arr.push(1)), yet it takes about 84 times longer to run. Can anybody explain why?
let LIMIT = 9999999;
let arr = new Array(LIMIT);
// arr.push(1);
console.time('Array insertion time');
for (let i = 1; i < LIMIT; i++) {
  arr[i] = i;
}
console.timeEnd('Array insertion time');

let LIMIT = 9999999;
let arr = new Array(LIMIT);
arr.push(1);
console.time('Array insertion time');
for (let i = 1; i < LIMIT; i++) {
  arr[i] = i;
}
console.timeEnd('Array insertion time');

The arr.push(1) operation creates a "sparse" array: it has a single element present at index 9999999. V8 switches the internal representation of such a sparse array to "dictionary mode", i.e. the array's backing store is an index→element dictionary, because that's significantly more memory efficient than allocating space for 10 million elements when only one of them is used.
The flip side is that accessing (reading or writing) elements of a dictionary-mode array is slower than for arrays in "fast/dense mode": every access has to compute the right dictionary index, and (in the scenario at hand) the dictionary has to be grown several times, which means copying all existing elements to a new backing store.
As the array is filled up, V8 notices that it's getting denser, and at some point transitions it back to "fast/dense mode". By then, most of the slowdown has already been observed. The remainder of the loop has some increased cost as well though, because by this time, the arr[i] = i; store has seen two types of arrays (dictionary mode and dense mode), so on every iteration it must detect which state the array is in now and handle it accordingly, which (unsurprisingly) costs more time than not having to make that decision.
Generalized conclusion: with JavaScript being as dynamic and flexible as it is, engines can behave quite differently for very similar-looking pieces of code; for example because the engine optimizes one case for memory consumption and the other for execution speed, or because one of the cases lets it use some shortcut that's not applicable for the other (for whatever reason). The good news is that in many cases, correct and understandable/intuitive/simple code also tends to run quite well (in this example, the stray arr.push looks a lot like a bug).
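If you want to observe these transitions yourself, here is a minimal sketch using V8's natives syntax (a debugging feature enabled in Node.js with --allow-natives-syntax; %HasDictionaryElements is a V8-internal test intrinsic, not a stable API, so its availability may vary between versions):
// Run with: node --allow-natives-syntax sparse.js
// %HasDictionaryElements is a V8-internal intrinsic (an assumption about
// your V8 version; such intrinsics are not a stable API).
let LIMIT = 9999999;
let arr = new Array(LIMIT);
arr.push(1);                                // one element at index 9999999
console.log(%HasDictionaryElements(arr));   // true: dictionary mode
for (let i = 1; i < LIMIT; i++) arr[i] = i; // fill the array densely
console.log(%HasDictionaryElements(arr));   // expected: false once V8 has
                                            // switched back to fast mode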


What is the time complexity of this recursive solution for removing duplicates?

After I complete a Leetcode question, I always try to also determine the asymptotic time complexity, for practice.
I am now looking at problem 26. Remove Duplicates from Sorted Array:
Given a sorted array nums, remove the duplicates in-place such that
each element appears only once and returns the new length.
Do not allocate extra space for another array, you must do this by
modifying the input array in-place with O(1) extra memory.
Clarification:
Confused why the returned value is an integer but your answer is an
array?
Note that the input array is passed in by reference, which means a
modification to the input array will be known to the caller as well.
Internally you can think of this:
// nums is passed in by reference. (i.e., without making a copy)
int len = removeDuplicates(nums);
// any modification to nums in your function would be known by the caller.
// using the length returned by your function, it prints the first len elements.
for (int i = 0; i < len; i++) {
  print(nums[i]);
}
Example 1:
Input: nums = [1,1,2]
Output: 2, nums = [1,2]
Explanation: Your function should return length = 2, with the first two elements of nums being 1 and 2 respectively. It doesn't matter what you leave beyond the returned length.
My code:
/**
 * @param {number[]} nums
 * @return {number}
 */
var removeDuplicates = function(nums) {
  nums.forEach((num, i) => {
    if (nums[i+1] !== null && nums[i+1] == nums[i]) {
      nums.splice(i, 1);
      console.log(nums);
      removeDuplicates(nums);
    }
  });
  return nums.length;
};
For this problem, I got O(log n) from my research. Execution time halves each time it runs. Can someone please verify or determine if I am wrong?
Are all recursive functions inherently O(log n)? Even if there are multiple loops?
For this problem, I got O(log n) from my research. Execution time halves for each time it's run. Can someone please verify or determine if I am wrong?
The execution time does not halve for each run: imagine an extreme case where the input has 100 values and they are all the same. Then at each level of the recursion tree one of those duplicates will be found and removed. Then a deeper recursive call is made. So for every duplicate value there is a level in the recursion tree. So in this extreme case, the recursion tree will have a depth of 99.
Even if you would revise the algorithm, it would not be possible to make it O(log n), as all values in the array need to be read at least once, and that alone already gives it a time complexity of O(n).
Your implementation uses splice which needs to shift all the values that follow the deletion point, so one splice is already O(n), making your algorithm O(n²) (worst case).
Because of the recursion, it also uses O(n) extra space in the worst case (for the call stack).
Are all recursive functions inherently O(log n)?
No. Using recursion does not say anything about the overall time complexity; it could be anything. You typically get O(log n) when each recursive call can ignore a constant fraction (like half) of the current array. This is, for instance, the case with a binary search algorithm.
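For contrast, here is a minimal sketch of such an algorithm: a recursive binary search over a sorted array. Each call discards half of the remaining range, which is exactly what produces the O(log n) depth.
// Minimal sketch: recursive binary search over a sorted array.
// Each call halves the remaining range, so the recursion depth is O(log n).
function binarySearch(nums, target, lo = 0, hi = nums.length - 1) {
  if (lo > hi) return -1;                      // empty range: not found
  const mid = (lo + hi) >> 1;
  if (nums[mid] === target) return mid;
  return nums[mid] < target
    ? binarySearch(nums, target, mid + 1, hi)  // discard the left half
    : binarySearch(nums, target, lo, mid - 1); // discard the right half
}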
Improvement
You can avoid the extra space by not using recursion, but an iterative method. Also, you are not required to actually change the length of the given array, only to return what its new length should be. So you can avoid using splice. Instead, use two indexes into the array: one that runs ahead to the next value that is different, and another, slower one, to which you copy that new value. When the faster index reaches the end of the input, the slower one indicates the size of the part that holds the unique values.
Here is how that looks:
var removeDuplicates = function(nums) {
  if (nums.length == 0) return 0;
  let len = 1;
  for (let j = 1; j < nums.length; j++) {
    if (nums[j-1] !== nums[j]) nums[len++] = nums[j];
  }
  return len;
};
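For example, running it on the input from the problem statement:
let nums = [1, 1, 2];
let len = removeDuplicates(nums);
console.log(len);                // 2
console.log(nums.slice(0, len)); // [1, 2]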

How does V8 optimise the creation of very large arrays?

Recently, I had to work on optimising a task that involved the creation of really large arrays (~ 10⁸ elements).
I tested a few different methods, and, according to jsperf, the following option seemed to be the fastest.
var max = 10000000;
var arr = new Array(max);
for (let i = 0; i < max; i++) {
  arr[i] = true;
}
Which was ~ 85% faster than
var max = 10000000;
var arr = [];
for (let i = 0; i < max; i++) {
  arr.push(true);
}
And indeed, the first snippet was much faster in my actual app as well.
However, my understanding was that the V8 engine was able to perform optimised operations on array with PACKED_SMI_ELEMENTS elements kind, as opposed to arrays of HOLEY_ELEMENTS.
So my question is the following:
if it's true that new Array(n) creates an array that's internally marked as HOLEY_ELEMENTS (which I believe is true), and
if it's true that [] creates an array that's internally marked as PACKED_SMI_ELEMENTS (which I'm not too sure is true),
why is the first snippet faster than the second one?
Related questions I've been through:
Create a JavaScript array containing 1...N
Most efficient way to create a zero filled JavaScript array?
V8 developer here. The first snippet is faster because new Array(max) informs V8 how big you want the array to be, so it can allocate an array of the right size immediately; whereas in the second snippet with []/.push(), the array starts at zero capacity and has to be grown several times, which includes copying its existing elements to a new backing store.
https://www.youtube.com/watch?v=m9cTaYI95Zc is a good presentation but probably should have made it clearer how small the performance difference between packed and holey elements is, and how little you should worry about it.
In short: whenever you know how big you need an array to be, it makes sense to use new Array(n) to preallocate it to that size. When you don't know in advance how large it's going to be in the end, then start with an empty array (using [] or new Array() or new Array(0), doesn't matter) and grow it as needed (using a.push(...) or a[a.length] = ..., doesn't matter).
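Put concretely, a minimal sketch of both recommended patterns (the sizes and values here are arbitrary):
// Known size up front: preallocate, then fill by index.
const known = new Array(1000);
for (let i = 0; i < known.length; i++) known[i] = i * 2;

// Unknown final size: start empty and grow as needed.
const unknown = [];
for (let i = 0; i < 1000; i++) {
  if (Math.random() < 0.5) unknown.push(i); // final length isn't known in advance
}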
Side note: your "for loop with new Array() and push" benchmark creates an array that's twice as big as you want.

Javascript array splice causing out of memory exception?

I am basically trying to sort an input of numbers on the fly by inserting each number at its correct position (not 100% sure, but this should be insertion sort). My understanding is that to insert into the middle of an array in JavaScript you need to use the splice method (http://www.w3schools.com/jsref/jsref_splice.asp).
My code in attempt of achieving my goal is as below:
var N = parseInt(readline());
var powers = [0];
for (var i = 0; i < N; i++) {
  var pi = parseInt(readline());
  for (var j = i; j < powers.length; j++) {
    if (powers[j] > pi) {
      powers.splice(j, 0, pi);
    }
    else if (j + 1 == powers.length) {
      powers[j + 1] = pi;
    }
  }
}
When I run this code I get an out-of-memory exception. I just want to understand what I am doing wrong in the code above. If I am using the splice method incorrectly and it is the cause of the memory problem, what is actually happening under the hood?
I know there are other ways I could do this sorting but I am particularly interested in doing an insertion sort with javascript arrays.
In your else condition, you're adding to the array, making it one longer. That means when the loop next checks powers.length, it will be a higher number, which means you'll go into the loop body again, which means you'll add to the array again, which means you'll go back into the loop body again, which means...you see where this is going. :-)
Once you've added the number to the array (regardless of which branch), exit the loop (for instance, with break).
Side note: You won't be doing a proper insertion sort if you start j at i as you are currently. i is just counting how many entries the user said they were going to enter, it's not part of the sort. Consider: What if I enter 8 and then 4? If you start j at i, you'll skip over 8 and put 4 in the wrong place. j needs to start at 0.
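Putting both fixes together, here is a minimal sketch of the corrected loop (keeping the question's readline-based input, and dropping the [0] sentinel in favor of an explicit append):
var N = parseInt(readline());
var powers = [];
for (var i = 0; i < N; i++) {
  var pi = parseInt(readline());
  var inserted = false;
  for (var j = 0; j < powers.length; j++) { // j starts at 0, not i
    if (powers[j] > pi) {
      powers.splice(j, 0, pi);
      inserted = true;
      break;                                // stop scanning the grown array
    }
  }
  if (!inserted) powers.push(pi);           // pi is >= every value seen so far
}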

Do arrays with gaps in their indices entail any benefits that compensate for their disadvantages?

In JavaScript, arrays may have gaps in their indices, which should not be confused with elements that are simply undefined:
var a = new Array(1), i;
a.push(1, undefined);
for (i = 0; i < a.length; i++) {
  if (i in a) {
    console.log("set with " + a[i]);
  } else {
    console.log("not set");
  }
}
// logs:
// not set
// set with 1
// set with undefined
Since these gaps distort the length property, I'm not sure whether they should be avoided whenever possible. If so, I would treat them as an edge case and not handle them by default:
// default:
function head(xs) {
  return xs[0];
}
// only when necessary:
function gapSafeHead(xs) {
  var i;
  for (i = 0; i < xs.length; i++) {
    if (i in xs) {
      return xs[i];
    }
  }
}
Besides the fact that head is very concise, another advantage is that it can be used on all array-like data types. head is just a single, simple example. If such gaps need to be considered throughout the code, the overhead would be significant.
This is likely to come up in any language that overloads hash tables to provide something that colloquially is called an "array". PHP, Lua and JavaScript are three such languages. If you depend on strict sequential numeric array behavior, then it will be an inconvenience for you. More generally, the behavior provides conveniences as well.
Here's a classic algorithm question: to delete a member from the middle of a data structure, which data structure is "better": A linked list or an array?
You're supposed to say "linked list", because deleting a node from a linked list doesn't require you to shift the rest of the array down one index. But linked lists have other pitfalls, so is there another data structure we can use? You can use a sparse array*.
In many languages that provide this hashy type of arrays, removing any arbitrary member of the array will change the length. Unfortunately, JavaScript does not change the length, so you lose out a little there. But nevertheless, the array is "shorter", at least from the Object.keys perspective.
*Many sparse arrays are implemented using linked lists, so don't apply this too generally. In these languages, though, they're hash tables with predictable ordered numeric keys.
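A small sketch of that trade-off in JavaScript:
// delete leaves a gap: no index shifting happens (unlike splice),
// length stays the same, but the key set shrinks.
const a = [10, 20, 30, 40];
delete a[1];
console.log(a.length);       // 4: length is unchanged
console.log(Object.keys(a)); // ['0', '2', '3']: "shorter" from this perspective
console.log(1 in a);         // false: index 1 is now a gap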
Of course, the question is a subjective one, but I argue that the gaps should certainly be avoided if possible. Arrays are special JavaScript objects with very specific purposes. You can totally hack on arrays, manipulate the length property, and add properties with keys other than numbers (e.g. myArray["foo"] = "bar"), but these mostly devolve into antipatterns. If you need some special form of pseudo-array, you can always just code it yourself with a regular object. After all, typeof [] === "object".
It's not like gaps inherently break your code, but I would avoid pursuing them intentionally.
Does that answer your question?

Looping over undefined array keys

Problem:
I have a DB containing math exercises, split by difficulty levels and date taken.
I want to generate a diagram of the performance over time.
To achieve this, I loop through the query results and increment a counter for the level and day each exercise was taken.
Example: a level 2 exercise was taken on 01.11.2015:
this.levels[2].daysAgo[1].amountTaken++;
With this, I can build a diagram where day 0 is always today and the performance over past days is shown.
Now levels[] has a predefined number of levels, so there is no problem with that.
But daysAgo[] is very dynamic (it even changes daily with the same data), so if only one exercise was taken, it would wander on a daily basis (from daysAgo[0] to daysAgo[1] and so on).
The daysAgo[] entries in between would be empty (because there are no entries).
But for evaluating the diagram, I need them to have an initialized state with amountTaken: 0, and so on.
The problem is: I can't know when the oldest exercise was taken.
Idea 1:
First gather all entries in a kind of proxy object, where I have a var maxDaysAgo that holds the value for the oldest exercise, then initialize an array daysAgo[maxDaysAgo] that gets filled with 0-entries before inserting the actual entries.
That seems very clumsy and overly complicated.
Idea 2:
Just add the entries with this.levels[level].daysAgo[daysAgo].amountTaken++;, possibly leaving the daysAgo array with a lot of undefined keys.
Then, after all entries are added, I would loop over the daysAgo keys with
for (var i = 1; i < this.maxLevel; i++) { // for every level
  for (var j = 0; j < this.levels[i].daysAgo.length; j++) {
But daysAgo.length will not count undefined fields, will it?
So if I have one single entry at [24], length will still be 1 :/
Question:
How can I find the highest key in an array and loop up to it when there are undefined keys in between?
How can I address all undefined keys up to the highest (and no more)?
Or: what would be a different, more elegant way to solve this whole problem altogether?
Thanks :)
array.length returns one higher than the highest numerical index, so it can be used to loop through even undefined values.
as a test:
var a = [];
a[24] = 1;
console.log(a.length);
outputs 25 for me (in Chrome and Firefox).
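So Idea 2 works: after adding all entries, loop up to daysAgo.length and fill every hole with an initialized entry. A minimal sketch, assuming the this.levels[i].daysAgo structure from the question:
// Fill every gap with a zeroed entry so the diagram can read
// each day from 0 up to the oldest recorded exercise.
for (var i = 1; i < this.maxLevel; i++) {    // for every level
  var daysAgo = this.levels[i].daysAgo;
  for (var j = 0; j < daysAgo.length; j++) { // length = highest index + 1
    if (daysAgo[j] === undefined) {
      daysAgo[j] = { amountTaken: 0 };
    }
  }
}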
