What's the time complexity of array.splice() in Google Chrome? - javascript

If I remove one element from an array using splice() like so:
arr.splice(i, 1);
Will this be O(n) in the worst case because it shifts all the elements after i? Or is it constant time, with some linked list magic underneath?

Worst case should be O(n) (copying all n-1 elements to new array).
A linked list would be O(1) for a single deletion.
For those interested, I've made this lazily crafted benchmark (please don't run it on Windows XP/Vista). As you can see from it, though, the timing looks fairly constant (i.e. O(1)), so who knows what they're doing behind the scenes to make this crazy fast. Note that, regardless, the actual splice is VERY fast.
Rerunning an extended benchmark directly in the V8 shell suggests O(n). Note, though, that you need huge array sizes to get a runtime that's likely to affect your code. This should be expected: if you look at the V8 code, it uses memmove to create the new array.
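To make the O(n) intuition concrete, here is a rough sketch of what removing one element from a contiguous array involves. removeAt is a hypothetical helper written for illustration only; it is not V8's implementation, which (per the answer above) relies on memmove, but the amount of work is the same in spirit: everything after the removed index has to move.

```javascript
// Illustrative only: a hand-written equivalent of arr.splice(index, 1).
// removeAt is a hypothetical name, not part of any library.
function removeAt(arr, index) {
  if (index < 0 || index >= arr.length) return undefined; // nothing to remove
  const removed = arr[index];
  // Shift every later element one slot to the left: arr.length - index - 1 moves.
  for (let i = index; i < arr.length - 1; i++) {
    arr[i] = arr[i + 1];
  }
  arr.length -= 1; // drop the now-duplicated last slot
  return removed;
}

const letters = ['a', 'b', 'c', 'd', 'e'];
const removedLetter = removeAt(letters, 1);
// removedLetter is 'b'; letters is now ['a', 'c', 'd', 'e']
console.log(removedLetter, letters);
```

Removing the last element needs no moves at all, while removing the first needs n - 1, which is why the worst case is linear even though a single call is very fast in practice.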

The Test:
I took the advice in the comments and wrote a simple test that times splices on a data set of 3,000 arrays, each containing 3,000 items. The test simply splices the
first item in the first array
second item in the second array
third item in the third array
...
3000th item in the 3000th array
I pre-built the array to keep things simple.
The Findings:
The weirdest thing is that the number of splices that take longer than 1ms grows linearly as you increase the size of the dataset.
I went as far as testing it with a dataset of 300,000 on my machine (the SO snippet tends to crash beyond 3,000).
I also noticed that the number of splice() calls that took longer than 1ms for a given dataset (30,000 in my case) was random. So I ran the test 1,000 times and plotted the number of results, and it looked like a normal distribution, leading me to believe that the randomness was just caused by scheduler interrupts.
This goes against my hypothesis and @Ivan's guess that splice()ing from the beginning of an array will have an O(n) time complexity.
Below is my test:
let data = []
const results = []
const dataSet = 3000
function spliceIt(i) {
  data[i].splice(i, 1)
}
function test() {
  for (let i = 0; i < dataSet; i++) {
    let start = Date.now()
    spliceIt(i)
    let end = Date.now()
    results.push(end - start)
  }
}
function setup() {
  data = (new Array(dataSet)).fill().map(arr => new Array(dataSet).fill().map(el => 0))
}
setup()
test()
// console.log("data before test", data)
// console.log("data after test", data)
// console.log("all results: ", results)
console.log("results that took more than 1ms: ", results.filter(r => r >= 1))

Hi!
I did an experiment myself and would like to share my findings. The experiment was very simple: I ran 100 splice operations on an array of size n and calculated the average time each splice took. Then I varied n to see how it behaves.
This graph summarizes the findings for big numbers:
For big numbers it seems to behave linearly.
I also checked with "small" numbers (they were still quite big, but not as big):
In this case it seems to be constant.
If I had to pick one option, I would say it is O(n), because that is how it behaves for big numbers. Bear in mind, though, that the linear behaviour only shows for VERY big numbers.
However, it is hard to give a definitive answer, because the array implementation in JavaScript depends A LOT on how the array is declared and manipulated.
I recommend this Stack Overflow discussion and this Quora discussion to understand how arrays work.
I ran it in Node v10.15.3, and the code used is the following:
const f = async () => {
  const n = 80000000;
  const tries = 100;
  const array = [];
  for (let i = 0; i < n; i++) { // build initial array
    array.push(i);
  }
  let sum = 0;
  for (let i = 0; i < tries; i++) {
    const index = Math.floor(Math.random() * (n));
    const start = new Date();
    array.splice(index, 1); // UNCOMMENT FOR OPTION A
    // array.splice(index, 0, -1); // UNCOMMENT FOR OPTION B
    const time = new Date().getTime() - start.getTime();
    sum += time;
    array.push(-2); // UNCOMMENT FOR OPTION A, to keep it of size n
    // array.pop(); // UNCOMMENT FOR OPTION B, to keep it of size n
  }
  console.log('for an array of size', n, 'the average time of', tries, 'splices was:', sum / tries);
};
f();
Note that the code has an Option B: I did the same experiment with the three-argument form of splice, which inserts an element. It behaved similarly.

Related

Create array with length n, initialize all values with 0 besides one which index matches a certain condition, in O(1)?

I want to create an array of type number with length n. All values inside the array should be 0 except the one which index matches a condition.
That's how I currently do it:
const data: number[] = [];
for (let i = 0; i < n; i++) {
  if (i === someIndex) {
    data.push(someNumber);
  } else {
    data.push(0);
  }
}
So let's say n = 4, someIndex = 2, someNumber = 4; that would result in the array [0, 0, 4, 0].
Is there a way to do it in O(1) instead of O(n)?
Creating an array of size n in O(1) time is theoretically possible depending on implementation details - in principle, if an array is implemented as a hashtable then its length property can be set without allocating or initialising space for all of its elements. The ECMAScript specification for the Array(n) constructor doesn't mandate that Array(n) should do anything which necessarily takes more than O(1) time, although it also doesn't mandate that the time complexity is O(1).
In practice, Array(n)'s time complexity depends on the browser, though verifying this is a bit tricky. The performance.now() function can be used to measure the time elapsed between the start and end of a computation, but the precision of this function is artificially reduced in many browsers to protect against CPU-timing attacks like Spectre. To get around this, we can call the constructor repetitions times, and then divide the time elapsed by repetitions to get a more precise measurement per constructor call.
My timing code is below:
function timeArray(n, repetitions=100000) {
  var startTime = performance.now();
  for(var i = 0; i < repetitions; ++i) {
    var arr = Array(n);
    arr[n-1] = 'foo';
  }
  var endTime = performance.now();
  return (endTime - startTime) / repetitions;
}
for(var n = 10000; n <= 1000000; n += 10000) {
  console.log(n, timeArray(n));
}
Here are my results from Google Chrome (version 74) and Firefox (version 72); on Chrome the performance is clearly O(n), and on Firefox it's clearly O(1), with a quite consistent time of about 0.01ms on my machine.
I measured using repetitions = 1000 on Chrome, and repetitions = 100000 on Firefox, to get accurate enough results within a reasonable time.
Another option, proposed by @M.Dietz in the comments, is to declare the array like var arr = []; and then assign at some index (e.g. arr[n-1] = 'foo';). This turns out to take O(1) time on both Chrome and Firefox, both consistently under one nanosecond:
That suggests the version using [] is better to use than the version using Array(n), but still the specification doesn't mandate that this should take O(1) time, so there may be other browsers where this version takes O(n) time. If anybody gets different results on another browser (or another version of one of these browsers) then please do add a comment.
You need to assign n values, and so there is that amount of work to do. The work increases linearly with increasing n.
Having said that, you can hope to make your code a bit faster by making use of .fill:
const data: number[] = Array(n).fill(0);
data[someIndex] = someNumber;
But don't be mistaken; this is still O(n): .fill may be faster, but it still has to fill the whole array with zeroes, which means a corresponding amount of memory needs to be initialised, so that operation has linear time complexity.
If, however, you drop the requirement that zeroes need to be assigned, then you can store only someNumber:
const data: number[] = Array(n);
data[someIndex] = someNumber;
This way you actually do not allocate the memory for the whole array, so this code snippet runs in constant time. Any access to an index different from someIndex will give you a value of undefined. You may trap that condition and translate that to a zero on-the-fly:
let value = i in data ? data[i] : 0;
Obviously, if you are going to access all indices of the array like that, you'll have again a linear time complexity.
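As a rough sketch of the "trap that condition" idea that does not depend on how a particular engine implements Array(n), the single non-zero entry can be kept in a Map behind a small accessor. makeMostlyZero and its get method are hypothetical names introduced here for illustration; the example is plain JavaScript rather than the question's TypeScript.

```javascript
// Hypothetical helper: stores only the non-zero entry and reports 0 elsewhere.
// Construction never allocates n slots, so it does not depend on n.
function makeMostlyZero(n, someIndex, someNumber) {
  const overrides = new Map([[someIndex, someNumber]]);
  return {
    length: n,
    get(i) {
      if (i < 0 || i >= n) return undefined; // outside the logical array
      return overrides.has(i) ? overrides.get(i) : 0;
    },
  };
}

const mostlyZero = makeMostlyZero(4, 2, 4);
const values = [0, 1, 2, 3].map((i) => mostlyZero.get(i));
// values is [0, 0, 4, 0]
console.log(values);
```

The trade-off is the same as in the answer above: reading all n positions is still linear, and the result is not a real array, so array methods such as map or indexOf are not available on it.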

Fastest way to compare each item in array with rest of array?

I have an array of items, and for each of the item in the array, I need to do some check against the rest of the items in the same array.
Here is the code I am using:
const myArray = [ ...some stuff ];
let currentItem;
let nextItem;
for (let i = 0; i < myArray.length; i++) {
  currentItem = myArray[i];
  for (let j = i + 1; j < myArray.length; j++) {
    nextItem = myArray[j];
    doSomeComparision(currentItem, nextItem);
  }
}
While this works, I need to find a more efficient algorithm because it slows down significantly if the array is very big.
Can someone provide some advice on how to make this algorithm better?
Edit 1
I apologize.
I should have provided more context around what I am trying to do here.
I am using the loop above with a HalfEdge data structure, a.k.a. DCEL.
Basically, a HalfEdge is an object with 3 properties:
class HalfEdge {
  head; // some (x,y,z) coords
  tail; // some (x,y,z) coords
  twin; // reference to another HalfEdge
}
A twin of a given HalfEdge is defined like so:
/**
 * if two Half-Edges are twins:
 * Edge A   TAIL ----> HEAD
 *           =          =
 * Edge B   HEAD <---- TAIL
 */
My array contains many HalfEdges, and for each HalfEdge in the array, I want to find its twin (i.e., one that satisfies the condition above).
Basically, I am comparing two 3D vectors (one from currentItem, the other from nextItem).
Edit 2
Fixed typo in code example (i.e., from let j = 0 to let j = i + 1)
Here is a linear-time solution to your problem. I am not that familiar with JavaScript, so I feel more comfortable giving you the algorithm in pseudo-code.
lookup := hashtable()
for i .. myArray.length
    twin_id := lookup[myArray[i].tail, myArray[i].head]
    if twin_id != null
        myArray[i].twin := twin_id
        myArray[twin_id].twin := i
    else
        lookup[myArray[i].head, myArray[i].tail] := i
The idea is to construct a hash table of (head, tail) pairs, and to check if a (tail, head) pair already exists that matches the current node's. If so, they are twins, and mark them as such, otherwise update the hash table with a new entry. Every element is looped over exactly once, and insertion / retrieval from the hash table is done in constant time.
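The pseudo-code above could be sketched in JavaScript roughly as follows. Several things here are assumptions rather than part of the original answer: head and tail are taken to be plain { x, y, z } objects, twins are matched on exactly equal coordinates (no tolerance), twin stores an object reference instead of an array index, and pointKey and linkTwins are hypothetical names.

```javascript
// Assumption: head and tail are plain objects of the form { x, y, z },
// and twins match on exactly equal coordinates (no tolerance).
function pointKey(p) {
  return `${p.x},${p.y},${p.z}`;
}

// Links each half-edge to its twin in a single pass over the array.
function linkTwins(halfEdges) {
  const lookup = new Map(); // "head|tail" -> half-edge still waiting for its twin
  for (const edge of halfEdges) {
    const reversedKey = `${pointKey(edge.tail)}|${pointKey(edge.head)}`;
    const twin = lookup.get(reversedKey);
    if (twin !== undefined) {
      edge.twin = twin;
      twin.twin = edge;
      lookup.delete(reversedKey); // both sides are matched now
    } else {
      lookup.set(`${pointKey(edge.head)}|${pointKey(edge.tail)}`, edge);
    }
  }
  return halfEdges;
}

const edgeA = { head: { x: 1, y: 0, z: 0 }, tail: { x: 0, y: 0, z: 0 }, twin: null };
const edgeB = { head: { x: 0, y: 0, z: 0 }, tail: { x: 1, y: 0, z: 0 }, twin: null };
const edgeC = { head: { x: 2, y: 2, z: 2 }, tail: { x: 1, y: 0, z: 0 }, twin: null };
linkTwins([edgeA, edgeB, edgeC]);
// edgeA and edgeB point at each other; edgeC has no twin in this input
console.log(edgeA.twin === edgeB, edgeB.twin === edgeA, edgeC.twin === null);
```

String keys are used because a Map compares object keys by identity, so two separate coordinate objects with equal values would not match. If coordinates can differ by floating-point noise, the key would need rounding or a spatial structure instead.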
I don't know whether there's any kind of specific algorithm that is more efficient, but the following optimizations come to my mind immediately:
- Let j start with i+1; otherwise you are comparing all items twice against each other.
- Initialize a variable with myArray.length outside the loops, as the same operation is done twice.
- If the comparison is any kind of direct 'equal / larger', then it could help to sort the array first.
Update on Edit 1
I think the optimization depends on the number of expected matches. I.e., if all HalfEdge objects have a twin, then I think your current approach with the changes above is already pretty optimal.
However, if the percentage of expected twins is rather low, then I would suggest the following:
- Extract a list of all heads and a list of all tails, sort them, and compare them against each other. Remember which heads have found a twin tail.
Then run your original loops again, but only enter the inner loop for the heads that found a match.
I'm not sure this is optimal, but I hope you get my approach.
Without knowing more about the type of your items:
1) You should first sort your array; afterwards the comparison only needs to go forward. Sorting costs O(n log n) and the nested comparison loop is still O(n^2), but depending on the type of your items the sorted order could lead to further improvements.
2) Starting the inner loop from i + 1 roughly halves the number of comparisons, n(n-1)/2 instead of n^2, although that is still O(n^2).
const myArray = [ ...some stuff ].sort((a,b) => sortingComparison(a,b)); // sorting comparison must return a number
let currentItem;
let nextItem;
for (let i = 0; i < myArray.length; i++) {
  currentItem = myArray[i];
  for (let j = i + 1; j < myArray.length; j++) {
    nextItem = myArray[j];
    doSomeComparision(currentItem, nextItem);
  }
}
Bonus:
Here is some fancy functional code (if you are aiming for raw performance, the for-loop versions are faster):
function compare(value, array) {
  array.forEach((nextValue) => {
    // Do your comparison here
    // nextValue === value
  });
}
const myArray = [items];
myArray
  .sort((a, b) => a - b)
  // idx + 1 skips comparing an item with itself
  .forEach((v, idx) => compare(v, myArray.slice(idx + 1)));
Since the values are 3D coordinates, build an octree ( O(N) ) and add the items by their HEAD values. Then, for each item, look up its TAIL value in the already built octree ( O(N * k * log(N)) ), whose nodes contain at most k edges, which means only k comparisons at the lowest-level node for each TAIL. Finding each TAIL may also require traversing up to log(N) levels of the octree from top to bottom.
In total: O(N) with the constant of building the octree, plus O(N * k * log(N)) with a low enough number k of edges per node (and log(N) levels of octree).
When you look up a TAIL in the octree, any HEAD with the same value will be in the same node, which holds at most k elements; any "close enough" HEAD value will be inside that lowest-level node or its closest neighbors.
Are you looking for an exact HEAD == TAIL, or is some tolerance used? A tolerance could need a "loose octree", in my opinion.
If each edge has a defined length, then you can constrain the search radius by this value, provided the edges are symmetric in both directions.
For up to 5k-10k edges, there may be only 5-10 levels in the octree, depending on the edges-per-node limit; if this limit is picked to be around 2-4, then each HEAD would need only 10-40 operations to find its twin edge with the same TAIL value.

Javascript - Time and space complexity of splice and concat inside loop

I have a problem which requires a string to be transformed into another one by appending copies of its initial value to itself. The problem also allows single characters to be removed at some places.
Explanation
let x = "abba"; // First string
let y = "aba" // Second initial string
y("aba") => remove last "a" => y("ab") => y+initialY = "ab"+"aba" =>
y("ababa") => remove char at index 2 => y("abba") => y == x => success
My algorithm successfully solves the problem:
let x = "abbbbcccaaac"
let y = "abc"
let xArr = x.split('')
let yArr = y.split('')
let count = 0;
for (let i = 0; i < xArr.length; i++) {
  if(yArr[i] == undefined) {
    yArr = yArr.concat(y.split(''));
    count++;
  }
  if(xArr[i] != yArr[i]) {
    yArr.splice(i, 1);
    i--;
  }
}
console.log("Output is:", yArr.join(''))
console.log("Appends in order to transform:", count)
The algorithm works as intended, however, I am uncertain regarding its time and space complexity and most importantly - efficiency.
Is this algorithm in O(n) time complexity where n is the length of x?
If this is not O(n), can the problem be solved in O(n) time?
Does .concat(), .splice() or .split() somehow change the time complexity since they are nested in a for loop? What if they weren't, do they still change the time complexity of an algorithm and by how much?
Given the rules of this problem, is this an efficient way to solve it?
What is the space complexity of this algorithm?
Normally a question like this is quite difficult to give a definite answer to, because different implementations of JavaScript have different time complexities for basic array operations (such as creating a new array of size n). JavaScript arrays will typically be implemented either as dynamic arrays or hashtables, and these data structures have different performance characteristics.
So, there is no definitive time complexity for using splice to remove one element from an array. What we can say is that removing one element takes linear time for a dynamic array and, as @Ry- points out in the comments, also linear time for a hashtable, because of the need to renumber the later indices. We can also say that it's highly likely one of these two data structures is used, and no sensible implementation will take more than linear time to do splice.
Either way, the worst case for your algorithm is when x = 'aa...aa' and y = 'abb...bb', i.e. x is n copies of 'a', and y is 'a' followed by (m - 1) copies of 'b'.
For a dynamic array or a hashtable, then the time complexity for just the splice operations is O(nm²). This is because the outer loop iterates O(nm) times (note the i-- inside the loop, which happens every time the letter 'b' needs to be removed), and the splice operation requires shifting or renumbering O(m) elements in yArr after index i.
But suppose some more exotic data structure is used which supports removing an element in sub-linear time (e.g. a skip list). In that case, the above only gives O(nm) times the complexity of the "remove" operation. But we haven't counted concat yet; this creates a new data structure and copies every item into it, which will still take linear time. concat is called O(n) times and takes an average of O(n + m) time per call, so the complexity of just the concat operations is O(n² + nm).
So the time complexity is very likely O(n² + nm²), and certainly at least O(n² + nm); not linear.
The space complexity is O(n), since the length of yArr is never more than twice as long as xArr.
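For comparison, here is a sketch that avoids splice and concat entirely by walking x with a cursor into the current copy of y. It is not from the answer above: countAppends is a hypothetical name, it only returns the number of appended copies rather than building the string, and it assumes every character of x occurs in y (otherwise no transformation exists, and the question's loop would never finish). Under those assumptions the worst case is O(nm) character comparisons with O(1) extra space.

```javascript
// Sketch: counts appended copies of y without building or splicing an array.
// Assumption: every character of x occurs somewhere in y.
function countAppends(x, y) {
  let appends = 0;
  let j = 0; // position inside the current copy of y
  for (let i = 0; i < x.length; i++) {
    const ch = x[i];
    if (!y.includes(ch)) {
      throw new Error(`"${ch}" never occurs in y`);
    }
    // Skip (i.e. "remove") characters of y until one matches ch.
    while (true) {
      if (j === y.length) { // current copy is used up: append another one
        j = 0;
        appends++;
      }
      if (y[j] === ch) break;
      j++;
    }
    j++; // consume the matched character
  }
  return appends;
}

const appendsForExample = countAppends("abbbbcccaaac", "abc"); // 8, same as the question's code
const appendsForExplanation = countAppends("abba", "aba");     // 1, as in the Explanation
console.log(appendsForExample, appendsForExplanation);
```

Each character of x skips fewer than m characters of y before it finds a match, which is where the O(nm) bound comes from. The skipped characters correspond to the splice calls in the original code, only without the shifting cost.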

Why is creating objects with alphanumeric keys so slow in new node versions?

I run this test in different node versions:
function test() {
  var i;
  var bigArray = {};
  var start = new Date().getTime();
  for (i = 0; i < 100000; i += 1) {
    bigArray[i] = {};
    var j = Math.floor(Math.random() * 10000000);
    bigArray[i]["a" + j] = i.toString(32);
    if (i % 1000 === 0) console.log(i);
  }
  var end = new Date().getTime();
  var time = end - start;
  console.log('Execution time: ' + time);
}
test();
As you can see, it just creates an object with 100000 fields, where each field is an object with just one field. The key of this inner object is forced to be alphanumeric (if the key is numeric, it performs normally).
When I run this test in different JavaScript implementations/versions I get these results:
v0.8.28 -> 2716 ms
v0.10.40 -> 73570 ms
v0.12.7 -> 92427 ms
iojs v2.4.0 -> 510 ms
chrome -> 1473 ms
I have also tried to run this test in an asynchronous loop (each loop step in a different tick), but the results are similar to the ones shown above.
I can't understand why this test is so expensive in newer node versions.
Why is it so slow?
Is there any special v8 flag that can improve this test?
In order to handle large and sparse arrays, there are two types of array storage internally:
Fast Elements: linear storage for compact key sets
Dictionary Elements: hash table storage otherwise
It's best not to cause the array storage to flip from one type to another.
Therefore:
Use contiguous keys starting at 0 for Arrays
Don't pre-allocate large Arrays (e.g. > 64K elements) to their maximum size, instead grow as you go
Don't delete elements in arrays, especially numeric arrays
Don't load uninitialized or deleted elements
Source and more info: http://www.html5rocks.com/en/tutorials/speed/v8/
PS: this is supposed to improve considerably in the upcoming node.js+io.js version.

Which way of storing and operating on bitfields in javascript is the fastest? (200k+ bits)

I am profiling my JavaScript code, which is intended to be used in an embedded browser on Android (PhoneGap).
Basically, I need a very large bitfield (200k+ bits) for my calculations.
I've tried to put the bits into an array of unsigned integers, with each item storing 32 bits. This did reduce memory usage, but it made execution drastically slower (over 30 seconds for simply iterating over and reversing all bits in the bitfield on a modern PC!).
Then I made a good old-fashioned array of bools. This increased memory usage (but it was still less than 15 MB on Android for the entire PhoneGap framework around my code). Profiling showed me that the initial step in my algorithm, setting all elements of the bitfield to 1 (a simple for loop), takes half of the execution time (~1.5 seconds on PC, more than a few minutes on Android). I can rewrite my code so that the default value would be 0 instead of 1 (reversing all conditions), but I still don't know how to set such a large array to 0s fast.
Edit adding my code, as requested:
var count = 200000;
var myArr = [];
myArr.length = count;
for(var i = 0; i < count ; i++)
  myArr[i] = true;
Could someone point out how I can clear a very large array, or whether there is a faster way to store and operate on large bitfields in JavaScript?
See if this is a faster way to create the array:
var myArray = [true];
var desiredLength = 200000;
while (myArray.length < desiredLength) {
  myArray = myArray.concat(myArray);
}
if (myArray.length > desiredLength) {
  myArray.splice(desiredLength);
}
I've added a few more test cases to the jsperf page that Asad linked in his comment. By far the fastest in my browser (Chrome 23.0.1271.101 on Mac OS X 10.8.2) is this one:
var count = 200000;
var myArr = [];
for (var i = 0; i < count; i++) {
  myArr.push(true);
}
Why pre-fill the array in the first place? Use undefined to your advantage. Remember that undefined is a falsy value, so it will act exactly like 0/false when you do a boolean check.
var myArray = new Array(200000);
if (myArray[1]) {
  // I am a truthy value
} else {
  // I am a falsy value
}
So when you initialize the array this way, there is no reason to prefill! That means no extra processing, and you take advantage of the sparse Array!
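The 32-bits-per-item packing described in the question can also be sketched with a typed array. This is not one of the approaches benchmarked above, and it assumes Uint32Array (including its fill method) is available in the target environment, which may not hold for older embedded browsers. BitField and its method names are hypothetical.

```javascript
// Sketch of a packed bitfield: 32 bits per Uint32Array element.
class BitField {
  constructor(bitCount) {
    this.bitCount = bitCount;
    // Typed arrays start out zero-filled, so every bit begins as 0.
    this.words = new Uint32Array(Math.ceil(bitCount / 32));
  }
  get(i) {
    return (this.words[i >>> 5] >>> (i & 31)) & 1;
  }
  set(i, on) {
    if (on) this.words[i >>> 5] |= 1 << (i & 31);
    else this.words[i >>> 5] &= ~(1 << (i & 31));
  }
  setAll() {
    this.words.fill(0xffffffff); // also sets any unused padding bits in the last word
  }
  invert() {
    for (let w = 0; w < this.words.length; w++) this.words[w] = ~this.words[w];
  }
}

const bits = new BitField(200000);
bits.set(199999, true);
const bitBeforeInvert = bits.get(199999); // 1
bits.invert(); // touches 6250 words rather than 200000 separate entries
const bitsAfterInvert = [bits.get(199999), bits.get(0)]; // [0, 1]
console.log(bitBeforeInvert, bitsAfterInvert);
```

Because the array is zero-filled on creation, "clearing" is free at construction time, and setting everything to 1 is a single fill call over the words instead of a loop over individual bits. Whether this is actually faster on a given device would need to be measured.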
