Fastest way to compare each item in array with rest of array? - javascript

I have an array of items, and for each item in the array, I need to do some check against the rest of the items in the same array.
Here is the code I am using:
const myArray = [ /* ...some stuff */ ];
let currentItem;
let nextItem;
for (let i = 0; i < myArray.length; i++) {
  currentItem = myArray[i];
  for (let j = i + 1; j < myArray.length; j++) {
    nextItem = myArray[j];
    doSomeComparison(currentItem, nextItem);
  }
}
While this works, I need to find a more efficient algorithm because it slows down significantly if the array is very big.
Can someone provide some advice on how to make this algorithm better?
Edit 1
I apologize.
I should have provided more context around what I am trying to do here.
I am using the loop above with a HalfEdge data structure, a.k.a. DCEL (doubly connected edge list).
Basically, a HalfEdge is an object with 3 properties:
class HalfEdge {
  head; // some (x, y, z) coords
  tail; // some (x, y, z) coords
  twin; // reference to another HalfEdge
}
A twin of a given HalfEdge is defined like so:
/**
 * if two Half-Edges are twins:
 *
 *   Edge A   TAIL ----> HEAD
 *              =          =
 *   Edge B   HEAD <---- TAIL
 */
My array contains many HalfEdges, and for each HalfEdge in the array, I want to find its twin (i.e., one that satisfies the condition above).
Basically, I am comparing two 3D vectors (one from currentItem, the other from nextItem).
Edit 2
Fixed typo in code example (i.e., from let j = 0 to let j = i + 1)

Here is a linear-time solution to your problem. I am not that familiar with javascript, so I'll feel more comfortable giving you the algorithm in pseudocode.
lookup := hashtable()
for i := 0 .. myArray.length - 1
    twin_id := lookup[myArray[i].tail, myArray[i].head]
    if twin_id != null
        myArray[i].twin := twin_id
        myArray[twin_id].twin := i
    else
        lookup[myArray[i].head, myArray[i].tail] := i
The idea is to build a hash table keyed on (head, tail) pairs, and to check whether a (tail, head) pair already exists that matches the current node's. If so, the two are twins and we mark them as such; otherwise we add a new entry to the hash table. Every element is looped over exactly once, and insertion / retrieval from the hash table is done in constant time.
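For reference, a direct JavaScript translation of that idea might look like this (a sketch, assuming exact coordinate equality and that head/tail expose x, y, z):

// Key a Map on stringified (from, to) coordinates; assumes exact equality.
const edgeKey = (a, b) => `${a.x},${a.y},${a.z}|${b.x},${b.y},${b.z}`;

function linkTwins(halfEdges) {
  const lookup = new Map(); // (head, tail) key -> index of that half-edge
  for (let i = 0; i < halfEdges.length; i++) {
    const e = halfEdges[i];
    // A twin has our tail as its head and our head as its tail.
    const twinIndex = lookup.get(edgeKey(e.tail, e.head));
    if (twinIndex !== undefined) {
      e.twin = halfEdges[twinIndex];
      halfEdges[twinIndex].twin = e;
    } else {
      lookup.set(edgeKey(e.head, e.tail), i);
    }
  }
}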

I don't know whether there's any kind of specific algorithm that is more efficient, but the following optimizations come to mind immediately:
- Let j start with i + 1, otherwise you are comparing every pair of items twice.
- Read myArray.length into a variable outside the loops, as the same lookup is otherwise repeated on every iteration.
- If the comparison is any kind of direct equal / larger check, it could help to sort the array first.
Update on Edit 1
I think the optimization depends on the number of expected matches. I.e., if all HalfEdge objects have a twin, then I think your current approach, with the changes above, is already pretty optimal.
However, if the percentage of expected twins is rather low, then I would suggest the following:
- Extract a list of all heads and a list of all tails, sort them, and compare them against each other, remembering which heads found a twin tail.
- Then run your original loops again, but only enter the inner loop for the heads that found a match.
Not sure this is optimal, but I hope you get my approach.

Without knowing more about the type of the items:
1) You should first sort your array; afterwards the comparison can be done forward only. The sort costs O(n log n) and, depending on the type of your items, can enable further improvements (for example, breaking out of the inner loop early once no later item can match).
2) Starting the internal loop from i + 1 halves the number of comparisons, although the overall worst case remains O(n^2).
const myArray = [ /* ...some stuff */ ].sort((a, b) => sortingComparison(a, b)); // sorting comparison must return a number
let currentItem;
let nextItem;
for (let i = 0; i < myArray.length; i++) {
  currentItem = myArray[i];
  for (let j = i + 1; j < myArray.length; j++) {
    nextItem = myArray[j];
    doSomeComparison(currentItem, nextItem);
  }
}
Bonus:
Here is some fancy functional code (if you are aiming for raw performance, the for-loop versions are faster):
function compare(value, array) {
  array.forEach((nextValue) => {
    // Do your comparison here
    // nextValue === value
  });
}

const myArray = [ /* items */ ];
myArray
  .sort((a, b) => (a - b))
  .forEach((v, idx) => compare(v, myArray.slice(idx + 1)));

Since the values are 3D coordinates, build an octree (O(N)) and insert the edges keyed on their HEAD values. Then, for each edge, follow its TAIL value down the already-built octree (O(N k log N) overall), with each node containing at most k edges, which means only k comparisons in the lowest-level node reached for each TAIL. Finding each TAIL may also require traveling down up to log(N) levels of the octree, from top to bottom.
That is O(N), with the constant of building the octree, plus O(N * k * log(N)), with a low enough k edges per node (and log N levels of octree).
When you follow a TAIL into the octree, any HEAD with the same value will be in the same node (of at most k elements), and any "close enough" HEAD value will be inside that lowest-level node or its closest neighbors.
Are you looking for an exact HEAD == TAIL match, or is some tolerance used? Tolerance could require a "loose octree", imo.
If each edge has a defined length, then you can constrain the search radius by this value, provided edges are symmetric both ways.
For up to 5k - 10k edges, there may be only 5-10 levels in the octree depending on the per-node edge limit; if this limit is picked to be around 2-4, then each HEAD would need only 10-40 operations to find its twin edge with the same TAIL value.
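For what it's worth, here is a minimal sketch of the same spatial idea using a uniform grid hash instead of a full octree (my illustration, not this answer's code; eps is an assumed matching tolerance, and the cell size is chosen to be at least eps):

// Bucket edges by the grid cell their HEAD falls into, then for each edge
// check only the TAIL's cell and its 26 neighbouring cells.
function findTwinsSpatialHash(halfEdges, eps) {
  const cellOf = (p) =>
    `${Math.floor(p.x / eps)},${Math.floor(p.y / eps)},${Math.floor(p.z / eps)}`;

  const grid = new Map();
  for (const e of halfEdges) {
    const key = cellOf(e.head);
    if (!grid.has(key)) grid.set(key, []);
    grid.get(key).push(e);
  }

  const near = (a, b) =>
    Math.abs(a.x - b.x) <= eps &&
    Math.abs(a.y - b.y) <= eps &&
    Math.abs(a.z - b.z) <= eps;

  for (const e of halfEdges) {
    if (e.twin) continue;
    const cx = Math.floor(e.tail.x / eps);
    const cy = Math.floor(e.tail.y / eps);
    const cz = Math.floor(e.tail.z / eps);
    for (let dx = -1; dx <= 1 && !e.twin; dx++)
      for (let dy = -1; dy <= 1 && !e.twin; dy++)
        for (let dz = -1; dz <= 1 && !e.twin; dz++)
          for (const cand of grid.get(`${cx + dx},${cy + dy},${cz + dz}`) || [])
            if (cand !== e && !cand.twin &&
                near(cand.head, e.tail) && near(cand.tail, e.head)) {
              e.twin = cand;
              cand.twin = e;
              break;
            }
  }
}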

Related

moving from for() to map() - can't get my head around it

Wondering if someone can help - I'm wanting to use Array.map and Array.filter, but I'm so stuck in my for-loop thinking that, despite reading tutorials etc., I can't seem to get my head around this.
In this code, I have an array of objects. I want to:
compare each item in the array with the other items, ensuring that obj[i] != obj[j]
perform operations on the current item: check if item.target is null, compare the distance between the item and another item, and if that distance is smaller than the distance between the item and item.target, replace item.target with the other item.
code:
for (var i = 0; i < 111; i++) {
  var itm = {x: Math.random() * w, y: Math.random() * h, tgt: null};
  dotArr.push(itm);
}

function findTarget(itemA, itemB) {
  var x1 = itemA.x;
  var y1 = itemA.y;
  var x2 = itemB.x;
  var y2 = itemB.y;
  var distance = Math.sqrt((x2 -= x1) * x2 + (y2 -= y1) * y2);
  return distance;
}
for (var i = 0; i < dotArr.length; i++) {
  let itm = dotArr[i];
  for (var j = 0; j < dotArr.length; j++) {
    if (itm != dotArr[j]) {
      let itm2 = this.dotArr[j];
      if (itm.tgt == null) {
        itm.tgt = itm2;
      } else {
        let newDist = findTarget(itm, itm2);
        let curDist = findTarget(itm, itm.tgt);
        if (newDist < curDist) {
          itm.tgt = itm2;
        }
      }
    }
  }
}
All the 'multiply each value by 2' examples in the tutorials I read make sense, but I can't extrapolate that into an approach I can use all the time.
Expected results: I have a bunch of particles looping through a requestAnimationFrame() loop, checking the distances on each frame. Each particle finds the closest particle and sets it as 'tgt' (and then moves toward it in other code), updating every frame.
Summary
const distance = (a, b) =>
  Math.sqrt(Math.pow(b.x - a.x, 2) + Math.pow(b.y - a.y, 2))

const findClosest = (test, particles) => particles.reduce(
  ({val, dist}, particle) => {
    const d = distance(test, particle)
    return d < dist && d != 0 ? {val: particle, dist: d} : {val, dist}
  },
  {val: null, dist: Infinity}
).val

const addTargets = particles => particles.map(particle => {
  particle.tgt = findClosest(particle, particles)
  return particle
})
(This is hard to do in a snippet because of the cyclic nature of your data structure. JSON stringification doesn't work well with cycles.)
Change style for the right reason
You say you want to change from for-loops to map, filter, et al., but you don't say why. Make sure you're doing this for appropriate reasons. I am a strong advocate of functional programming, and I generally push the junior developers I'm responsible for to make such changes. But I explain the reasons.
Here is the sort of explanation I make:
"When you're doing a loop, you're doing it for a reason. If you are looking to transform a list of values one-by-one into another list of values, then there is a built-in called map which makes your code clearer and simpler. When you're trying to check for those which should be kept, then you have filter, which makes your code clearer and simpler. When you want to find the first item in a list with a certain property, you have find, which, again, is clearer and simpler. And if you are trying to combine the elements until you're reduced them to a single value, you can use reduce, which, surprise, surprise, is cleaner and simpler.
"The reason to use these is to better express the intent of your code. Your intent is pretty well never going to be 'to continually increment the value of some counter starting with some value and ending when some condition is met, performing some routine on each iteration.' If you can use tools that better express your goals, then your code is easier to understand. So look for where map, filter, find, and reduce make sense in your code.
"Not every for-loop fits one of these patterns, but a large subset of them will. Replacing those that do fit will make for more understandable, and therefore more maintainable, code."
I will go on from there to explain the advantages of never worrying about fencepost errors and how some of these functions can work with more generic types, making it easier to reuse such code. But this is the basic gist I use with my teams.
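To make those four patterns concrete, here is a tiny illustrative example (mine, not part of the original explanation):

const nums = [1, 2, 3, 4, 5];

nums.map(n => n * 2);                 // transform each value -> [2, 4, 6, 8, 10]
nums.filter(n => n % 2 === 0);        // keep matching values -> [2, 4]
nums.find(n => n > 3);                // first match          -> 4
nums.reduce((sum, n) => sum + n, 0);  // combine to one value -> 15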
You need to decide why you're changing, and if it makes sense in your case. There is a real possibility, given your requirements, that it doesn't.
The functions map, find, and filter work only on individual items in your list. reduce works on one item and the currently accumulated value. It looks as though your requirement is to work pair-wise across all the values. That might mean that none of these functions is a good fit.
Or perhaps they do fit. Read on for how I would solve this.
Names are important
You include a function called findTarget. I would assume that such a function somehow or other finds a target. In fact, all it does is calculate the distance between two items.
Imagine coming to someone else's code and reading through the code that uses findTarget. Until you read that function, you will have no idea that it's simply calculating a distance. The code will seem strange. It will be much harder to understand than if you just named it distance.
Also, using item or the shortened version itm does not tell the reader anything about what these are. (Update: a change to the post points out that these are 'particles', so I will use that rather than itm in the code.)
Avoid trickiness
That findTarget/distance function does something strange and somewhat difficult to follow. It modifies computation variables in the middle of the computation: (x2-=x1)*x2 and (y2-=y1)*y2. While I can see that this works out the same, it's easy to write a very clear distance function without this trickiness:
const distance = (a, b) =>
  Math.sqrt((b.x - a.x) * (b.x - a.x) + (b.y - a.y) * (b.y - a.y))
There are many variants of this that are just as clear.
const distance = (a, b) =>
  Math.sqrt(Math.pow(b.x - a.x, 2) + Math.pow(b.y - a.y, 2))
And one day we'll be able to do
const distance = (a, b) => Math.sqrt((b.x - a.x) ** 2 + (b.y - a.y) ** 2)
Any of these would make for much clearer code. You could also use intermediate variables such as dx/dy or deltaX/deltaY if that made it clearer to you.
Look carefully at your requirements
It took me far too long looking at your code to determine what precisely you were trying to do.
If you can break apart the pieces you need into named functions, it's often significantly easier to write, and it's generally much easier for someone else to understand (or even for yourself a few weeks later.)
So, if I understand the problem correctly now, you have a list of positioned objects, and for each one of them you want to update them with a target, that being the object closest to them. That sounds very much like map.
Given that, I think the code should look something like:
const addTargets = particles => particles.map(item => ({
  x: item.x,
  y: item.y,
  tgt: findClosest(item, particles)
}))
Now I don't know how findClosest will work yet, but I expect that this matches the goal if only I could write that.
Note that this version takes seriously my belief in the functional programming concept of immutability. But it won't quite do what you want, because a particle's target will be the one from the old list and not one from its own list. I personally might look at altering the data structure to fix this. But instead, let's ease that restriction and rather than returning new items, we can update items in place.
const addTargets = particles => particles.map(particle => {
  particle.tgt = findClosest(particle, particles)
  return particle
})
So notice what we're doing here: we're turning a list of items without targets (or with null ones) into a list of items with them. But we break this into two parts: one converts the elements without the targets to ones with them; the second finds the appropriate target for a given element. This more clearly captures the requirements.
We still have to figure out how to find the appropriate target for an element. In the abstract, what we're doing is taking a list of elements and turning it into a single one. That's reduce. (This is not a find operation, since it has to check everything in the list.)
Let's write that, then:
const findClosest = (test, particles) => particles.reduce(
  ({val, dist}, particle) => {
    const d = distance(test, particle)
    return d < dist && d != 0 ? {val: particle, dist: d} : {val, dist}
  },
  {val: null, dist: Infinity}
).val
We use the distance for dual purposes here. First, of course, we're looking at how far apart two particles are. But second, we assume that another particle in the same exact location is the same particle. If that is not accurate, you'll have to alter this a bit.
At each iteration, we have a new object with val and dist properties, which always represents the closest particle we've found so far and its distance from our current particle. At the end, we just return the val property. (The reason for Infinity is that every particle will be closer than that, so we don't need specific logic to test the first one.)
Conclusion
In the end we were able to use map and reduce. Note that in this example we have two reusable helper functions, but each is used just once. If you don't need to reuse them, you could fold them into the functions that call them. But I would not recommend it. This code is fairly readable. Folded in, these would be less expressive.
dotArr.map(itemI => {
  const closestTarget = dotArr.reduce((currentMax, itemJ) => {
    const newDistance = findTarget(itemI, itemJ);
    if (newDistance === 0) { // it is the same item, because its distance is 0
      return currentMax;
    }
    if (currentMax === null || newDistance < currentMax.distance) {
      return {item: itemJ, distance: newDistance};
    }
    return currentMax;
  }, null);
  itemI.tgt = closestTarget.item;
  return itemI;
});
After constructing this example, I found that you are using a very complex example to figure out how map works.
Array.map is typically used for one value at a time, so we can use it for [i]. Then we need to iterate over all the other values in the array using [j], but we can't do this with map, because we only care about the closest [j]. Instead we can use Array.reduce, which is also an accumulator like Array.map, but its end result is whatever you want it to be, while the end result of Array.map is always an array of the same length.
What my reduce function does is iterate through the entire list, similar to [j]. I initialize currentMax as null, so when j == 0, currentMax === null. Then I figure out what state [j] is in compared to [i]. The return statement determines what currentMax will be equal to at [j+1].
When I finally find the closest target, I can just assign it to itemI.tgt, and I have to return the item so that the new map knows what it looks like at the current index.
Without looking at Array.map, this is how I imagine it is implemented:
function myMap(inputArray, callback) {
  const newArray = [];
  for (let i = 0; i < inputArray.length; i++) {
    newArray.push(callback(inputArray[i], i, inputArray));
  }
  return newArray;
}
So this is why you always need to write return.
I think in this instance you want to use reduce and NOT map.
reduce can allow you to "Reduce" an array of items to a single item.
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/reduce
example
let largestThing = arrayOfThings.reduce(function (largest, nextItem) {
  if (largest == null) {
    return nextItem;
  }
  if (largest.prop > nextItem.prop) {
    return largest;
  }
  return nextItem;
}, null);
The null passed as reduce's second argument is the starting "largest" value seen by the callback.

Javascript generating variations with exclusion from only one array [duplicate]

A small application that I have written allows a user to add various items to two arrays. Some logic calculates a figure from the contents of each array.
Any items in array x can be placed into array y, and back again. Items belonging in array y can never be moved (unless they were moved from array x).
The user can move these items around in two lists using a simple javascript UI. To make things simpler, I originally made a naive script which:
- Moved an item from x to y.
- Performed some logic using this 'possibility'.
- If the result was less than before, left the item in y.
- If not, the item remained in x.
- Moved on to the next item in x and repeated.
I knew that this was ineffective. I have read around and have been told to do this using bitwise math to remember the possibilities or 'permutations', but I am struggling to get my head around this particular problem at this stage.
If anyone would be able to explain (pseudo code is fine) what would be the best way to achieve the following I would be very grateful.
array x = [100,200,300,400,500]
array y = [50,150,350,900]
With these two arrays, for each value from x, push every combination of that value and all the other values from x into array y. For each combination I will perform some logic (i.e., test the result) and store this 'permutation' in an array (an object of two arrays representing x and y). I foresee this being quite expensive with large arrays and likely to repeat a lot of combinations. I feel like I'm almost there, but I'm lost at this last stage.
Sorry for the long explanation, and thanks in advance!
Use this for creating the power set of x:
function power(x, y) {
  var r = [y || []], // an empty set/array as fallback
      l = 1;
  for (var i = 0; i < x.length; l = 1 << ++i) // OK, l is just r.length, but this looks nicer :)
    for (var j = 0; j < l; j++) {
      r.push(r[j].slice(0)); // copy
      r[j].push(x[i]);
    }
  return r;
}
Usage:
> power([0,2], [5,6])
[[5,6,0,2], [5,6,2], [5,6,0], [5,6]]
I have been told to do this using bitwise math to remember the possibilities or 'permutations' but I am struggling to get my head around this particular problem at this stage.
It works by iterating from 0 to 2^n (for an array of length n), using the single bits of the counter to determine whether an item should be included in the subset. Example for an array [a, b]:
i    binary    included in set
------------------------------
0    00        { }
1    01        { b }
2    10        { a }
3    11        { a, b }
We can use bitwise operators in JS for arrays with up to 31 items (which should be enough).
function power(x, y) {
  var l = Math.pow(2, x.length),
      r = new Array(l);
  for (var i = 0; i < l; i++) {
    var sub = y ? y.slice(0) : [];
    for (var j = 0; j < x.length; j++)
      // if the jth bit from the right is set in i
      if (i & Math.pow(2, j)) // Math.pow(2,j) === 1<<j
        sub.push(x[j]);
    r[i] = sub;
  }
  return r;
}
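For comparison, this version produces the same subsets as the first implementation, just enumerated in a different order:
> power([0,2], [5,6])
[[5,6], [5,6,0], [5,6,2], [5,6,0,2]]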

Arrays Inside Objects: Conditional Updating Of One Object's Array From Another Object's Array

DISCLAIMER
I have absolutely no idea how to succinctly describe the nature of the problem I am trying to solve without going deep into context; it took me forever to even think of an appropriate title. For this reason I've found it nearly impossible to find an answer, both on here and on the web at large, that will assist me. It's possible my question can be distilled down into something simple which already has an answer on here. If this is the case, I apologise for the elaborate duplicate.
TL;DR
I have two arrays: a main array members and a destination array neighbours (technically many destination arrays but this is the tl;dr). The main array is a property of my custom group object which is auto-populated with custom ball objects. The destination array is a property of my custom ball object. I need to scan each element inside of the members array and calculate distance between that element and every other element in the members group. If there exist other elements within a set distance of the current element then these other elements need to be copied into the current element's destination array. This detection needs to happen in realtime. When two elements become close enough to be neighbours they are added to their respective neighbours array. The moment they become too far apart to be considered neighbours they need to be removed from their respective neighbours array.
CONTEXT
My question is primarily regarding array iteration, comparison and manipulation but to understand my exact dilemma I need to provide some context. My contextual code snippets have been made as brief as possible. I am using the Phaser library for my project, but my question is not Phaser-dependent.
I have made my own object called Ball. The object code is:
Ball = function Ball(x, y, r, id) {
  this.position = new Vector(x, y); // pseudocode Phaser replacement
  this.size = r;
  this.id = id;
  this.PERCEPTION = 100;
  this.neighbours = []; // the destination array this question is about
}
All of my Ball objects (so far) reside in a group. I have created a BallGroup object to place them in. The relevant BallGroup code is:
BallGroup = function BallGroup(n) { // create n amount of Balls
  this.members = []; // the main array I need to iterate over
  /* fill the array with n amount of balls upon group creation */
  for (i = 0; i < n; i++) {
    /* code for x, y, r, id generation not included for brevity */
    this.members.push(new Ball(_x, _y, _r, _i));
  }
}
I can create a group of 4 Ball objects with the following:
group = new BallGroup(4);
This works well and with the Phaser code I haven't included I can click/drag/move each Ball. I also have some Phaser.utils.debug.text(...) code which displays the distance between each Ball in an easy to read 4x4 table (with duplicates of course as distance Ball0->Ball3 is the same as distance Ball3->Ball0). For the text overlay I calculate the distance with a nested for loop:
for (a = 0; a < group.members.length; a++) {
  for (b = 0; b < group.members.length; b++) {
    distance = Math.floor(Math.sqrt(Math.pow(Math.abs(group.members[a].x - group.members[b].x), 2) + Math.pow(Math.abs(group.members[a].y - group.members[b].y), 2)));
    // Phaser text code
  }
}
Now to the core of my problem. Each Ball has a range of detection PERCEPTION = 100. I need to iterate over every group.members element and calculate the distance between that element (group.members[a]) and every other element within the group.members array (this calculation I can do). The problem I have is I cannot then copy those elements whose distance to group.members[a] is < PERCEPTION into the group.members[a].neighbours array.
The reason I have my main array (BallGroup.members) inside one object and my destination array inside a different object (Ball.neighbours) is because I need each Ball within a BallGroup to be aware of its own neighbours without caring what the neighbours are for every other Ball within the BallGroup. However, I believe that the fact these two arrays (main and destination) are within different objects is why I am having so much difficulty.
But there is a catch. This detection needs to happen in realtime and when two Balls are no longer within the PERCEPTION range they must then be removed from their respective neighbours array.
EXAMPLE
group.members[0] -> no neighbours
group.members[1] -> in range of [2] and [3]
group.members[2] -> in range of [1] only
group.members[3] -> in range of [1] only
//I would then expect group.members[1].neighbours to be an array with two entries,
//and both group.members[2].neighbours and group.members[3].neighbours to each
//have the one entry. group.members[0].neighbours would be empty
I drag group.members[2] and group.members[3] away to a corner by themselves
group.members[0] -> no neighbours
group.members[1] -> no neighbours
group.members[2] -> in range of [3] only
group.members[3] -> in range of [2] only
//I would then expect group.members[2].neighbours and group.members[3].neighbours
//to be arrays with one entry. group.members[1] would change to have zero entries
WHAT I'VE TRIED
I've tried enough things to confuse any person, which is why I'm coming here for help. I first tried complex nested for loops and if/else statements. This resulted in neighbours being infinitely added and started to become too complex for me to keep track of.
I looked into Array.forEach and Array.filter. I couldn't figure out whether forEach could be used for what I needed, and I got very excited learning about what filter does (return an array of elements that match a condition). When using Array.filter, it either gives the Ball object zero neighbours or includes every other Ball as a neighbour regardless of distance (I can't figure out why it does what it does, but it definitely isn't what I need it to do). At the time of writing this question, my current code for detecting neighbours is this:
BallGroup = function BallGroup(n) {
  this.members = []; // the main array I need to iterate over
  // other BallGroup code here
  this.step = function step() { // this function will run once per frame
    for (a = 0; a < this.members.length; a++) { // members[a] to be current element
      for (b = 0; b < this.members.length; b++) { // members[b] to be all other elements
        if (a != b) { // make sure the same element isn't being compared against itself
          var distance = Math.sqrt(Math.pow(Math.abs(this.members[a].x - this.members[b].x), 2) + Math.pow(Math.abs(this.members[a].y - this.members[b].y), 2));
          function getNeighbour(element, index, array) {
            if (distance < element.PERCEPTION) {
              return true;
            }
          }
          this.members[a].neighbours = this.members.filter(getNeighbour);
        }
      }
    }
  }
}
I hope my problem makes sense and is explained well enough. I know exactly what I need to do in the context of my own project, but putting that into words for others to understand who have no idea about my project has been a challenge. I'm learning Javascript as I go and have been doing great so far, but this particular situation has me utterly lost. I'm in too deep, but I don't want to give up - I want to learn!
Many, many, many thanks for those who took the time read my very long post and tried provide some insight.
edit: changed a > to a <
I was learning more about object literals; I'm trying to learn JS to wean myself off my jQuery dependency. I'm making a simple library, and I made a function that adds the properties of one object to another object. It's untested, but I think if you apply something similar it might help. I'll try to find my resources. Btw, I don't have the articles on hand right now, but I recall that using new could incur complications; sorry I can't go any further than that, I'll post more info as I find it.
xObject could be the ball group
Obj2 could be the members
Obj1 could be the destination
/* augment(Obj1, Obj2) | Adds properties of Obj2 to Obj1. */

// xObject has augment() as a method called aug
var xObject = {
  aug: augment
};

/* Immediately-Invoked Function Expression (IIFE) */
(function () {
  var Obj1 = {},
      Obj2 = {
        bool: true,
        num: 3,
        str: "text"
      };
  xObject.aug(Obj1, Obj2);
}()); // invoke immediately

function augment(Obj1, Obj2) {
  var prop;
  for (prop in Obj2) {
    if (Obj2.hasOwnProperty(prop) && !Obj1[prop]) {
      Obj1[prop] = Obj2[prop];
    }
  }
}

How to split an array into two subsets and keep sum of sub-values of array as equal as possible

I really need a master of algorithms here! So the thing is, I have, for example, an array like this:
[
  [870, 23],
  [970, 78],
  [110, 50]
]
and I want to split it up, so that it looks like this:
// first array
[
  [970, 78]
]

// second array
[
  [870, 23],
  [110, 50]
]
So now, why do I want it to look like this? Because I want to keep the sums of the sub-values as equal as possible: 970 is about 870 + 110, and 78 is about 23 + 50.
In this case it's very easy, because if you just split them and only look at the first sub-value, it will already be correct. But I want to check both and keep them as equal as possible, so that it'll also work with an array which has 100 sub-arrays! If anyone can tell me an algorithm with which I can program this, that would be really great!
Scales:
- ~1000 elements (sublists) in the array
- Elements are integers up to 10^9
- I am looking for a "close enough" solution - it does not have to be the exact optimal solution.
First, as already established, the problem is NP-hard, by reduction from the Partition Problem.
Reduction:
Given an instance of the partition problem, create sublists of size 1 each. The result is exactly this problem.
Conclusion from the above:
This problem is NP-Hard, and there is no known polynomial solution.
Second, any exponential or pseudo-polynomial solution will take just too long to work, due to the scale of the problem.
Third, that leaves us with heuristics and approximation algorithms.
I suggest the following approach:
Normalize the scales of the sublists, so all the elements are on the same scale (say, all normalized to the range [-1, 1], or all normalized to a standard normal distribution).
Create a new list, in which, each element will be the sum of the matching sublist in the normalized list.
Use some approximation or heuristic solution that was developed for the subset-sum / partition problem.
The result will not be optimal, but optimal is really unattainable here.
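To make step 3 concrete, here is one simple heuristic you could plug in there: a greedy sketch (my illustration, using raw sums rather than the normalization suggested above) that sorts the sublists by their summed value and always gives the next one to the side with the smaller running total:

// Greedy partition heuristic: O(n log n), approximate, not optimal.
function greedyPartition(pairs) {
  // Step 2 of the answer: collapse each sublist to a single score.
  const scored = pairs
    .map(pair => ({ pair, score: pair[0] + pair[1] }))
    .sort((a, b) => b.score - a.score);

  const left = [], right = [];
  let leftSum = 0, rightSum = 0;
  for (const { pair, score } of scored) {
    if (leftSum <= rightSum) {
      left.push(pair);
      leftSum += score;
    } else {
      right.push(pair);
      rightSum += score;
    }
  }
  return [left, right];
}

On the question's example, greedyPartition([[870, 23], [970, 78], [110, 50]]) puts [970, 78] alone on one side and the other two pairs on the other, matching the split the asker wanted.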
From what I gather from the discussion under the original post, you're not searching for a single splitting point, but rather you want to distribute all pairs among two sets, such that the sums in each of the two sets are approximately equal.
Since a close enough solution is acceptable, maybe you could try an approach based on simulated annealing?
(see http://en.wikipedia.org/wiki/Simulated_annealing)
In short, the idea is that you start out by randomly assigning each pair to either the Left or the Right set. Next, you generate a new state by either:
a) moving a randomly selected pair from the Left to the Right set,
b) moving a randomly selected pair from the Right to the Left set, or
c) doing both.
Next, determine if this new state is better or worse than the current state. If it is better, use it. If it is worse, take it only if it is accepted by the acceptance probability function, which is a function that initially allows worse states to be used, but favours them less and less as time moves on (or as the "temperature decreases", in SA terms).
After a large number of iterations (say 100,000), you should have a pretty good result.
Optionally, rerun this algorithm multiple times because it may get stuck in local optima (although the acceptance probability function attempts to counter this).
Advantages of this approach are that it's simple to implement, and you can decide for yourself how long you want it to continue searching for a better solution.
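A bare-bones sketch of that annealing loop in JavaScript (my illustration; the energy function, linear cooling schedule, and acceptance scaling are all assumptions you would tune):

// Simulated annealing for splitting pairs into two sets with near-equal sums.
function anneal(pairs, iterations = 100000) {
  // assignment[i] === true -> pairs[i] is in the Left set
  const assignment = pairs.map(() => Math.random() < 0.5);

  // Energy: total imbalance between the two sets, over both components.
  const energy = () => {
    let d0 = 0, d1 = 0;
    pairs.forEach(([a, b], i) => {
      const sign = assignment[i] ? 1 : -1;
      d0 += sign * a;
      d1 += sign * b;
    });
    return Math.abs(d0) + Math.abs(d1);
  };

  let current = energy();
  for (let k = 0; k < iterations; k++) {
    const temp = 1 - k / iterations; // linear cooling: 1 -> 0
    // Neighbouring state: flip one randomly chosen pair to the other set.
    const i = Math.floor(Math.random() * pairs.length);
    assignment[i] = !assignment[i];
    const next = energy();
    // Always accept improvements; accept worse states with a probability
    // that shrinks as the temperature drops (scale factor is arbitrary).
    const accept = next <= current ||
      Math.random() < Math.exp((current - next) / (temp * 1000 + 1e-9));
    if (accept) {
      current = next;
    } else {
      assignment[i] = !assignment[i]; // revert the move
    }
  }
  return {
    left:  pairs.filter((_, i) => assignment[i]),
    right: pairs.filter((_, i) => !assignment[i]),
  };
}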
I'm assuming that we're just looking for a place in the middle of the array to split it into its first and second part.
It seems like a linear algorithm could do this. Something like this in JavaScript:
arrayLength = 2;
tolerance = 10;

// Initialize the two sums.
firstSum = [];
secondSum = [];
for (j = 0; j < arrayLength; j++)
{
  firstSum[j] = 0;
  secondSum[j] = 0;
  for (i = 0; i < arrays.length; i++)
  {
    secondSum[j] += arrays[i][j];
  }
}

// Try splitting at every place in "arrays".
// Try to get the sums as close as possible.
for (i = 0; i < arrays.length; i++)
{
  goodEnough = true;
  for (j = 0; j < arrayLength; j++)
  {
    if (Math.abs(firstSum[j] - secondSum[j]) > tolerance)
      goodEnough = false;
  }
  if (goodEnough)
  {
    alert("split before index " + i);
    break;
  }
  // Update the sums for the new position.
  for (j = 0; j < arrayLength; j++)
  {
    firstSum[j] += arrays[i][j];
    secondSum[j] -= arrays[i][j];
  }
}
Thanks for all the answers; the brute-force attack was a good idea, and NP-hardness is related to this too, but it turns out that this is a multiple knapsack problem and can be solved using this pdf document.

What's the time complexity of array.splice() in Google Chrome?

If I remove one element from an array using splice() like so:
arr.splice(i, 1);
Will this be O(n) in the worst case because it shifts all the elements after i? Or is it constant time, with some linked list magic underneath?
Worst case should be O(n) (copying all n-1 elements to new array).
A linked list would be O(1) for a single deletion.
For those interested, I've made this lazily-crafted benchmark. (Please don't run it on Windows XP/Vista.) As you can see from it, though, splice looks fairly constant (i.e., O(1)), so who knows what they're doing behind the scenes to make it crazy fast. Note that, regardless, the actual splice is VERY fast.
Rerunning an extended benchmark directly in the V8 shell suggests O(n). Note, though, that you need huge array sizes to get a runtime that's likely to affect your code. This should be expected: if you look at the V8 code, it uses memmove to create the new array.
The Test:
I took the advice in the comments and wrote a simple test to time splicing a data-set of 3,000 arrays, each containing 3,000 items. The test would simply splice the:
- first item in the first array
- second item in the second array
- third item in the third array
- ...
- 3000th item in the 3000th array
I pre-built the array to keep things simple.
The Findings:
The weirdest thing is that the number of splice() calls that take longer than 1ms grows linearly as you increase the size of the dataset.
I went as far as testing it for a dataset of 300,000 on my machine (but the SO snippet tends to crash after 3,000).
I also noticed that the number of splice()s that took longer than 1ms for a given dataset (30,000 in my case) was random, so I ran the test 1,000 times and plotted the results; they looked like a normal distribution, leading me to believe that the randomness was just caused by scheduler interrupts.
This goes against my hypothesis and @Ivan's guess that splice()ing from the beginning of an array will have O(n) time complexity.
Below is my test:
let data = []
const results = []
const dataSet = 3000

function spliceIt(i) {
  data[i].splice(i, 1)
}

function test() {
  for (let i = 0; i < dataSet; i++) {
    let start = Date.now()
    spliceIt(i);
    let end = Date.now()
    results.push(end - start)
  }
}

function setup() {
  data = (new Array(dataSet)).fill().map(arr => new Array(dataSet).fill().map(el => 0))
}

setup()
test()
// console.log("data before test", data)
// console.log("data after test", data)
// console.log("all results: ", results)
console.log("results that took more than 1ms: ", results.filter(r => r >= 1))
Hi!
I did an experiment myself and would like to share my findings. The experiment was very simple: we ran 100 splice operations on an array of size n and calculated the average time each splice call took. Then we varied the size of n to check how it behaves.
This graph summarizes our findings for big numbers: for big numbers it seems to behave linearly.
We also checked with "small" numbers (they were still quite big, but not as big): in this case it seems to be constant.
If I had to decide on one option, I would say it is O(n), because that is how it behaves for big numbers. Bear in mind, though, that the linear behaviour only shows for VERY big numbers.
However, it is hard to give a definitive answer, because the array implementation in JavaScript depends a lot on how the array is declared and manipulated.
I recommend this stackoverflow discussion and this quora discussion to understand how arrays work.
I ran it on node v10.15.3, and the code used is the following:
const f = async () => {
  const n = 80000000;
  const tries = 100;
  const array = [];
  for (let i = 0; i < n; i++) { // build initial array
    array.push(i);
  }
  let sum = 0;
  for (let i = 0; i < tries; i++) {
    const index = Math.floor(Math.random() * (n));
    const start = new Date();
    array.splice(index, 1); // UNCOMMENT FOR OPTION A
    // array.splice(index, 0, -1); // UNCOMMENT FOR OPTION B
    const time = new Date().getTime() - start.getTime();
    sum += time;
    array.push(-2); // UNCOMMENT FOR OPTION A, to keep it of size n
    // array.pop(); // UNCOMMENT FOR OPTION B, to keep it of size n
  }
  console.log('for an array of size', n, 'the average time of', tries, 'splices was:', sum / tries);
};
f();
Note that the code has an Option B: we did the same experiment with the three-argument splice call that inserts an element. It behaved similarly.
