A small application that I have written allows a user to add various items to two arrays. Some logic calculates a figure from the contents of each array.
Any items in array x can be placed into array y, and back again. Items belonging in array y can never be moved (unless they were moved from array x).
The user can move these items around in two lists using a simple JavaScript UI. To make things simpler, I originally made a naive script which:
Moved an item from x to y.
Performed some logic using this 'possibility'.
If the result was less than before, left the item in y.
If not, moved the item back to x.
Moved on to the next item in x and repeated.
I knew that this was ineffective. I have read around and have been told to do this using bitwise math to remember the possibilities or 'permutations', but I am struggling to get my head around this particular problem at this stage.
If anyone would be able to explain (pseudo code is fine) what would be the best way to achieve the following I would be very grateful.
array x = [100,200,300,400,500]
array y = [50,150,350,900]
With these two arrays, for each value from x, push every combination of that value and all the other values from x into array y. For each combination I will perform some logic (i.e. test the result) and store this 'permutation' in an array (an object of two arrays representing x and y). I foresee this as being quite expensive with large arrays, and likely to repeat a lot of combinations. I feel like I'm almost there, but I'm lost at this last stage.
Sorry for the long explanation, and thanks in advance!
Use this for creating the power set of x:
function power(x, y) {
    var r = [y || []], // an empty set/array as fallback
        l = 1;
    for (var i = 0; i < x.length; l = 1 << ++i) // OK, l is just r.length here, but this looks nicer :)
        for (var j = 0; j < l; j++) {
            r.push(r[j].slice(0)); // copy
            r[j].push(x[i]);
        }
    return r;
}
Usage:
> power([0,2], [5,6])
[[5,6,0,2], [5,6,2], [5,6,0], [5,6]]
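A minimal sketch of how this plugs back into the original question: enumerate every candidate y with power(), score each one, and keep the best. evaluateResult is a hypothetical stand-in for the question's "some logic":

var x = [100, 200, 300, 400, 500];
var y = [50, 150, 350, 900];

// Hypothetical scoring function standing in for the question's calculation.
function evaluateResult(arr) {
    return arr.reduce(function (sum, v) { return sum + v; }, 0);
}

var best = null, bestScore = Infinity;
power(x, y).forEach(function (candidateY) {
    var score = evaluateResult(candidateY);
    if (score < bestScore) {
        bestScore = score;
        best = candidateY;
    }
});
// best now holds the candidate y with the lowest result;
// the items of x missing from it are the ones that stay in x.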
I have been told to do this using bitwise math to remember the possibilities or 'permutations' but I am struggling to get my head around this particular problem at this stage.
It would be iterating from 0 to 2^n - 1 (for an array of length n), using single bits to determine whether an item should be included in the subset. Example for an array [a,b]:
i    binary    included in set
------------------------------
0      00      { }
1      01      { b }
2      10      { a }
3      11      { a, b }
We can use bitwise operators in JS for arrays with up to 31 items (which should be enough).
function power(x, y) {
    var l = Math.pow(2, x.length),
        r = new Array(l);
    for (var i = 0; i < l; i++) {
        var sub = y ? y.slice(0) : [];
        for (var j = 0; j < x.length; j++)
            // if the jth bit from the right is set in i
            if (i & Math.pow(2, j)) // Math.pow(2,j) === 1<<j
                sub.push(x[j]);
        r[i] = sub;
    }
    return r;
}
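Usage is the same as above; only the order of the subsets differs, since it follows the bit patterns:

power([0, 2], [5, 6])
// [[5,6], [5,6,0], [5,6,2], [5,6,0,2]]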
Related
I have an array of items, and for each of the item in the array, I need to do some check against the rest of the items in the same array.
Here is the code I am using:
const myArray = [ ...some stuff ];
let currentItem;
let nextItem;
for (let i = 0; i < myArray.length; i++) {
    currentItem = myArray[i];
    for (let j = i + 1; j < myArray.length; j++) {
        nextItem = myArray[j];
        doSomeComparision(currentItem, nextItem);
    }
}
While this works, I need to find a more efficient algorithm because it slows down significantly if the array is very big.
Can someone provide some advice on how to make this algorithm better?
Edit 1
I apologize.
I should have provided more context around what I am trying to do here.
I am using the loop above with a HalfEdge data structure, a.k.a. DCEL.
Basically, a HalfEdge is an object with 3 properties:
class HalfEdge {
    head; // some (x,y,z) coords
    tail; // some (x,y,z) coords
    twin; // reference to another HalfEdge
}
A twin of a given HalfEdge is defined like so:
/**
 * if two Half-Edges are twins:
 * Edge A   TAIL ----> HEAD
 *            =          =
 * Edge B   HEAD <---- TAIL
 */
My array contains many HalfEdges, and for each HalfEdge in the array, I want to find its twin (i.e., one that satisfies the condition above).
Basically, I am comparing two 3D vectors (one from currentItem, the other from nextItem).
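In code, the twin condition above could be expressed as a small predicate; a sketch assuming exact coordinate equality and {x, y, z} vectors (areTwins is an illustrative name, not from the original code):

// Hypothetical helper: exact twin test for two HalfEdges.
function areTwins(a, b) {
    const eq = (p, q) => p.x === q.x && p.y === q.y && p.z === q.z;
    return eq(a.head, b.tail) && eq(a.tail, b.head);
}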
Edit 2
Fixed typo in code example (i.e., from let j = 0 to let j = i + 1)
Here is a linear-time solution to your problem. I am not that familiar with JavaScript, so I'll feel more comfortable giving you the algorithm in pseudo-code.
lookup := hashtable()
for i .. myArray.length
    twin_id := lookup[myArray[i].tail, myArray[i].head]
    if twin_id != null
        myArray[i].twin := twin_id
        myArray[twin_id].twin := i
    else
        lookup[myArray[i].head, myArray[i].tail] = i
The idea is to construct a hash table of (head, tail) pairs, and to check if a (tail, head) pair already exists that matches the current node's. If so, they are twins, and mark them as such, otherwise update the hash table with a new entry. Every element is looped over exactly once, and insertion / retrieval from the hash table is done in constant time.
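A minimal JavaScript sketch of the same idea, assuming head and tail are {x, y, z} objects and that twins must match exactly (a string key stands in for the (head, tail) pair):

function findTwins(myArray) {
    const lookup = new Map();
    const key = (a, b) => a.x + ',' + a.y + ',' + a.z + '|' + b.x + ',' + b.y + ',' + b.z;

    for (let i = 0; i < myArray.length; i++) {
        const edge = myArray[i];
        const twinIndex = lookup.get(key(edge.tail, edge.head));
        if (twinIndex !== undefined) {
            // a stored (head, tail) matches our (tail, head): they are twins
            edge.twin = myArray[twinIndex];
            myArray[twinIndex].twin = edge;
        } else {
            lookup.set(key(edge.head, edge.tail), i);
        }
    }
}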
I don't know whether there's any kind of specific algorithm that is more efficient, but the following optimizations come to my mind immediately:
- Let j start with i + 1 - otherwise you are comparing all items twice against each other.
- Initialize a variable with myArray.length outside the loops, as the same operation is done twice.
- If the comparison is any kind of direct 'equal / larger', then it could help to sort the array first.
Update on Edit 1
I think the optimization depends on the number of expected matches. I.e., if all HalfEdge objects have a twin, then I think your current approach with the changes above is already pretty optimal.
However, if the percentage of expected twins is rather low, then I would suggest the following:
- Extract a list of all heads and a list of all tails, sort them, and compare them against each other. Remember which heads have found a twin tail.
- Then, do your original loops again, but only enter the inner loop for the heads which found a match.
Not sure this is optimal, but I hope you get my approach.
Without knowing more information about the type of items:
1) You should first sort your array; afterwards the comparison can be done forward only. Sorting costs O(n log n) and, depending on the type of your items, could lead to more improvements (such as stopping early once no further match is possible).
2) Starting the internal loop from i + 1 halves the number of comparisons, although the overall complexity remains O(n²).
const myArray = [ ...some stuff ].sort((a, b) => sortingComparison(a, b)); // sorting comparison must return a number
let currentItem;
let nextItem;

for (let i = 0; i < myArray.length; i++) {
    currentItem = myArray[i];
    for (let j = i + 1; j < myArray.length; j++) {
        nextItem = myArray[j];
        doSomeComparision(currentItem, nextItem);
    }
}
Bonus:
Here is some fancy functional code (if you are aiming for raw performance the for loops versions are faster)
function compare(value, array) {
    array.forEach((nextValue) => {
        // Do your comparison here
        // nextValue === value
    });
}

const myArray = [items];
myArray
    .sort((a, b) => (a - b))
    .forEach((v, idx) => compare(v, myArray.slice(idx + 1)));
Since values are 3D coordinates, build an octree (O(N)) and add items by their HEAD values. Then, for each edge, follow its TAIL value through the already-built octree (O(N·k·log N) overall), with each node containing a maximum of k edges, which means only k comparisons at the lowest-level node for each TAIL. Finding each TAIL may also require traveling down up to log(N) levels of the octree from top to bottom.
That is O(N) (the constant of building the octree) plus O(N·k·log N), given a low enough limit of k edges per node (and log N levels of octree).
When you follow a TAIL in the octree, any HEAD with the same value will be in the same node (with a maximum of k elements), and any "close enough" HEAD value will be inside that lowest-level node or its closest neighbors.
Are you looking for an exact HEAD == TAIL match, or is some tolerance used? Tolerance could require a "loose octree", imo.
If each edge has a defined length, and edges are symmetric both ways, then you can constrain the search radius by this value.
For up to 5k - 10k edges, there may be only 5 - 10 levels in the octree, depending on the edges-per-node limit; if this limit is picked to be around 2 - 4, then each HEAD would need only 10 - 40 operations to find its twin edge with the same TAIL value.
I came across something quite weird in my application; one loop seems to process an array of 300000 items just fine, while another loop seems to process an array like this at 1/20th of the speed.
These are my loops:
let alength = allsetcards.length; // usually +/- 300000 items in this array, loop takes +/- 3 seconds
for (let y = 0; y < alength; y++) { // this is the faster loop
    console.log("aloop: " + y);
    let i = allsetcards[y];
    setassetids.push(i.assetid); // this is basically .map but as a for loop because it's faster
}

let ilength = currentinv.length; // 300000 - 400000 items in this array, loop takes longer than 1 minute
for (let y = 0; y < ilength; y++) { // this is the terribly slow loop
    console.log("iloop " + y);
    let i = currentinv[y], // array of objects with 6 keys/values
        assetid = i.assetid; // thought this might make it somewhat faster
    if (!sets[i.game] || setassetids.indexOf(assetid) < 0) nosets.push(i); // filtering some stuff out (because .filter is even slower)
}
It seems very strange that indexOf or .push(object) would take so much longer, so I think something else that I haven't quite figured out yet might be wrong.
What I have tried to fix this is filling the nosets array with all items that are being filtered in the slow loop, then using .splice(y, 1) to filter it. This didn't help at all.
I really hope someone knows why exactly this is happening!
Thanks, Tim
SOLVED: I was using .indexOf in the second loop, causing it to do on the order of 300000² operations instead of 300000, since the whole setassetids array was scanned on every iteration. What I did to prevent this was make setassetids an object, with the keys being the assetids and the values being true, so I can use setassetids[assetid] to check whether it exists or not! (thanks to #Fefux and #t.niese)
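A sketch of that fix, assuming the same variable names as the loops above (a Set would work equally well):

// Build a constant-time lookup instead of an array.
const setassetids = {};
for (let y = 0; y < alength; y++) {
    setassetids[allsetcards[y].assetid] = true;
}

// The second loop now checks membership in O(1) instead of O(n).
for (let y = 0; y < ilength; y++) {
    const i = currentinv[y];
    if (!sets[i.game] || !setassetids[i.assetid]) nosets.push(i);
}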
A node process of mine receives a sample point every half a second, and I want to update the history chart of all the sample points I receive.
The chart should be an array which contains the downsampled history of all points from 0 to the current point.
In other words, the maximum length of the array should be l. If I received more sample points than l, I want the chart array to be a downsampled-to-l version of the whole history.
To express it with code:
const CHART_LENGTH = 2048
createChart(CHART_LENGTH)
onReceivePoint = function (p) {
    // p can be considered a number
    const chart = addPointToChart(p)
    // chart is an array representing all the samples received, from 0 to now
    console.assert(chart.length <= CHART_LENGTH)
}
I already have a working downsampling function with number arrays:
function downsample (arr, density) {
    let i, j, p, _i, _len
    const downsampled = []
    for (i = _i = 0, _len = arr.length; _i < _len; i = ++_i) {
        p = arr[i]
        j = ~~(i / arr.length * density)
        if (downsampled[j] == null) downsampled[j] = 0
        downsampled[j] += Math.abs(arr[i] * density / arr.length)
    }
    return downsampled
}
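For example, a quick check of the function above (each output bucket ends up being the average of two adjacent inputs):

downsample([1, 2, 3, 4], 2)
// [1.5, 3.5]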
One trivial way of doing this would obviously be saving all the points I receive into an array and applying the downsample function whenever the array grows. This would work, but, since this piece of code would run on a server, possibly for months and months in a row, it would eventually make the supporting array grow so much that the process would run out of memory.
The question is: is there a way to construct the chart array re-using the previous contents of the chart itself, to avoid maintaining a growing data structure? In other words, is there a constant-memory solution to this problem?
Please note that the chart must contain the whole history since sample point #0 at any moment, so charting the last n points would not be acceptable.
The only operation that does not distort the data and that can be used several times is aggregation of an integer number of adjacent samples. You probably want 2.
More specifically: if you find that adding a new sample would exceed the array bounds, do the following: start at the beginning of the array and average each pair of subsequent samples. This will halve the array size and you have space to add new samples. Doing so, you should keep track of the current cluster size c (the number of samples that constitute one entry in the array). You start with one. Every reduction doubles the cluster size.
Now the problem is that you cannot add new samples directly to the array any more because they have a completely different scale. Instead, you should average the next c samples into a new entry. It turns out that it is sufficient to store the number of samples n in the current cluster to do this. So if you add a new sample s, you would do the following:
n++
if n = 1
    append s to array
else
    // update the running average
    last array element += (s - last array element) / n
if n = c
    n = 0 // start a new cluster
So the memory that you actually need is the following:
the history array with predefined length
the number of elements in the history array
the current cluster size c
the number of elements in the current cluster n
The size of the additional memory does not depend on the total number of samples, hence O(1).
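A minimal JavaScript sketch of this scheme, using the answer's notation (c = clusterSize, n = clusterCount); the pair-averaging step and the running average are the two moving parts:

const CHART_LENGTH = 2048;
const chart = [];
let clusterSize = 1;  // c: samples aggregated into one chart entry
let clusterCount = 0; // n: samples already in the current entry

function addPointToChart(p) {
    if (clusterCount === 0 && chart.length === CHART_LENGTH) {
        // Chart is full: average adjacent pairs, halving the array.
        for (let i = 0; i < CHART_LENGTH / 2; i++) {
            chart[i] = (chart[2 * i] + chart[2 * i + 1]) / 2;
        }
        chart.length = CHART_LENGTH / 2;
        clusterSize *= 2;
    }
    clusterCount++;
    if (clusterCount === 1) {
        chart.push(p); // start a new cluster
    } else {
        // running average of the current cluster
        chart[chart.length - 1] += (p - chart[chart.length - 1]) / clusterCount;
    }
    if (clusterCount === clusterSize) clusterCount = 0;
    return chart;
}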
DISCLAIMER
I have absolutely no idea how to succinctly describe the nature of the problem I am trying to solve without going deep into context. It took me forever to even think of an appropriate title. For this reason I've found it nearly impossible to find an answer, both on here and on the web at large, that will assist me. It's possible my question can be distilled down into something simple which already has an answer on here. If this is the case, I apologise for the elaborate duplicate.
TL;DR
I have two arrays: a main array members and a destination array neighbours (technically many destination arrays, but this is the TL;DR). The main array is a property of my custom group object, which is auto-populated with custom ball objects. The destination array is a property of my custom ball object. I need to scan each element in the members array and calculate the distance between that element and every other element in the members group. If other elements exist within a set distance of the current element, then those elements need to be copied into the current element's destination array. This detection needs to happen in realtime. When two elements become close enough to be neighbours, they are added to their respective neighbours arrays. The moment they become too far apart to be considered neighbours, they need to be removed from their respective neighbours arrays.
CONTEXT
My question is primarily regarding array iteration, comparison and manipulation but to understand my exact dilemma I need to provide some context. My contextual code snippets have been made as brief as possible. I am using the Phaser library for my project, but my question is not Phaser-dependent.
I have made my own object called Ball. The object code is:
Ball = function Ball(x, y, r, id) {
    this.position = new Vector(x, y); // pseudocode Phaser replacement
    this.size = r;
    this.id = id;
    this.PERCEPTION = 100;
    this.neighbours = []; // the destination array this question is about
}
All of my Ball objects (so far) reside in a group. I have created a BallGroup object to place them in. The relevant BallGroup code is:
BallGroup = function BallGroup(n) { // create n amount of Balls
    this.members = []; // the main array I need to iterate over
    /* fill the array with n amount of balls upon group creation */
    for (i = 0; i < n; i++) {
        /* code for x, y, r, id generation not included for brevity */
        this.members.push(new Ball(_x, _y, _r, _i));
    }
}
I can create a group of 4 Ball objects with the following:
group = new BallGroup(4);
This works well and with the Phaser code I haven't included I can click/drag/move each Ball. I also have some Phaser.utils.debug.text(...) code which displays the distance between each Ball in an easy to read 4x4 table (with duplicates of course as distance Ball0->Ball3 is the same as distance Ball3->Ball0). For the text overlay I calculate the distance with a nested for loop:
for (a = 0; a < group.members.length; a++) {
    for (b = 0; b < group.members.length; b++) {
        distance = Math.floor(Math.sqrt(Math.pow(Math.abs(group.members[a].x - group.members[b].x), 2) + Math.pow(Math.abs(group.members[a].y - group.members[b].y), 2)));
        // Phaser text code
    }
}
Now to the core of my problem. Each Ball has a range of detection PERCEPTION = 100. I need to iterate over every group.members element and calculate the distance between that element (group.members[a]) and every other element within the group.members array (this calculation I can do). The problem I have is I cannot then copy those elements whose distance to group.members[a] is < PERCEPTION into the group.members[a].neighbours array.
The reason I have my main array (BallGroup.members) inside one object and my destination array inside a different object (Ball.neighbours) is because I need each Ball within a BallGroup to be aware of its own neighbours, without caring what the neighbours are for every other Ball within the BallGroup. However, I believe that the fact these two arrays (main and destination) are within different objects is why I am having so much difficulty.
But there is a catch. This detection needs to happen in realtime and when two Balls are no longer within the PERCEPTION range they must then be removed from their respective neighbours array.
EXAMPLE
group.members[0] -> no neighbours
group.members[1] -> in range of [2] and [3]
group.members[2] -> in range of [1] only
group.members[3] -> in range of [1] only
//I would then expect group.members[1].neighbours to be an array with two entries,
//and both group.members[2].neighbours and group.members[3].neighbours to each
//have the one entry. group.members[0].neighbours would be empty
I drag group.members[2] and group.members[3] away to a corner by themselves
group.members[0] -> no neighbours
group.members[1] -> no neighbours
group.members[2] -> in range of [3] only
group.members[3] -> in range of [2] only
//I would then expect group.members[2].neighbours and group.members[3].neighbours
//to be arrays with one entry. group.members[1] would change to have zero entries
WHAT I'VE TRIED
I've tried enough things to confuse any person, which is why I'm coming here for help. I first tried complex nested for loops and if/else statements. This resulted in neighbours being infinitely added and started to become too complex for me to keep track of.
I looked into Array.forEach and Array.filter. I couldn't figure out if forEach could be used for what I needed and I got very excited learning about what filter does (return an array of elements that match a condition). When using Array.filter it either gives the Ball object zero neighbours or includes every other Ball as a neighbour regardless of distance (I can't figure out why it does what it does, but it definitely isn't what I need it to do). At the time of writing this question my current code for detecting neighbours is this:
BallGroup = function BallGroup(n) {
    this.members = []; // the main array I need to iterate over
    // other BallGroup code here

    this.step = function step() { // this function will run once per frame
        for (a = 0; a < this.members.length; a++) { // members[a] to be current element
            for (b = 0; b < this.members.length; b++) { // members[b] to be all other elements
                if (a != b) { // make sure the same element isn't being compared against itself
                    var distance = Math.sqrt(Math.pow(Math.abs(this.members[a].x - this.members[b].x), 2) + Math.pow(Math.abs(this.members[a].y - this.members[b].y), 2));

                    function getNeighbour(element, index, array) {
                        if (distance < element.PERCEPTION) {
                            return true;
                        }
                    }

                    this.members[a].neighbours = this.members.filter(getNeighbour);
                }
            }
        }
    }
}
I hope my problem makes sense and is explained well enough. I know exactly what I need to do in the context of my own project, but putting that into words for others to understand who have no idea about my project has been a challenge. I'm learning Javascript as I go and have been doing great so far, but this particular situation has me utterly lost. I'm in too deep, but I don't want to give up - I want to learn!
Many, many, many thanks for those who took the time read my very long post and tried provide some insight.
edit: changed a > to a <
I was learning more about object literals; I'm trying to learn JS to wean myself off of my jQuery dependency. I'm making a simple library, and I made a function that adds the properties of one object to another object. It's untested, but I think if you were to apply something similar it might help. I'll try to find my resources. Btw, I don't have the articles on hand right now, but I recall that using new could incur complications; sorry I can't go any further than that, I'll post more info as I find it.
xObject could be the ball group
Obj2 could be the members
Obj1 could be the destination
/* augment(Obj1, Obj2) | Adds properties of Obj2 to Obj1. */

// xObject has augment() as a method called aug
var xObject = {
    aug: augment
};

/* Immediately-Invoked Function Expression (IIFE) */
(function () {
    var Obj1 = {},
        Obj2 = {
            bool: true,
            num: 3,
            str: "text"
        };
    xObject.aug(Obj1, Obj2);
}()); // invoke immediately

function augment(Obj1, Obj2) {
    var prop;
    for (prop in Obj2) {
        if (Obj2.hasOwnProperty(prop) && !Obj1[prop]) {
            Obj1[prop] = Obj2[prop];
        }
    }
}
I really need a master of algorithms here! So the thing is I have, for example, an array like this:
[
    [870, 23],
    [970, 78],
    [110, 50]
]
and I want to split it up, so that it looks like this:
// first array
[
    [970, 78]
]

// second array
[
    [870, 23],
    [110, 50]
]
So now, why do I want it to look like this?
Because I want to keep the sums of the sub-values as equal as possible. So 970 is about 870 + 110, and 78 is about 23 + 50.
In this case it's very easy, because if you just split them and only look at the first sub-value it will already be correct. But I want to check both and keep them as equal as possible, so that it'll also work with an array that has 100 sub-arrays! If anyone can tell me an algorithm with which I can program this, it would be really great!
Scales:
~1000 elements (sublists) in the array
Elements are integers up to 10^9
I am looking for a "close enough solution" - it does not have to be the exact optimal solution.
First, as already established, the problem is NP-Hard, via a reduction from the Partition Problem.
Reduction:
Given an instance of partition problem, create lists of size 1 each. The result will be this problem exactly.
Conclusion from the above:
This problem is NP-Hard, and there is no known polynomial solution.
Second, any exponential or pseudo-polynomial solution will take just too long to work, due to the scale of the problem.
Third, that leaves us with heuristics and approximation algorithms.
I suggest the following approach:
Normalize the scales of the sublists, so all the elements are on the same scale (say, all normalized to the range [-1, 1], or all normalized to the standard normal distribution).
Create a new list in which each element is the sum of the matching sublist in the normalized list.
Use some approximation or heuristical solution that was developed for the subset-sum / partition problem.
The result will not be optimal, but optimal is really unattainable here.
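One simple heuristic in that family is greedy number partitioning: sort the sublists by their (here unnormalized) sums in descending order and always assign the next one to the set with the smaller running total. A sketch, assuming the input is an array of [a, b] pairs as in the question:

function greedySplit(pairs) {
    // Score each pair by the sum of its components.
    const indexed = pairs
        .map((p) => ({ pair: p, sum: p[0] + p[1] }))
        .sort((a, b) => b.sum - a.sum);

    const left = [], right = [];
    let leftSum = 0, rightSum = 0;

    for (const item of indexed) {
        // Always feed the lighter side.
        if (leftSum <= rightSum) {
            left.push(item.pair);
            leftSum += item.sum;
        } else {
            right.push(item.pair);
            rightSum += item.sum;
        }
    }
    return [left, right];
}

greedySplit([[870, 23], [970, 78], [110, 50]]);
// [[[970, 78]], [[870, 23], [110, 50]]] - the split the question asks for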
From what I gather from the discussion under the original post, you're not searching for a single splitting point, but rather you want to distribute all pairs among two sets, such that the sums in each of the two sets are approximately equal.
Since a close enough solution is acceptable, maybe you could try an approach based on simulated annealing?
(see http://en.wikipedia.org/wiki/Simulated_annealing)
In short, the idea is that you start out by randomly assigning each pair to either the Left or the Right set.
Next, you generate a new state by either:
a) moving a randomly selected pair from the Left to the Right set,
b) moving a randomly selected pair from the Right to the Left set, or
c) doing both.
Next, determine if this new state is better or worse than the current state. If it is better, use it.
If it is worse, take it only if it is accepted by the acceptance probability function, which is a function that initially allows worse states to be used, but favours them less and less as time moves on (or as the "temperature decreases", in SA terms).
After a large number of iterations (say 100,000), you should have a pretty good result.
Optionally, rerun this algorithm multiple times, because it may get stuck in local optima (although the acceptance probability function attempts to counter this).
Advantages of this approach are that it's simple to implement, and you can decide for yourself how long you want it to continue searching for a better solution.
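A compact sketch of that loop, assuming pairs like [[870, 23], ...] and defining the energy of a state as how unequal the two sets' component-wise sums are; the cooling schedule and the temperature scale are arbitrary choices here:

function annealSplit(pairs, iterations = 100000) {
    // assignment[i] is true if pairs[i] is in the Left set.
    const assignment = pairs.map(() => Math.random() < 0.5);

    // Energy: sum of absolute differences between the two sets, per component.
    function energy() {
        let d0 = 0, d1 = 0;
        pairs.forEach((p, i) => {
            const sign = assignment[i] ? 1 : -1;
            d0 += sign * p[0];
            d1 += sign * p[1];
        });
        return Math.abs(d0) + Math.abs(d1);
    }

    let current = energy();
    for (let step = 0; step < iterations; step++) {
        const temperature = 1000 * (1 - step / iterations) + 1e-9; // arbitrary linear cooling
        const i = Math.floor(Math.random() * pairs.length);
        assignment[i] = !assignment[i]; // move one random pair to the other set
        const next = energy();
        // Always accept improvements; accept worse states with decreasing probability.
        if (next <= current || Math.random() < Math.exp((current - next) / temperature)) {
            current = next;
        } else {
            assignment[i] = !assignment[i]; // undo the move
        }
    }
    return assignment; // true = Left set, false = Right set
}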
I'm assuming that we're just looking for a place in the middle of the array to split it into its first and second part.
It seems like a linear algorithm could do this. Something like this in JavaScript.
arrayLength = 2;
tolerance = 10;

// Initialize the two sums.
firstSum = [];
secondSum = [];
for (j = 0; j < arrayLength; j++)
{
    firstSum[j] = 0;
    secondSum[j] = 0;
    for (i = 0; i < arrays.length; i++)
    {
        secondSum[j] += arrays[i][j];
    }
}

// Try splitting at every place in "arrays".
// Try to get the sums as close as possible.
for (i = 0; i < arrays.length; i++)
{
    goodEnough = true;
    for (j = 0; j < arrayLength; j++)
    {
        if (Math.abs(firstSum[j] - secondSum[j]) > tolerance)
            goodEnough = false;
    }
    if (goodEnough)
    {
        alert("split before index " + i);
        break;
    }
    // Update the sums for the new position.
    for (j = 0; j < arrayLength; j++)
    {
        firstSum[j] += arrays[i][j];
        secondSum[j] -= arrays[i][j];
    }
}
Thanks for all the answers; the brute-force idea was good, and NP-hardness is related to this too, but it turns out that this is a multiple knapsack problem and can be solved using this pdf document.