Subset of permutations in JavaScript with up to ~6 duplicates per element - javascript

I have tried to use this answer to find all permutations of size K=6 of an array of strings, but the array I'm permuting is way too large (~13,000 elements, though I can guarantee most of those will be duplicates), which means I'm getting:
....
re-permuting with 6925/12972
node:internal/console/constructor:257
value: function(streamSymbol, string) {
^
RangeError: Maximum call stack size exceeded
at console.value (node:internal/console/constructor:257:20)
at console.log (node:internal/console/constructor:359:26)
at permute (/home/path/to/code/permutation.js:22:17)
at permute (/home/path/to/code/permutation.js:23:9)
.....
re-permuting with 6924/12972
....
re-permuting with 6918/12972
And then it dies. I guess the recursion is the problem.
I know that there are at most ~300 unique elements in my input (which is how I know that many of the array elements must be duplicates), but I don't know if that's 10,000 instances of one element with the rest individually unique, or at least K of each unique element. Because of that I can't just feed them into a Set, and there may be fewer than K of one element, so I can't just create a new input by duplicating the unique ones K times.
Here is my slightly modified (for readability and logging only) version of the code from the above linked answer (on second look, this algorithm is far from optimal):
let permArr = [];
let usedChars = [];
let originalLength;
/**
* Subsets of permutations of an array
* @param {any[]} input array of elements to permute
* @param {number} subsetLength size of the subsets to return
* @returns {Array[]} an array of arrays
*/
function permute(input, subsetLength = input.length) {
let index
let ch;
originalLength ??= input.length;
for (index = 0; index < input.length; index++) {
ch = input.splice(index, 1)[0];
usedChars.push(ch);
if (input.length == 0) {
let toAdd = usedChars.slice(0, subsetLength);
// resizing the returned array to size k
if (!permArr.includes(toAdd)) permArr.push(toAdd);
}
console.log(`re-permuting with ${input.length}/${originalLength}`)
permute(input, subsetLength);
input.splice(index, 0, ch);
usedChars.pop();
}
return permArr
};
And I found this answer but I do not follow it at all, and this other answer is similar but still uses recursion.
How can I make this without recursion/more performant so it can handle much larger arrays? I'm using NodeJs, and I'm not averse to using different data types.

I don't know if that's 10,000 instances of one element, and then the rest are individually unique elements or at least K of each unique element. Because of that I can't just feed them into a Set, and there may be fewer than K of one element so I can't just create a new input by duplicating the new ones K times
So just group and count them. Seems simple enough:
function subsetPermutations(arr, size) {
const counts = {};
for (const el of arr) {
counts[el] = (counts[el] ?? 0) + 1;
}
const unique = Object.keys(counts);
const result = Array.from({length: size});
function* recurse(depth) {
if (depth == size) {
yield result;
} else {
for (const el of unique) {
if (counts[el]) {
result[depth] = el;
counts[el]--;
yield* recurse(depth+1)
counts[el]++;
}
}
}
}
return recurse(0);
}
for (const perm of subsetPermutations(["a", "b", "b", "c", "c", "c"], 3)) {
console.log(perm.join('-'));
}
I have tried to this answer to find all permutations of size K=6, but the array I'm permuting is way too large (~13,000 elements), however I can guarantee I know that there's at most ~300 unique
That's still roughly 300^6 (about 7.3 × 10^14) permutations, which is far too many to put into an array. The function above is designed as an iterator so that you can work on the current result in a loop before it gets mutated in the next iteration, which avoids any allocation overhead, but it will still take too long to generate all of them.
How can I make this without recursion/more performant so it can handle much larger arrays? I'm using NodeJs, and I'm not averse to using different data types.
You can use a Map instead of the object for counts, but I doubt it'll be much faster for just 300 different elements.
Avoiding recursion is unnecessary since it only goes 6 levels deep; there won't be a stack overflow, unlike with your inefficient solution. But for performance, you might still try this approach of dynamically generating nested loops.
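As a rough sketch of the Map variant mentioned above (the name subsetPermutationsMap is made up for this illustration): a Map keeps the elements in their original type, whereas Object.keys turns every key into a string. Because the yielded array is reused between iterations, it has to be copied if you want to keep it.

```javascript
// Sketch only: same counting idea as above, with a Map instead of an object.
function* subsetPermutationsMap(arr, size) {
  // Group and count the input elements.
  const counts = new Map();
  for (const el of arr) counts.set(el, (counts.get(el) ?? 0) + 1);
  const unique = [...counts.keys()];
  const result = new Array(size); // reused (mutated) for every yielded permutation

  function* recurse(depth) {
    if (depth === size) {
      yield result;
      return;
    }
    for (const el of unique) {
      const left = counts.get(el);
      if (left > 0) {
        result[depth] = el;
        counts.set(el, left - 1); // use up one instance of el
        yield* recurse(depth + 1);
        counts.set(el, left); // give it back
      }
    }
  }
  yield* recurse(0);
}

const perms = [];
for (const perm of subsetPermutationsMap([1, 2, 2], 2)) {
  perms.push([...perm]); // copy, because perm is mutated on the next iteration
}
console.log(perms); // [ [ 1, 2 ], [ 2, 1 ], [ 2, 2 ] ]
```

Whether this is measurably faster than the plain object depends on the engine and the data, so it is worth timing both.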

Related

How to map 256 unique strings to the integers (0..255) with a memory-free function

Say I have strings like foo, bar, baz, hello, world, etc. up to 256 unique strings, so not very many. It could just as easily be 200 strings or 32 strings for all intents and purposes. Hopefully the solution could handle arbitrarily sized sets.
So you take that string and somehow map it to an integer 0-255. Without just doing this:
strings[currentString] = ID++
// strings['foo'] = 0
// strings['bar'] = 1
// strings['baz'] = 2
// ...
which would depend on the order they are inserted. Ideally they would be generated uniquely perhaps from a hash of the individual characters or bytes somehow, I'm not sure. But it would be a function without memory that takes an arbitrary string from a set of known size and maps it to an integer, so more like:
// strings['foo'] = 6 + 15 + 15 = 36
// strings['bar'] = 2 + 1 + 16 = 19
// ...
Although that wouldn't work because of collisions. I'm not sure how to go about designing a hash function like this. So somehow something else would work where there are never collisions to worry about.
function hash(string, size) {
// return unique integer within size
}
hash('foo', 256) // something like 123
hash('bar', 256) // something like 101
hash('foo', 100) // something else like 50
hash('bar', 100) // something else like 25
I would be interested to know too generally how to go about creating such a function, because it seems very difficult, but not strictly necessary for the question.
Also, looking to do this with basic JavaScript, not any special helper methods or browser-specific stuff.
The set of possible strings is known in advance.
I don't believe what you're looking for is possible unless you know what all 256 strings are ahead of time. Roughly, here's a proof of this:
Suppose there exists some f : Σ^* → [0, 255] (note: Σ^* means all finite-length strings) s.t. for every 256-element subset S ⊆ Σ^* and all s_1, s_2 ∈ S, f(s_1) = f(s_2) <=> s_1 = s_2. Since f must not hold any memory of inputs it has seen, it must deterministically map each string to the same number in [0, 255], regardless of which subset it is in.
However, by the Pigeonhole Principle, since there are more than 256 strings, we must have at least two strings that map to the same value in [0, 255]. In particular, this means that if we take a subset S that contains both strings, the above property for f is violated, a contradiction. Thus, f cannot exist.
If you are allowed to know which 256 strings to hash, this is definitely possible. In general, what you're looking for is a perfect hash function.
This link provides an algorithm: https://www.cs.cmu.edu/~avrim/451f11/lectures/lect1004.pdf (refer to pages 56-57)
Quoting:
Method 1: an O(N^2)-space solution
Say we are willing to have a table whose size is quadratic in the size
N of our dictionary S. Then, here is an easy method for
constructing a perfect hash function. Let H be universal and
M=N^2. Then just pick a random h from H and try it out! The
claim is there is at least a 50% chance it will have no collisions.
Method 2: an O(N)-space solution
We will first hash into a table of size N using universal hashing.
This will produce some collisions (unless we are extraordinarily
lucky). However, we will then rehash each bin using Method 1, squaring
the size of the bin to get zero collisions. So, the way to think of
this scheme is that we have a first-level hash function h and
first-level table A, and then N second-level hash functions
h_1, ..., h_N and N second-level tables A_1, ..., A_N. To
lookup an element x, we first compute i=h(x) and then find the
element in A_i[h_i(x)].
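A minimal sketch of Method 1 from the quote, under some loudly stated assumptions: the hash family used here (a polynomial string hash with a random multiplier modulo the prime 1000003) is only an illustrative stand-in for a properly universal family, and the names makeHash and buildPerfectHash are invented for this example. Note that the result lands in [0, N^2), not [0, N), which is the price Method 1 pays for its simplicity.

```javascript
// Illustrative prime modulus; small enough that h * a stays an exact integer.
const P = 1000003;

// One member of the (assumed) hash family, selected by the multiplier a.
function makeHash(a, M) {
  return (s) => {
    let h = 1;
    for (let i = 0; i < s.length; i++) {
      h = (h * a + s.charCodeAt(i)) % P;
    }
    return h % M;
  };
}

// Method 1: table size M = N^2, keep drawing random functions until one
// has no collisions on the known (distinct) keys.
function buildPerfectHash(keys, maxTries = 1000) {
  const M = keys.length * keys.length;
  for (let attempt = 0; attempt < maxTries; attempt++) {
    const a = 1 + Math.floor(Math.random() * (P - 1));
    const candidate = makeHash(a, M);
    if (new Set(keys.map(candidate)).size === keys.length) return candidate;
  }
  throw new Error("no collision-free function found");
}

const knownStrings = ["foo", "bar", "baz", "hello", "world"];
const perfect = buildPerfectHash(knownStrings);
for (const s of knownStrings) console.log(s, perfect(s));
```

The chosen multiplier has to be stored alongside the table, so the function is only "memory-free" in the sense that it no longer depends on insertion order.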
Without just doing this: […] which would depend on the order they are inserted.
The set of possible strings is known in advance.
If you're fine with requiring the strings to be known upfront, but you just don't like the arbitrariness of using the order in which they happen to have been inserted, then one simple approach is to gather the strings into an array, sort that array (to get a deterministic ordering), and then use the resulting order:
var stringArray = [];
stringArray.push('foo');
stringArray.push('bar');
stringArray.push('baz');
// ...
stringArray = stringArray.sort();
var strings = {};
for (var i = 0; i < stringArray.length; ++i) {
strings[stringArray[i]] = i;
}
Here is a sketch of an idea that could have good results for a variety of inputs. The code below assumes lowercase English letters, no spaces, and only allows for up to 9 duplicates of any letter.
The idea is that any permutation of length n can be mapped to the integers modulo n by detecting how many times the permutation must be applied to itself before transforming into the identity permutation. Its "power" if you will. The catch is that any permutations with the same permutation cycles (the unordered integer partition that describes them), will result in the same "power", which we are using as the final hash.
To generate the permutation, each letter is assigned to one of nine buckets of 26, depending on if it's a duplicate, and pushed to an array, followed by the missing indexes from 0 to 255.
Like many hash functions, this can result in collisions (which could possibly be ameliorated through a few flags set in the function based on input analysis, although I have yet to consider that more carefully).
function seq(n){
return [...Array(n)].map((_,i) => i);
}
function permute(p1, p){
return p1.map(x => p[x]);
}
function areEqual(p1, p){
for (let i=0; i<p.length; i++)
if (p1[i] != p[i])
return false;
return true;
}
function findPower(p1){
let count = 0;
const p = seq(p1.length);
let p2 = p1.slice();
for (let i=0; i<p.length; i++){
if (!areEqual(p, p2)){
p2 = permute(p2, p1);
count++;
} else {
return count;
}
}
return count;
}
// Returns the permutation based on
// the string, s
function hash(s){
// Each letter is in one of
// 9 buckets of 26, depending
// on if it's a duplicate.
let fs = new Array(26).fill(0);
let result = [];
for (let i=0; i<s.length; i++){
let k = s.charCodeAt(i) - 97;
result.push(26 * fs[k] + k);
fs[k]++;
}
const set = new Set(result);
for (let i=0; i<256; i++)
if (!set.has(i))
result.push(i);
return result;
}
function h(s){
return findPower(hash(s));
}
var strings = [
'foo',
'bar',
'baz',
'hello',
'world',
'etc'];
for (let s of strings)
console.log(`${ s }: ${ h(s) }`);

Algorithm to merge multiple sorted sequences into one sorted sequence in javascript

I am looking for an algorithm to merge multiple sorted sequences, let's say X sorted sequences with n elements each, into one sorted sequence in JavaScript. Can you provide some examples?
Note: I do not want to use any library.
Trying to solve https://icpc.kattis.com/problems/stacking
What is the minimal number of operations needed to merge sorted arrays, under these conditions:
Split: a single stack can be split into two stacks by lifting any top portion of the stack and putting it aside to form a new stack.
Join: two stacks can be joined by putting one on top of the other. This is allowed only if the bottom plate of the top stack is no larger than the top plate of the bottom stack, that is, the joined stack has to be properly ordered.
History
This problem has been solved for more than a century, going back to Herman Hollerith and punchcards. Huge sets of punchcards, such as those resulting from a census, were sorted by dividing them into batches, sorting each batch, and then merging the sorted batches: the so-called "merge sort". Those tape drives you see spinning in 1950s sci-fi movies were most likely merging multiple sorted tapes onto one.
Algorithm
All the algorithms you need can be found at https://en.wikipedia.org/wiki/Merge_algorithm. Writing this in JS is straightforward. More information is available in the question Algorithm for N-way merge. See also this question, which is an almost exact duplicate, although I'm not sure any of the answers are very good.
The naive concat-and-resort approach does not even qualify as an answer to the problem. The somewhat naive take-the-next-minimum-value-from-any-input approach is much better, but not optimal, because it takes more time than necessary to find the next input to take a value from. That is why the best solution uses something called a "min-heap" or a "priority queue".
Simple JS solution
Here's a real simple version, which I make no claim to be optimized, other than in the sense of being able to see what it is doing:
const data = [[1, 3, 5], [2, 4]];
// Merge an array of pre-sorted arrays, based on the given sort criteria.
function merge(arrays, sortFunc) {
let result = [], next;
// Add an 'index' property to each array to keep track of where we are in it.
arrays.forEach(array => array.index = 0);
// Find the next array to pull from.
// Just sort the list of arrays by their current value and take the first one.
function findNext() {
return arrays.filter(array => array.index < array.length)
.sort((a, b) => sortFunc(a[a.index], b[b.index]))[0];
}
// This is the heart of the algorithm.
while (next = findNext()) result.push(next[next.index++]);
return result;
}
function arithAscending(a, b) { return a - b; }
console.log(merge(data, arithAscending));
The above code maintains an index property on each input array to remember where we are. The simplistic alternative would be to shift the element from the front of each array when it is its turn to be merged, but that would be rather inefficient.
Optimizing finding the next array to pull from
This naive implementation of findNext, to find the array to pull the next value from, simply sorts the list of inputs by their current element and takes the first array in the result. You can optimize this by using a "min-heap" to manage the arrays in sorted order, which removes the need to re-sort them each time. A min-heap is a tree in which each node holds a value that is no greater than any value below it, so the minimum is always at the root. You can find information on a JS implementation of a min-heap here.
A generator solution
It might be slightly cleaner to write this as a generator which takes a list of iterables as inputs, which includes arrays.
// Test data.
const data = [[1, 3, 5], [2, 4]];
// Merge an array of pre-sorted iterables, based on the given sort criteria.
function* merge(iterables, sortFunc) {
let next;
// Create iterators, with "result" property to hold most recent result.
const iterators = iterables.map(iterable => {
const iterator = iterable[Symbol.iterator]();
iterator.result = iterator.next();
return iterator;
});
// Find the next iterator whose value to use.
function findNext() {
return iterators
.filter(iterator => !iterator.result.done)
.reduce((ret, cur) => !ret || sortFunc(cur.result.value, ret.result.value) < 0 ? cur : ret,
null);
}
// This is the heart of the algorithm.
while (next = findNext()) {
yield next.result.value;
next.result = next.next();
}
}
function arithAscending(a, b) { return a - b; }
console.log(Array.from(merge(data, arithAscending)));
The naive approach is to concatenate all k sequences and sort the result. But if each sequence has n elements, the cost will be O(k*n*log(k*n)). Too much!
Instead, you can use a priority queue or heap. Like this (MinPriorityQueue is not built into JavaScript; it stands in for whatever heap implementation you use):
var sorted = [];
var pq = new MinPriorityQueue(function(a, b) {
return a.number < b.number;
});
var indices = new Array(k).fill(0);
for (var i=0; i<k; ++i) if (sequences[i].length > 0) {
pq.insert({number: sequences[i][0], sequence: i});
}
while (!pq.empty()) {
var min = pq.findAndDeleteMin();
sorted.push(min.number);
++indices[min.sequence];
if (indices[min.sequence] < sequences[min.sequence].length) pq.insert({
number: sequences[min.sequence][indices[min.sequence]],
sequence: min.sequence
});
}
The priority queue only contains at most k elements simultaneously, one for each sequence. You keep extracting the minimum one, and inserting the following element in that sequence.
With this, the cost will be:
k*n insertions to a heap of k elements: O(k*n)
k*n deletions in a heap of k elements: O(k*n*log(k))
Various constant operations for each number: O(k*n)
So only O(k*n*log(k))
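Since JavaScript has no built-in priority queue, here is one self-contained way the heap-based merge could be sketched. The MinHeap class and mergeSorted function are written for this illustration only; they are not the MinPriorityQueue used above, and they assume the values can be compared with <.

```javascript
// Minimal binary min-heap over objects with a `value` property.
class MinHeap {
  constructor() { this.items = []; }
  get size() { return this.items.length; }
  push(item) {
    const a = this.items;
    a.push(item);
    let i = a.length - 1;
    while (i > 0) { // sift up
      const parent = (i - 1) >> 1;
      if (a[parent].value <= a[i].value) break;
      [a[parent], a[i]] = [a[i], a[parent]];
      i = parent;
    }
  }
  pop() {
    const a = this.items;
    const top = a[0];
    const last = a.pop();
    if (a.length > 0) {
      a[0] = last;
      let i = 0;
      for (;;) { // sift down
        const l = 2 * i + 1, r = l + 1;
        let m = i;
        if (l < a.length && a[l].value < a[m].value) m = l;
        if (r < a.length && a[r].value < a[m].value) m = r;
        if (m === i) break;
        [a[m], a[i]] = [a[i], a[m]];
        i = m;
      }
    }
    return top;
  }
}

// k-way merge: the heap holds at most one entry per input sequence.
function mergeSorted(sequences) {
  const heap = new MinHeap();
  sequences.forEach((seq, s) => {
    if (seq.length > 0) heap.push({ value: seq[0], seq: s, pos: 0 });
  });
  const out = [];
  while (heap.size > 0) {
    const { value, seq, pos } = heap.pop();
    out.push(value);
    if (pos + 1 < sequences[seq].length) {
      heap.push({ value: sequences[seq][pos + 1], seq, pos: pos + 1 });
    }
  }
  return out;
}

console.log(mergeSorted([[1, 3, 5], [2, 4], []])); // [ 1, 2, 3, 4, 5 ]
```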
Just add them into one big array and sort it.
You could use a heap, add the first element of each sequence to it, pop the lowest one (that's your first merged element), add the next element from the sequence of the popped element and continue until all sequences are over.
It's much easier to just add them into one big array and sort it, though.
This is a simple javascript algo I came up with. Hope it helps. It will take any number of sorted arrays and do a merge. I am maintaining an array for index of positions of the arrays. It basically iterates through the index positions of each array and checks which one is the minimum. Based on that it picks up the min and inserts into the merged array. Thereafter it increments the position index for that particular array. I feel the time complexity can be improved. Will post back if I come up with a better algo, possibly using a min heap.
function merge() {
var mergedArr = [],pos = [], finished = 0;
for(var i=0; i<arguments.length; i++) {
pos[i] = 0;
}
while(finished != arguments.length) {
var min = null, selected;
for(var i=0; i<arguments.length; i++) {
if(pos[i] != arguments[i].length) {
if(min == null || min > arguments[i][pos[i]]) {
min = arguments[i][pos[i]];
selected = i;
}
}
}
mergedArr.push(arguments[selected][pos[selected]]);
pos[selected]++;
if(pos[selected] == arguments[selected].length) {
finished++;
}
}
return mergedArr;
}
This is a beautiful question. Instead of concatenating the arrays and applying .sort(), a simple approach with .reduce() merges them one after another, each merge taking time linear in the lengths of its two inputs. With m arrays of average length n, the accumulated result grows with every merge, so the total is O(m^2·n) in the worst case rather than O(m·n).
We will handle the arrays one by one. First we will merge the first two arrays and then we will merge the result with the third array and so on.
function mergeSortedArrays(a){
return a.reduce(function(p,c){
var pc = 0,
cc = 0,
len = p.length < c.length ? p.length : c.length,
res = [];
while (p[pc] !== undefined && c[cc] !== undefined) p[pc] < c[cc] ? res.push(p[pc++])
: res.push(c[cc++]);
return p[pc] === undefined ? res.concat(c.slice(cc))
: res.concat(p.slice(pc));
});
}
var sortedArrays = Array(5).fill().map(_ => Array(~~(Math.random()*5)+5).fill().map(_ => ~~(Math.random()*20)).sort((a,b) => a-b));
sortedComposite = mergeSortedArrays(sortedArrays);
sortedArrays.forEach(a => console.log(JSON.stringify(a)));
console.log(JSON.stringify(sortedComposite));
OK, as per @Mirko Vukušić's comparison of this algorithm with .concat() and .sort(), this algorithm is still the fastest solution in FF but not in Chrome. Chrome's .sort() is actually very fast and I cannot be sure about its time complexity. I just needed to tune it up a little for JS performance without touching the essence of the algorithm at all. So now it seems to be faster than FF's concat and sort.
function mergeSortedArrays(a){
return a.reduce(function(p,c){
var pc = 0,
pl =p.length,
cc = 0,
cl = c.length,
res = [];
while (pc < pl && cc < cl) p[pc] < c[cc] ? res.push(p[pc++])
: res.push(c[cc++]);
if (cc < cl) while (cc < cl) res.push(c[cc++]);
else while (pc < pl) res.push(p[pc++]);
return res;
});
}
function concatAndSort(a){
return a.reduce((p,c) => p.concat(c))
.sort((a,b) => a-b);
}
var sortedArrays = Array(5000).fill().map(_ => Array(~~(Math.random()*5)+5).fill().map(_ => ~~(Math.random()*20)).sort((a,b) => a-b));
console.time("merge");
mergeSorted = mergeSortedArrays(sortedArrays);
console.timeEnd("merge");
console.time("concat");
concatSorted = concatAndSort(sortedArrays);
console.timeEnd("concat");
5000 random sorted arrays of random lengths between 5-10.
ES6 syntax:
function mergeAndSort(arrays) {
return [].concat(...arrays).sort()
}
The function receives an array of arrays to merge and sort.
*EDIT: as caught by @Redu, the above code is incorrect. If no sorting function is provided, the default sort() compares elements as Unicode strings. The fixed (and slower) code is:
function mergeAndSort(arrays) {
return [].concat(...arrays).sort((a,b)=>a-b)
}

Array slow elements removal

I have an array and I want to remove N elements from its head.
Let's say the array (of floats) has 1M elements and I want the first 500K gone. I have two ways: call shift 500K times in a loop, or call splice(0,500000).
The thing is, the first solution is a horrible idea (it's very, very slow). The second is slow too, because splice returns the removed part of the array in a new array (well, it just allocates 500K floats and throws them out of the window).
In my app, I'm doing some things with really big matrices, and unfortunately, removing elements via splice is too slow for me. Is there a faster way to achieve this?
I would expect that Array#slice would be at least as fast as either of those options and probably faster. It does mean temporarily allocating duplicated memory, but 1M numbers is only about 8MB of memory (8 bytes per double, assuming the JavaScript engine has been able to use a true array under the covers), so temporarily having the original 8MB plus the 4MB for the ones you want to keep before releasing the original 8MB seems fairly cheap:
array = array.slice(500000);
This also has the advantage that it won't force the JavaScript engine into using an object rather than an array under the covers. (Other things you're doing may cause that, but...)
You've said you're doing this with floats, you might look at using Float64Array rather than untyped arrays. That limits the operations you can perform, but ensures that you don't end up with unoptimized arrays. When you delete entries from arrays, you can end up with unoptimized arrays with markedly slower access times than optimized arrays, as they end up being objects with named properties rather than offset accesses. (A good JavaScript engine will keep them optimized if it can; using typed arrays would help prevent you from blowing its optimizations.)
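If a Float64Array does fit your use case, one further option (not part of the measurements in this answer, so treat it as an untested suggestion) is subarray, which returns a view on the same underlying buffer instead of copying. The helper name dropHead is made up for this sketch.

```javascript
// Sketch: "remove" the first n elements of a typed array without copying.
// subarray() creates a new view over the same ArrayBuffer.
function dropHead(typed, n) {
  return typed.subarray(n);
}

const data = new Float64Array(1000000);
for (let i = 0; i < data.length; i++) data[i] = i;

const tail = dropHead(data, 500000);
console.log(tail.length, tail[0]); // 500000 500000

// The view shares memory with the original array:
tail[0] = -1;
console.log(data[500000]); // -1
```

The trade-off is that the whole original buffer stays alive for as long as any view on it does, and writes through the view are visible in the original.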
This (dashed off and quite certainly flawed) NodeJS test suggests that splice is anywhere from 60% to 95% slower than slice, and that V8 does a great job keeping the array optimized as the result for the typed array is virtually identical to the result for the untyped array in the slice case:
"use strict";
let sliceStats = createStats();
let sliceTypedStats = createStats();
let spliceStats = createStats();
for (let c = 0; c < 100; ++c) {
if (test(buildUntyped, sliceStats, testSlice).length != 500000) throw new Error("1");
if (test(buildTyped, sliceTypedStats, testSlice).length != 500000) throw new Error("2");
if (test(buildUntyped, spliceStats, testSplice).length != 500000) throw new Error("3");
console.log(c);
}
console.log("slice ", avg(sliceStats.sum, sliceStats.count));
console.log("sliceTyped", avg(sliceTypedStats.sum, sliceTypedStats.count));
console.log("splice ", avg(spliceStats.sum, spliceStats.count));
function avg(sum, count) {
return (sum / count).toFixed(3);
}
function createStats() {
return {
count: 0,
sum: 0
};
}
function buildUntyped() {
let a = [];
for (let n = 0; n < 1000000; ++n) {
a[n] = Math.random();
}
return a;
}
function buildTyped() {
let a = new Float64Array(1000000);
for (let n = 0; n < 1000000; ++n) {
a[n] = Math.random();
}
return a;
}
function test(build, stats, f) {
let a;
let ignore = 0;
let start = Date.now();
for (let i = 0; i < 10; ++i) {
let s = Date.now();
a = build();
ignore += Date.now() - s;
a = f(a);
}
stats.sum += Date.now() - start - ignore;
++stats.count;
return a;
}
function testSlice(a) {
return a.slice(500000);
}
function testSplice(a) {
a.splice(0, 500000);
return a;
}
Immutable.js solves this problem by structural sharing. It does not copy the entries as splice would do but returns a reference to the included parts of the array. You would need to move your array to the Immutable.js data structure and then call the immutable operation splice.

How to find all combinations of elements in JavaScript array

I have the following array:
[[A,1,X],[B,2,Y],[C,3,Z]]
I want to be able to get all combinations of the first index of each sub array and then loop through those combinations performing a single task on each. So these are the combinations I'm after (Note I need the combination of the same value as well):
[[A,A],[A,B],[A,C],[B,A],[B,B],[B,C],[C,A],[C,B],[C,C]]
I'd then loop through that and do something with each of the values.
I'm not sure where even to start here so any advice or pointers would be really helpful!
You need to effectively loop through the array twice. Based on what you want you can just statically access the first element each time:
var arr = [['A',1,'X'],['B',2,'Y'],['C',3,'Z']];
var newArr = [];
var length = arr.length;
var curr;
for (var i = 0; i < length; i++) {
curr = arr[i][0];
for (var j = 0; j < length; j++) {
newArr.push([curr, arr[j][0]]);
}
}
console.log(newArr);
Try this:
var data = [['A',1,'X'],['B',2,'Y'],['C',3,'Z']];
function getCombinations(data) {
var combinations = [];
data.forEach(function(first) {
data.forEach(function(second) {
combinations.push([first[0], second[0]]);
});
});
return combinations;
}
console.log(getCombinations(data));
Let's decompose the problem. First, let's get extracting the first element of each subarray out of the way:
function get_elts(data, idx) {
return data.map(function(v) { return v[idx]; });
}
So
> get_elts(data, 0) // ['A', 'B', 'C']
Decomposing the problem like this is fundamental to good program design. We don't want to write things which jumble up multiple problems. In this case, the multiple problems are (1) getting the first element of each subarray and (2) finding the combinations. If we write one routine which mixes up the two problems, then we will never be able to re-use it for other things. If our boss comes and says now he wants to find all the combinations of the second element of each subarray, we'll have to cut and paste and create nearly duplicate code. Then we'll be maintaining that code for the rest of our lives or at least until we quit. The rule about factoring is do it sooner rather than later.
Then, create all combinations of any two arrays:
function combinations(arr1, arr2) { //create all combos of elts in 2 arrays by
return [].concat.apply( //concatenating and flattening
[], //(starting with an empty array)
arr1.map( //a list created from arr1
function(v1) { //by taking each elt and from it
return arr2.map( //creating a list from arr2
function(v2) { //by taking each element and from it
return [v1, v2]; //making a pair with the first elt
}
);
}
)
);
}
Normally we would write this more compactly. Let's walk through it:
Array#concat combines one or more things, or elements inside those things if they are arrays, into an array.
Function#apply lets us provide an array that will turn into the argument list of concat.
Array#map creates a parallel array to arr1, which contains...
elements which are two-element arrays based on looping over arr2.
Right, this is not your mother's JavaScript. It's almost a different language from the style where you initialize this and set that and loop over the other thing and return something else. By adopting this style, we end up with code which is more precise, concise, reusable, provably correct, future-friendly, and maybe optimizable.
By future-friendly, I mean among other things ES6-friendly. The above could be rewritten as:
combinations = (arr1, arr2) => [].concat(...arr1.map(v1 => arr2.map(v2 => [v1, v2])));
Get ready guys and girls, this will come up in your job interviews pretty soon now. Time to move on from jQuery.
Now the problem can be expressed as:
var first_elts = get_elts(data, 0);
combinations(first_elts, first_elts);
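On engines that support Array.prototype.flatMap (ES2019), the same two steps can be sketched a little more compactly. The function name pairsOfFirst is invented for this example.

```javascript
// Sketch: all ordered pairs (including same-value pairs) of the first
// element of each sub-array, using flatMap to flatten one level.
function pairsOfFirst(rows) {
  const firsts = rows.map((row) => row[0]);
  return firsts.flatMap((a) => firsts.map((b) => [a, b]));
}

const rows = [['A', 1, 'X'], ['B', 2, 'Y'], ['C', 3, 'Z']];
for (const [left, right] of pairsOfFirst(rows)) {
  console.log(left + right); // do something with each pair
}
```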

How would you keep a Math.random() in javascript from picking the same numbers multiple times?

I have an array var words = []//lots of different words in it. I have a Math.floor(Math.random()*words.length) that chooses a random word from the array. This is run in a loop that runs for a random number of times (between 2 and 200 times). I would like to make sure that the random numbers do not get chosen more than once during the time that that loop runs. How would you suggest doing this?
There's multiple ways of doing this.
You can shuffle the entire collection, and just grab items from one end. This will ensure you won't encounter any one item more than once (or rather, more than the number of times it occurred in the original input array) during one whole iteration.
This, however, requires you to either modify in-place the original collection, or to create a copy of it.
If you only intend to grab a few items, there might be a different way.
You can use a hash table or other type of dictionary, and just do a check if the item you picked at random in the original collection already exists in the dictionary. If it doesn't, add it to the dictionary and use it. If it already exists in the dictionary, pick again.
This approach uses storage proportional to the number of items you need to pick.
Also note that this second approach is a bit bad performance-wise when you get to the few last items in the list, as you can risk hunting for the items you still haven't picked for quite a number of iterations, so this is only a viable solution if the items you need to randomly pick are far fewer than the number of items in the collection.
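A minimal sketch of the "pick again" approach, assuming the number of picks is small compared to the collection. The name pickDistinct is made up here, and a Set of already used indexes plays the role of the dictionary (tracking indexes rather than values, so that equal words at different positions still count separately).

```javascript
// Sketch: draw `count` distinct positions from `words` by re-drawing
// whenever an index has been used before.
function pickDistinct(words, count) {
  if (count > words.length) {
    throw new RangeError("cannot pick more items than the array holds");
  }
  const used = new Set(); // indexes picked so far
  const picked = [];
  while (picked.length < count) {
    const i = Math.floor(Math.random() * words.length);
    if (used.has(i)) continue; // already taken, pick again
    used.add(i);
    picked.push(words[i]);
  }
  return picked;
}

console.log(pickDistinct(["apple", "banana", "cocoa", "dade", "elephant"], 3));
```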
There are several different approaches, which are more or less effective depending on how much data you have, and how many items you want to pick:
Remove items from the array once they have been picked.
Shuffle the array and get the first items.
Keep a list of picked items and compare against new picks.
Loop through the items and pick values randomly based on the probability to be picked.
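The last approach in that list is often called selection sampling. As a rough sketch (the function name sample is invented for this example): walk the array once and take each item with probability (items still needed) / (items still remaining), which yields exactly the requested number of distinct positions.

```javascript
// Sketch: selection sampling in a single pass, without copying or
// modifying the input array.
function sample(items, count) {
  const picked = [];
  let needed = Math.min(count, items.length);
  for (let i = 0; i < items.length && needed > 0; i++) {
    const remaining = items.length - i;
    // Take items[i] with probability needed / remaining.
    if (Math.random() * remaining < needed) {
      picked.push(items[i]);
      needed--;
    }
  }
  return picked;
}

console.log(sample(["one", "two", "three", "four", "five"], 2));
```

The picked items come out in their original order, so if the order should be random as well, the (much smaller) result still needs a shuffle.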
I'd shuffle the array as follows and then iterate over the shuffled array. There's no expensive array splice calls and the shuffling routine consists of swapping two values in the array n times where n is the length of the array, so it should scale well:
function shuffle(arr) {
var shuffled = arr.slice(0), i = arr.length, temp, index;
while (i--) {
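// Pick from 0..i inclusive (Fisher-Yates); using i alone would never leave an element in place.
index = Math.floor((i + 1) * Math.random());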
temp = shuffled[index];
shuffled[index] = shuffled[i];
shuffled[i] = temp;
}
return shuffled;
}
console.log(shuffle(["one", "two", "three", "four"]));
This is how you can do it without shuffling the whole array:
a = "abcdefghijklmnopq".split("");
c = 0;
r = [];
do {
var len = a.length - c;
var rnd = Math.floor(Math.random() * len);
r.push(a[rnd]);
a[rnd] = a[len - 1];
} while(++c < 5);
console.log(r);
The idea is to pick from the first (length - step) elements and then overwrite the picked slot with the last element of that range, so the picked value cannot be drawn again.
I'd try using a map (Object literal) instead of an Array with keys being indexes:
var words = [ /* ... */ ] , map = { } , length = words.length ;
for(var n = 0 ; n < length ; n++) map[n] = words[n] ;
then make a function to pick a random entry based on the length, delete the entry (hence the index) and adjust the length:
function pickRandomEntry() {
var random = Math.floor( Math.random() * length ) ;
var entry = map[random] ;
delete map[random] ;
length-- ;
return entry ;
}
with this approach, you have to check for an undefined return value (since random might return the same number) and run the function again until it returns an actual value; or, make an array of picked indexes to filter the random numbers (which will however slow performance in case of long iteration cycles).
HTH
There are several solutions to this.
What you could do is use .splice() on your array to remove the item at the randomly chosen index.
Then you can iterate over your array until it's empty. If you need to keep the array pristine you can create a copy of it first and iterate over the copy.
var words = ['apple', 'banana', 'cocoa', 'dade', 'elephant'];
while (words.length > 0) {
var i = Math.floor(Math.random()*words.length);
var word = words.splice(i, 1)[0];
}
Or something to that effect.
Here is a way with prime numbers and modulo that seems to do the trick without moving the original array or adding a hash:
<html>
<head>
</head>
<body>
shuffle
<div id="res"></div>
<script>
function shuffle(words){
var l = words.length,i = 1,primes = [43,47,53,59,61,67,71,73,79],//add more if needed
prime = primes[parseInt(Math.random()*primes.length, 10)],
temp = [];
do{
temp.push((i * prime) % l);
}while(++i <= l);
console.log(temp.join(','));
console.log(temp.sort().join(','));
}
</script>
</body>
</html>
