I have a long-running function that does a huge calculation: all the possible permutations of x n-sided dice and the probability of those outcomes. For small x and n, the calculation is quick. For large values (n = 100, x > 3), the calculation takes tens of seconds if not longer; meanwhile, the browser stalls.
Here's a snippet of my code:
let dist = [];
// min and max are the minimum and maximum possible values
// (if dice are all 1's or all their maximum values)
for (let i = min; i <= max; i++) {
// initialize possible values of our distribution to 0
dist.push([ i, 0 ]);
}
// total is the total outcome value so far
// dIndex is the index into the dice-array (diceList) for the die we're
// currently applying to our total--the die we're "rolling"
function applyDie(total, dIndex) {
if (dIndex === diceList.length) {
dist[total - min][1]++;
permutationsCount++;
return;
}
// diceList is an array of integers representing the number of sides
// for each die (one array element = one die of element-value sides)
for (let i = 1; i <= diceList[dIndex]; i++) {
applyDie(total + i, dIndex + 1);
}
}
// kick off recursive call
applyDie(0, 0);
I want to add two functionalities:
Cancellation
Progress reporting
Cancellation will be easy (I can do it myself) once I have the async pattern in place, so I really only need help with progress reporting, or, more accurately, simply breaking the recursive pattern into chunks based on the permutationsCount variable. i.e.
/* ... */
permutationsCount++;
if (permutationsCount % chunkSize === 0)
/* end this chunk and start a new one */
I would prefer to use Javasciprt Promises, but I'm open to other suggestions.
Ideas?
Here's a function I wrote to do something similar. It's for a calculation done entirely in javascript... I couldn't tell from your question whether you were working entirely on the client side or what.
// Break the list up into equal-sized chunks, applying f() to each item of
// the list, writing a %-complete value to the progress span after each
// chunk. Then execute a callback with the resulting data.
var chunkedLoop = function (list, f, progressSpan, callback) {
var numChunks = 10,
chunkSize = Math.round(list.length / numChunks),
chunkedList = [], // will be a list of lists
// Concatenated results of all chunks, passed to the callback.
resultList = [],
x, // just a loop variable
chunkIndex = 0; // tracks the current chunk across timeouts
progressSpan.html(0);
// Splice of chunks of equal size, but allow the last one to be of an
// arbitrary size, in case numChunks doesn't divide the length of the
// list evenly.
for (x = 0; x < numChunks - 1; x += 1) {
chunkedList.push(list.splice(0, chunkSize));
}
chunkedList.push(list);
// Schedule a series of timeouts, one for each chunk. The browser
// stops blocking for a moment between each chunk, allowing the screen
// to update. This is the only way to have progress values reported to
// the view mid-loop. If it was done in a single loop, execution would
// block all the way to the end, and the screen would only update once
// at 100%.
var chunkFunction = function () {
setTimeout(function () {
// Run f on the chunk.
var chunk = chunkedList[chunkIndex];
var r = forEach(chunk, f);
resultList = resultList.concat(r);
chunkIndex += 1;
// Update progress on the screen.
progressSpan.html(Math.round(chunkIndex / numChunks * 100));
// Schedule the next run, if this isn't the last chunk. If it
// is the last chunk, execute the callback with the results.
if (chunkIndex < chunkedList.length) {
chunkFunction();
} else if (callback instanceof Function) {
callback.call(undefined, resultList);
}
// There's no reason to delay more than the minimum one
// millisecond, since the point is just to break up javascript's
// single-threaded blocking.
}, 1);
};
chunkFunction();
};
For reporting status you can pass callback function into your recursuve function and do there whatever you like (increase counter, push status into page etc).
Also think about rewriting recursive into iterative algorithm because it will have lesser memory overhead and it will be much easier to put there some other logic (like cancellations you mentioned)
You could use setTimeout to let JavaScript do other things and unstuck the event loop. This way even infinite loop would nonblocking. Here is a quick dirty example.
http://jsfiddle.net/xx5adut6/
function isPrime(n) {
// If n is less than 2 or not an integer then by definition cannot be prime.
if (n < 2) {
return false
}
if (n != Math.round(n)) {
return false
}
var isPrime = true;
for (var i = 2; i <= Math.sqrt(n); i++) {
if (n % i == 0) {
isPrime = false
}
}
// Finally return whether n is prime or not.
return isPrime;
}
var cancel = false;
var i = 0;
var primesFound = 0;
var status = $('.status');
var timeout;
function loop4Primes() {
if (cancel) {
clearTimeout(timeout);
return;
}
if (isPrime(i++)) primesFound++;
timeout = setTimeout(loop4Primes, 1);
}
function updateText() {
status.text(primesFound);
}
var interval = setInterval(updateText, 1000);
$('#start').click(function () {
loop4Primes();
$(this).off('click');
});
$('#stop').click(function () {
clearInterval(interval);
updateText();
cancel = true;
});
Related
I'm doing some javascript challenges on codeschool.com.
Whilst at the function expressions chapter, I came across the following weird bug.
In the following script - which can be heavily refactored, I know - the 'queue' variable gets assigned a value of 'undefined' after the second run.
First iteration:
queue array printed, containing 4 elements
"hello 1" printed
Second iteration:
queue array printed, containing 3 elements
"hello 3" printed
queue array printed, containing 2 elements
Third iteration - where the bug(?) occurs:
undefined is printed - instead of the queue array which should still contain 2 elements
First I thought something was wrong at codeschool's end, but after trying to run the code in chrome's dev tools I got the same results.
As far as I can tell nothing asynchronous should be happening which could mess anything up?
var puzzlers = [
function(a) { return 8 * a - 10; },
function(a) { return (a - 3) * (a - 3) * (a - 3); },
function(a) { return a * a + 4; },
function(a) { return a % 5; }
];
var start = 2;
var applyAndEmpty = function(input, queue) {
for (var i = 0; i < queue.length; i++) {
alert(queue);
if (i === 0) {
alert("hello 1");
var output = queue.shift()(input);
} else if (queue.length === 1) {
alert("hello 2");
return queue.shift()(output);
} else {
alert("hello 3");
output = queue.shift()(output);
alert(queue);
}
}
};
alert(applyAndEmpty(start, puzzlers));
Thanks!
Code Review
There's a couple bad things happening in this code; we should go over them first. For convenience's sake, I'm going to replace all alert calls with console.log calls – The console is much nicer for debugging things.
OK the first problem is you're using queue.shift() inside a for loop. Array.prototype.shift will change the length of the array, so you're not meant to use this inside a loop (outside of a very specialized case).
So each time we loop, i goes up one and queue.length goes down one – the two values are converging on each other which means you'll never actually touch all the values in queue
Refactor
We can fix that with a very simple adjustment of your function – remove the for loop! queue.shift() is effectively incrementing for us, but removing one element at a time. All we need to do is check if queue.length is empty – if so, we're done, otherwise shift one item off the queue and repeat
var applyAndEmpty = function(input, queue) {
console.log(input)
if (queue.length === 0)
return input;
else
return applyAndEmpty(queue.shift()(input), queue)
}
var puzzlers = [
function(a) { return 8 * a - 10; },
function(a) { return (a - 3) * (a - 3) * (a - 3); },
function(a) { return a * a + 4; },
function(a) { return a % 5; }
];
var start = 2;
console.log(applyAndEmpty(start, puzzlers));
// [initial input] 2
// 8 * 2 - 10 = 6
// (6 - 3) * (6 - 3) * (6 - 3) = 27
// 27 * 27 + 4 = 733
// 733 % 5 = 3
// [final output] = 3
console.log(puzzlers); // []
Greater lesson
There's one important thing to know about stepping thru arrays manually. You do ONE of the options below
1. check that the array is empty before attempting to get a value
// there are no items so we cannot get one
if (arr.length === 0)
doSomething;
// we know there is at least one item, so it is safe to get one
else
doOtherThing( arr.shift() )
2. null-check the thing you attempted to get
x = queue.shift();
// we didn't get a value, queue must've been empty
if (x === undefined)
doSomething
// yay we got something, we can use x now
else
doOtherThing( x )
It's your personal choice to use one over the other, tho I generally dislike null-checks of this kind. I believe it's better to check that the array has the value before attempting to grab it.
Doing BOTH of these options would be a logical error. If we check the array has a length of non-0, we can deduce we have an x – therefore we don't have to null-check for an x.
Remarks
I don't know if this was for an assignment or whatever, but you basically wrote up function composition from scratch – tho this is a kind of special destructive composition.
It seems you're aware of the destructive nature because you've named your function applyAndEmpty, but just in case you didn't know that was necessary, I'll share a non-destructive example below.
var lcompose = function(fs) {
return function (x) {
if (fs.length === 0)
return x
else
return lcompose(fs.slice(1))(fs[0](x))
}
}
var puzzlers = [
function(a) { return 8 * a - 10; },
function(a) { return (a - 3) * (a - 3) * (a - 3); },
function(a) { return a * a + 4; },
function(a) { return a % 5; }
];
var start = 2;
console.log(lcompose(puzzlers)(start)) // 3
That undefined you're getting is coming from this alert() call:
alert(applyAndEmpty(start, puzzlers));
At the end of the second iteration, i will be 1 and the queue length will be 2. Thus i is incremented for the next iteration, and the loop condition no longer holds — i is 2, and no longer less than the queue length.
It's a good habit when debugging to include some messaging instead of just values so that you can tell one debug output line from another. It's also a good idea to use console.log() instead of alert().
Assume the below solves your problem.
You used queue.length in second condition, at that stage, you perform only one shift and expecting a 4 size queue to be of 2 size, it is one error.
var puzzlers = [
function(a) { return 8 * a - 10; },
function(a) { return (a - 3) * (a - 3) * (a - 3); },
function(a) { return a * a + 4; },
function(a) { return a % 5; }
];
var start = 2;
var applyAndEmpty = function(input, queue) {
for (var i = 0; i < queue.length; i++) {
if (i === 0) {
alert("hello 1");
var output = queue.shift()(input);
} else if (i === 1) {
alert("hello 2");
return queue.shift()(output);
} else {
alert("hello 3");
output = queue.shift()(output);
}
}
};
alert(applyAndEmpty(start, puzzlers));
Why isn't this working? I'm getting a stack too deep error:
var countRecursion = function(array) {
var sum = 0
var count = 0
sum += array[count]
count ++
if (count < array.length) {
countRecursion(array);
} else {
return sum
}
}
You made a mistake and reset sum and counter inside the recursive block. I simply moved them outside.
var countRecursion = function(array) {
sum += array[count]
count ++
if (count < array.length) {
countRecursion(array);
} else {
return sum
}
}
var sum = 0
var count = 0
countRecursion([1,2,3]);
alert(sum);
This code is not recursive but iterative. I am not 100% sure if that's what you really wanted. But since you mentioned it, some people down voted my answer since I only fixed your code, but didn't made it recursive I guess. For completeness, here is recursive version of your code:
var countRecursion = function(array, ind) {
if (ind < array.length) {
return array[ind] + countRecursion(array, ind + 1);
} else {
return 0;
}
}
var sum = 0
var count = 0
sum = sum + countRecursion([1,2,3, 5, 6, 7], count);
alert(sum);
For recursion: pass data up, return data down.
The original code has a different count variable, being a local variable defined in the function, that is initial set to 0. As such the base case is never reached and the function recurses until the exception is thrown.
In addition to using a variable from an outer scope (or another side-effect) this can also be addressed by by following the recommendation on how to handle recursion, eg.
var countRecursion = function(array, index) {
index = index || 0; // Default to 0 when not specified
if (index >= array.length) {
// Base case
return 0;
}
// Recurrence case - add the result to the sum of the current item.
// The recursive function is supplied the next index so it will eventually terminate.
return array[index] + countRecursion(array, index + 1);
}
I see what you're thinking.
The issue with your code is that everytime you call countRecursion, count goes back to 0 (since it's initialized to 0 within your function body). This is making countRecursion execute infinitely many times, as you're always coming back to count = 0 and checking the first term. You can solve this by either:
Initializing count outside the function body, that way when you do count++, it increases and doesn't get reset to 0.
Passing count along with array as a parameter. That way, the first time you call the function, you say countRecursion(array, 0) to initialize count for you.
Note that you have to do the same for sum, else that will also revert to zero always.
Finally, (and this doesn't have to do with the stack error) you have to actually call return countRecursion(array) to actually move up the stack (at least that's how it is in C++ and what not - pretty sure it applies to javascript too though).
Array sum using recursive method
var countRecursion = function(arr, current_index) {
if(arr.length === current_index) return 0;
current_index = current_index || 0;
return countRecursion(arr, current_index+1) + arr[current_index];
}
document.body.innerHTML = countRecursion([1,2,3,4,5, 6])
Due to the recursive nature, I've been able to activate an lstm, which has only 1 input-neuron, with a sequence by inputting one item at a time.
However, when I attempt to train the network with the same technique, it never converges. The training goes on forever.
Here's what I'm doing, I'm converting a natural-language string to binary and then feeding one digit as a time. The reason I am converting into binary is because the network only takes values between 0 and 1.
I know the training works because when I train with an array of as many values as the input-neurons, in this case 1 so: [0], it converges and trains fine.
I guess I could pass each digit individually, but then it would have an individual ideal-output for each digit. And when the digit appears again with another ideal-output in another training set, it won't converge because how could for example 0 be of class 0 and 1?
Please tell me if I am wrong on this assumption.
How can I train this lstm with a sequence so that similar squences are classified similarly when activated?
Here is my whole trainer-file: https://github.com/theirf/synaptic/blob/master/src/trainer.js
Here is the code that trains the network on a worker:
workerTrain: function(set, callback, options) {
var that = this;
var error = 1;
var iterations = bucketSize = 0;
var input, output, target, currentRate;
var length = set.length;
var start = Date.now();
if (options) {
if (options.shuffle) {
function shuffle(o) { //v1.0
for (var j, x, i = o.length; i; j = Math.floor(Math.random() *
i), x = o[--i], o[i] = o[j], o[j] = x);
return o;
};
}
if(options.iterations) this.iterations = options.iterations;
if(options.error) this.error = options.error;
if(options.rate) this.rate = options.rate;
if(options.cost) this.cost = options.cost;
if(options.schedule) this.schedule = options.schedule;
if (options.customLog){
// for backward compatibility with code that used customLog
console.log('Deprecated: use schedule instead of customLog')
this.schedule = options.customLog;
}
}
// dynamic learning rate
currentRate = this.rate;
if(Array.isArray(this.rate)) {
bucketSize = Math.floor(this.iterations / this.rate.length);
}
// create a worker
var worker = this.network.worker();
// activate the network
function activateWorker(input)
{
worker.postMessage({
action: "activate",
input: input,
memoryBuffer: that.network.optimized.memory
}, [that.network.optimized.memory.buffer]);
}
// backpropagate the network
function propagateWorker(target){
if(bucketSize > 0) {
var currentBucket = Math.floor(iterations / bucketSize);
currentRate = this.rate[currentBucket];
}
worker.postMessage({
action: "propagate",
target: target,
rate: currentRate,
memoryBuffer: that.network.optimized.memory
}, [that.network.optimized.memory.buffer]);
}
// train the worker
worker.onmessage = function(e){
// give control of the memory back to the network
that.network.optimized.ownership(e.data.memoryBuffer);
if(e.data.action == "propagate"){
if(index >= length){
index = 0;
iterations++;
error /= set.length;
// log
if(options){
if(this.schedule && this.schedule.every && iterations % this.schedule.every == 0)
abort_training = this.schedule.do({
error: error,
iterations: iterations
});
else if(options.log && iterations % options.log == 0){
console.log('iterations', iterations, 'error', error);
};
if(options.shuffle) shuffle(set);
}
if(!abort_training && iterations < that.iterations && error > that.error){
activateWorker(set[index].input);
}
else{
// callback
callback({
error: error,
iterations: iterations,
time: Date.now() - start
})
}
error = 0;
}
else{
activateWorker(set[index].input);
}
}
if(e.data.action == "activate"){
error += that.cost(set[index].output, e.data.output);
propagateWorker(set[index].output);
index++;
}
}
A natural language string should not be converted to binary to normalize. Use one-hot encoding instead:
Additionally, I advise you to take a look at Neataptic instead of Synaptic. It fixed a lot of bugs in Synaptic and has more functions for you to use. It has a special option during training, called clear. This tells the network to reset the context every training iteration, so it knows it is starting from the beginning.
Why does your network only have 1 binary input? The networks inputs should make sense. Neural networks are powerful, but you are giving them a very hard task.
Instead you should have multiple inputs, one for each letter. Or even more ideally, one for each word.
I have a for-loop from 0 to 8,019,000,000 that is extremely slow.
var totalCalcs = 0;
for (var i = 0; i < 8019000000; i++)
totalCalcs++;
window.alert(totalCalcs);
in chrome this takes between 30-60secs.
I also already tried variations like:
var totalCalcs = 0;
for (var i = 8019000000; i--; )
totalCalcs++;
window.alert(totalCalcs);
Didn't do too much difference unfortunately.
Is there anything I can do to speed this up?
Your example is rather trivial, and any answer may not be suitable for whatever code you're actually putting inside of a loop with that many iterations.
If your work can be done in parallel, then we can divide the work between several web workers.
You can read a nice introduction to web workers, and learn how to use them, here:
http://www.html5rocks.com/en/tutorials/workers/basics/
Figuring out how to divide the work is a challenge that depends entirely on what that work is. Because your example is so small, it's easy to divide the work among inline web workers; here is a function to create a worker that will invoke a function asynchronously:
var makeWorker = function (fn, args, callback) {
var fnString = 'self.addEventListener("message", function (e) {self.postMessage((' + fn.toString() + ').apply(this, e.data))});',
blob = new Blob([fnString], { type: 'text/javascript' }),
url = URL.createObjectURL(blob),
worker = new Worker(url);
worker.postMessage(args);
worker.addEventListener('message', function (e) {
URL.revokeObjectURL(url);
callback(e.data);
});
return worker;
};
The work that we want done is adding numbers, so here is a function to do that:
var calculateSubTotal = function (count) {
var sum = 0;
for (var i = 0; i < count; ++i) {
sum++;
}
return sum;
};
And when a worker finishes, we want to add his sum to the total AND tell us the result when all workers are finished, so here is our callback:
var total = 0, count = 0, numWorkers = 1,
workerFinished = function (subTotal) {
total += subTotal;
count++;
if (count == numWorkers) {
console.log(total);
}
};
And finally we can create a worker:
makeWorker(calculateSubTotal, [10], workerFinished); // logs `10` to console
When put together, these pieces can calculate your large sum quickly (depending on how many CPUs your computer has, of course).
I have a complete example on jsfiddle.
Treating your question as a more generic question about speeding up loops with many iterations: you could try Duff's device.
In a test using nodejs the following code decreased the loop time from 108 seconds for your second loop (i--) to 27 seconds
var testVal = 0, iterations = 8019000000;
var n = iterations % 8;
while (n--) {
testVal++;
}
n = parseInt(iterations / 8);
while (n--) {
testVal++;
testVal++;
testVal++;
testVal++;
testVal++;
testVal++;
testVal++;
testVal++;
}
I've got the following code inside a <script> tag on a webpage with nothing else on it. I'm afraid I do not presently have it online. As you can see, it adds up all primes under two million, in two different ways, and calculates how long it took on average. The variable howOften is used to do this a number of times so you can average it out. What puzzles me is, for howOften == 1, method 2 is faster, but for howOften == 10, method 1 is. The difference is significant and holds even if you hit F5 a couple of times.
My question is simply: how come?
(This post has been edited to incorporate alf's suggestion. But that made no difference! I'm very much puzzled now.)
(Edited again: with howOften at or over 1000, the times seem stable. Alf's answer seems correct.)
function methodOne(maxN) {
var sum, primes_inv, i, j;
sum = 0;
primes_inv = [];
for ( var i = 2; i < maxN; ++i ) {
if ( primes_inv[i] == undefined ) {
sum += i;
for ( var j = i; j < maxN; j += i ) {
primes_inv[j] = true;
}
}
}
return sum;
}
function methodTwo(maxN) {
var i, j, p, sum, ps, n;
n = ((maxN - 2) / 2);
sum = n * (n + 2);
ps = [];
for(i = 1; i <= n; i++) {
for(j = i; j <= n; j++) {
p = i + j + 2 * i * j;
if(p <= n) {
if(ps[p] == undefined) {
sum -= p * 2 + 1;
ps[p] = true;
}
}
else {
break;
}
}
}
return sum + 2;
}
// ---------- parameters
var howOften = 10;
var maxN = 10000;
console.log('iterations: ', howOften);
console.log('maxN: ', maxN);
// ---------- dry runs for warm-up
for( i = 0; i < 1000; i++ ) {
sum = methodOne(maxN);
sum = methodTwo(maxN);
}
// ---------- method one
var start = (new Date).getTime();
for( i = 0; i < howOften; i++ )
sum = methodOne(maxN);
var stop = (new Date).getTime();
console.log('methodOne: ', (stop - start) / howOften);
// ---------- method two
for( i = 0; i < howOften; i++ )
sum = methodTwo(maxN);
var stop2 = (new Date).getTime();
console.log('methodTwo: ', (stop2 - stop) / howOften);
Well, JS runtime is an optimized JIT compiler. Which means that for a while, your code is interpreted (tint), after that, it gets compiled (tjit), and finally you run a compiled code (trun).
Now what you calculate is most probably (tint+tjit+trun)/N. Given that the only part depending almost-linearly on N is trun, this comparison soes not make much sense, unfortunately.
So the answer is, I don't know. To have a proper answer,
Extract the code you are trying to profile into functions
Run warm-up cycles on these functions, and do not use timing from the warm-up cycles
Run much more than 1..10 times, both for warm-up and measurement
Try swapping the order in which you measure time for algorithms
Get into JS interpretator internals if you can and make sure you understand what happens: do you really measure what you think you do? Is JIT run during the warm-up cycles and not while you measure? Etc., etc.
Update: note also that for 1 cycle, you get run time less than the resolution of the system timer, which means the mistake is probably bigger than the actual values you compare.
methodTwo simply requires that the processor execute fewer calculations. In methodOne your initial for loop is executing maxN times. In methodTwo your initial for loop is executing (maxN -2)/2 times. So in the second method the processor is doing less than half the number of calculations that the first method is doing. This is compounded by the fact that each method contains a nested for loop. So big-O of methodOne is maxN^2. Whereas big-O of methodTwo is ((maxN -2)/2)^2.