I'm trying to devise a (good) way to choose a random number from a range of possible numbers where each number in the range is given a weight. To put it simply: given the range of numbers (0,1,2) choose a number where 0 has an 80% probability of being selected, 1 has a 10% chance and 2 has a 10% chance.
It's been about 8 years since my college stats class, so you can imagine the proper formula for this escapes me at the moment.
Here's the 'cheap and dirty' method that I came up with. This solution uses ColdFusion. Yours may use whatever language you'd like. I'm a programmer, I think I can handle porting it. Ultimately my solution needs to be in Groovy - I wrote this one in ColdFusion because it's easy to quickly write/test in CF.
public function weightedRandom( Struct options ) {
var tempArr = [];
for( var o in arguments.options )
{
var weight = arguments.options[ o ] * 10;
for ( var i = 1; i<= weight; i++ )
{
arrayAppend( tempArr, o );
}
}
return tempArr[ randRange( 1, arrayLen( tempArr ) ) ];
}
// test it
opts = { 0=.8, 1=.1, 2=.1 };
for( x = 1; x<=10; x++ )
{
writeDump( weightedRandom( opts ) );
}
I'm looking for better solutions, please suggest improvements or alternatives.
A lookup table (as in your solution) is the first thing that comes to mind: you build a table whose elements are populated according to their weights, then pick a random location in the table and return it. As an implementation choice, I would make a higher-order function that takes a spec and returns a function which returns values based on the distribution in the spec; this way you avoid having to build the table for each call. The downsides are that building the table takes time linear in the number of table entries, and memory usage can be large for big specs (or those with very small or precise weights, e.g. {0:0.99999, 1:0.00001}). The upside is that picking a value takes constant time, which might be desirable if performance is critical. In JavaScript:
function weightedRand(spec) {
var i, j, table=[];
for (i in spec) {
// The constant 10 below should be computed based on the
// weights in the spec for a correct and optimal table size.
// E.g. the spec {0:0.999, 1:0.001} will break this impl.
for (j=0; j<spec[i]*10; j++) {
table.push(i);
}
}
return function() {
return table[Math.floor(Math.random() * table.length)];
}
}
var rand012 = weightedRand({0:0.8, 1:0.1, 2:0.1});
rand012(); // random in distribution...
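One way to address the table-size caveat in the comments above is to accept integer weights and divide them by their greatest common divisor, so the table is only as large as the ratios require. This is a sketch under assumptions: the names gcd and weightedRandInt, the integer-weight requirement, and the optional rng argument are illustrative and not part of the answer above.

```javascript
// Sketch: lookup table sized from integer weights (hypothetical helper names).
function gcd(a, b) {
  while (b !== 0) {
    const t = a % b;
    a = b;
    b = t;
  }
  return a;
}

function weightedRandInt(spec, rng = Math.random) {
  const keys = Object.keys(spec);
  // Reduce the weights by their common divisor to keep the table small.
  const divisor = keys.reduce((g, k) => gcd(g, spec[k]), 0);
  const table = [];
  for (const k of keys) {
    for (let j = 0; j < spec[k] / divisor; j++) {
      table.push(k);
    }
  }
  // Picking is still a single constant-time index into the table.
  return function () {
    return table[Math.floor(rng() * table.length)];
  };
}

// Weights 80/10/10 reduce to 8/1/1, so the table holds 10 entries.
const rand012Int = weightedRandInt({ 0: 80, 1: 10, 2: 10 });
rand012Int(); // keys come back as strings, as in the original
```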
Another strategy is to pick a random number in [0,1) and iterate over the weight specification, summing the weights; as soon as the random number is no greater than the running sum, return the associated value. Of course, this assumes that the weights sum to one. This solution has no up-front cost, but picking a value takes time linear in the number of entries in the spec. For example, in JavaScript:
function weightedRand2(spec) {
var i, sum=0, r=Math.random();
for (i in spec) {
sum += spec[i];
if (r <= sum) return i;
}
}
weightedRand2({0:0.8, 1:0.1, 2:0.1}); // random in distribution...
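The requirement that the weights sum to one can be relaxed by scaling the random number by the total weight instead. A minimal sketch of that variation follows; weightedRandTotal and the optional rng argument are names assumed for illustration.

```javascript
// Sketch: the same linear scan, but for weights with an arbitrary total.
function weightedRandTotal(spec, rng = Math.random) {
  const keys = Object.keys(spec);
  let total = 0;
  for (const k of keys) total += spec[k];
  // r lands somewhere in [0, total).
  let r = rng() * total;
  for (const k of keys) {
    if (r < spec[k]) return k;
    r -= spec[k];
  }
  // Floating-point leftovers can fall through; fall back to the last key.
  return keys[keys.length - 1];
}

weightedRandTotal({ 0: 8, 1: 1, 2: 1 }); // "0" about 80% of the time
```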
Generate a random number R between 0 and 1.
If R in [0, 0.1) -> 1
If R in [0.1, 0.2) -> 2
If R in [0.2, 1] -> 3
If you can't directly get a number between 0 and 1, generate a number in a range that will produce as much precision as you want. For example, if you have the weights (1, 83.7%) and (2, 16.3%), roll a number from 1 to 1000: 1-837 is a 1, and 838-1000 is a 2.
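The 1-to-1000 roll described above could be sketched like this. The function names and the optional rng argument are assumptions for illustration only.

```javascript
// Sketch: map an integer roll in 1..1000 to a value.
function valueForRoll(roll) {
  // 1-837 covers 83.7% of the rolls, 838-1000 the remaining 16.3%.
  return roll <= 837 ? 1 : 2;
}

function rollWeighted(rng = Math.random) {
  const roll = Math.floor(rng() * 1000) + 1; // integer in 1..1000
  return valueForRoll(roll);
}

rollWeighted(); // 1 about 83.7% of the time
```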
I use the following
function weightedRandom(min, max) {
return Math.round(max / (Math.random() * max + min));
}
This is my go-to "weighted" random: I apply an inverse function to x (where x is a random number between min and max) to generate a weighted result in which the minimum is the heaviest element and the maximum the lightest (the least likely result).
So basically, using weightedRandom(1, 5) means the chances of getting a 1 are higher than a 2 which are higher than a 3, which are higher than a 4, which are higher than a 5.
Might not be useful for your use case but probably useful for people googling this same question.
After a run of 100 iterations, it gave me:
==================
| Result | Times |
==================
|   1    |  55   |
|   2    |  28   |
|   3    |   8   |
|   4    |   7   |
|   5    |   2   |
==================
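A tally like the one above can be reproduced with a small counting loop. This is a sketch: tally is a hypothetical helper, and the counts differ from run to run.

```javascript
// The function from the answer, repeated so the sketch is self-contained.
function weightedRandom(min, max) {
  return Math.round(max / (Math.random() * max + min));
}

// Count how often each result occurs over a number of iterations.
function tally(min, max, iterations) {
  const counts = {};
  for (let i = 0; i < iterations; i++) {
    const n = weightedRandom(min, max);
    counts[n] = (counts[n] || 0) + 1;
  }
  return counts;
}

console.log(tally(1, 5, 100)); // counts vary from run to run
```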
Here are 3 solutions in JavaScript since I'm not sure which language you want it in. Depending on your needs one of the first two might work, but the third one is probably the easiest to implement with large sets of numbers.
function randomSimple(){
return [0,0,0,0,0,0,0,0,1,2][Math.floor(Math.random()*10)];
}
function randomCase(){
var n=Math.floor(Math.random()*100)
switch(true){ // match the first case expression that evaluates to true
case n<80:
return 0;
case n<90:
return 1;
case n<100:
return 2;
}
}
function randomLoop(weight,num){
var n=Math.floor(Math.random()*100),amt=0;
for(var i=0;i<weight.length;i++){
//amt+=weight[i]; *alternative method
//if(n<amt){
if(n<weight[i]){
return num[i];
}
}
}
weight=[80,90,100];
//weight=[80,10,10]; *alternative method
num=[0,1,2]
8 years late but here's my solution in 4 lines.
1) Prepare an array for the probability mass function such that
pmf[array_index] = P(X=array_index):
var pmf = [0.8, 0.1, 0.1]
2) Prepare an array for the corresponding cumulative distribution function such that
cdf[array_index] = P(X<=array_index):
var cdf = pmf.map((sum => value => sum += value)(0))
// [0.8, 0.9, 1]
3a) Generate a random number.
3b) Get an array of the cdf elements that are less than or equal to this number.
3c) Return its length.
var r = Math.random()
cdf.filter(el => r >= el).length
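The steps above can be wrapped into one function. This is a sketch: sampleIndex and its optional r parameter are names assumed here, and the final clamp is an added guard against floating-point sums that fall just short of 1.

```javascript
// Sketch: pmf -> cdf -> index, with the random number injectable for testing.
function sampleIndex(pmf, r = Math.random()) {
  // A running sum turns the pmf into a cdf, e.g. [0.8, 0.9, 1].
  const cdf = pmf.map((sum => value => sum += value)(0));
  // Count the cdf steps at or below r; that count is the chosen index.
  const index = cdf.filter(el => r >= el).length;
  // Guard (an addition): never return an index past the end of the pmf.
  return Math.min(index, pmf.length - 1);
}

sampleIndex([0.8, 0.1, 0.1]); // 0 about 80% of the time
```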
This is more or less a generalized version of what @trinithis wrote, in Java. I did it with ints rather than floats to avoid messy rounding errors.
static class Weighting {
int value;
int weighting;
public Weighting(int v, int w) {
this.value = v;
this.weighting = w;
}
}
public static int weightedRandom(List<Weighting> weightingOptions) {
//determine sum of all weightings
int total = 0;
for (Weighting w : weightingOptions) {
total += w.weighting;
}
//select a random value between 0 and our total
int random = new Random().nextInt(total);
//loop thru our weightings until we arrive at the correct one
int current = 0;
for (Weighting w : weightingOptions) {
current += w.weighting;
if (random < current)
return w.value;
}
//shouldn't happen.
return -1;
}
public static void main(String[] args) {
List<Weighting> weightings = new ArrayList<Weighting>();
weightings.add(new Weighting(0, 8));
weightings.add(new Weighting(1, 1));
weightings.add(new Weighting(2, 1));
for (int i = 0; i < 100; i++) {
System.out.println(weightedRandom(weightings));
}
}
How about
int [ ] numbers = { 0 , 0 , 0 , 0 , 0 , 0 , 0 , 0 , 1 , 2 } ;
then you can randomly select from numbers and 0 will have an 80% chance, 1 10%, and 2 10%
This one is in Mathematica, but it's easy to copy to another language, I use it in my games and it can handle decimal weights:
weights = {0.5, 1, 2}; (* The weights *)
weights = N@weights/Total@weights; (* Normalize weights so that the list's sum is always 1. *)
min = 0; (* First min value should be 0 *)
max = weights[[1]]; (* First max value should be the first element of the newly created weights list. Note that in Mathematica the first element has index 1, not 0. *)
random = RandomReal[]; (* Generate a random float from 0 to 1 *)
For[i = 1, i <= Length@weights, i++,
  If[random >= min && random < max,
    Print["Chosen index number: " <> ToString@i]
  ];
  min += weights[[i]];
  If[i == Length@weights,
    max = 1,
    max += weights[[i + 1]]
  ]
]
(From here on I assume the list's first element has index 0.) The idea is that, with a normalized list weights, index n should be returned with probability weights[n], so the distance between min and max at step n should be weights[n]. The total distance from the smallest min (which we set to 0) to the largest max is the sum of the list weights.
The good thing about this is that you don't append to any array or nest for loops, and that speeds up execution considerably.
Here is the code in C# without needing to normalize the weights list and deleting some code:
int WeightedRandom(List<float> weights) {
float total = 0f;
foreach (float weight in weights) {
total += weight;
}
float max = weights [0],
random = Random.Range(0f, total);
for (int index = 0; index < weights.Count; index++) {
if (random < max) {
return index;
} else if (index == weights.Count - 1) {
return weights.Count-1;
}
max += weights[index+1];
}
return -1;
}
I suggest checking each probability in turn against what remains of the random number.
This function first sets the return value to the last possible index, then iterates until the remaining random value is smaller than the current probability.
The probabilities have to sum to one.
function getRandomIndexByProbability(probabilities) {
var r = Math.random(),
index = probabilities.length - 1;
probabilities.some(function (probability, i) {
if (r < probability) {
index = i;
return true;
}
r -= probability;
});
return index;
}
var i,
probabilities = [0.8, 0.1, 0.1],
count = probabilities.map(function () { return 0; });
for (i = 0; i < 1e6; i++) {
count[getRandomIndexByProbability(probabilities)]++;
}
console.log(count);
Thanks all, this was a helpful thread. I encapsulated it into a convenience function (Typescript). Tests below (sinon, jest). Could definitely be a bit tighter, but hopefully it's readable.
export type WeightedOptions = {
[option: string]: number;
};
// Pass in an object like { a: 10, b: 4, c: 400 } and it'll return either "a", "b", or "c", factoring in their respective
// weight. So in this example, "c" is likely to be returned 400 times out of 414
export const getRandomWeightedValue = (options: WeightedOptions) => {
const keys = Object.keys(options);
const totalSum = keys.reduce((acc, item) => acc + options[item], 0);
let runningTotal = 0;
const cumulativeValues = keys.map((key) => {
const relativeValue = options[key]/totalSum;
const cv = {
key,
value: relativeValue + runningTotal
};
runningTotal += relativeValue;
return cv;
});
const r = Math.random();
return cumulativeValues.find(({ key, value }) => r <= value)!.key;
};
Tests:
describe('getRandomWeightedValue', () => {
// Out of 1, the relative and cumulative values for these are:
// a: 0.1666 -> 0.16666
// b: 0.3333 -> 0.5
// c: 0.5 -> 1
const values = { a: 10, b: 20, c: 30 };
it('returns appropriate values for particular random value', () => {
// any random number under 0.166666 should return "a"
const stub1 = sinon.stub(Math, 'random').returns(0);
const result1 = randomUtils.getRandomWeightedValue(values);
expect(result1).toEqual('a');
stub1.restore();
const stub2 = sinon.stub(Math, 'random').returns(0.1666);
const result2 = randomUtils.getRandomWeightedValue(values);
expect(result2).toEqual('a');
stub2.restore();
// any random number between 0.166666 and 0.5 should return "b"
const stub3 = sinon.stub(Math, 'random').returns(0.17);
const result3 = randomUtils.getRandomWeightedValue(values);
expect(result3).toEqual('b');
stub3.restore();
const stub4 = sinon.stub(Math, 'random').returns(0.3333);
const result4 = randomUtils.getRandomWeightedValue(values);
expect(result4).toEqual('b');
stub4.restore();
const stub5 = sinon.stub(Math, 'random').returns(0.5);
const result5 = randomUtils.getRandomWeightedValue(values);
expect(result5).toEqual('b');
stub5.restore();
// any random number above 0.5 should return "c"
const stub6 = sinon.stub(Math, 'random').returns(0.500001);
const result6 = randomUtils.getRandomWeightedValue(values);
expect(result6).toEqual('c');
stub6.restore();
const stub7 = sinon.stub(Math, 'random').returns(1);
const result7 = randomUtils.getRandomWeightedValue(values);
expect(result7).toEqual('c');
stub7.restore();
});
});
Shortest solution in modern JavaScript
Note: all weights need to be integers
function weightedRandom(items){
let table = Object.entries(items)
.flatMap(([item, weight]) => Array(weight).fill(item)) // repeat each key "weight" times
return table[Math.floor(Math.random() * table.length)]
}
const key = weightedRandom({
"key1": 1,
"key2": 4,
"key3": 8
}) // returns e.g. "key1"
Here is the input with its ratios: 0 (80%), 1 (10%), 2 (10%).
Let's draw them out so they're easy to visualize.
0 1 2
-------------------------------------________+++++++++
Let's add up the total weight and call it TR, for total ratio; in this case, 100.
Let's randomly pick a number between 0 and TR (0 to 100 in this case, 100 being the total of your weights). Call it RN, for random number.
So now we have TR as the total weight and RN as the random number between 0 and TR.
Let's imagine we picked a random number from 0 to 100, say 21. That is effectively 21%.
We must convert/match this to our input numbers, but how?
Let's loop over each weight (80, 10, 10) and keep a running sum of the weights we have already visited.
The moment that running sum is greater than the random number RN (21 in this case), we stop the loop and return that element's position.
double sum = 0;
int position = -1;
for(double weight : weights){ // weights = {80, 10, 10}
position ++;
sum = sum + weight;
if(sum > 21) //(80 > 21) so break on first pass
break;
}
//position will be 0 so we return array[0]--> 0
Let's say the random number (between 0 and 100) is 83. Let's do it again:
double sum = 0;
int position = -1;
for(double weight : weights){ // weights = {80, 10, 10}
position ++;
sum = sum + weight;
if(sum > 83) //(90 > 83) so break
break;
}
//we did two passes in the loop so position is 1 so we return array[1]---> 1
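The two walkthroughs above can be sketched as one function that takes the weights and an already-drawn random number. The name pickPosition and the -1 fallback are assumptions made for illustration.

```javascript
// Sketch: return the position whose running weight total first exceeds rn.
function pickPosition(weights, rn) {
  let sum = 0;
  for (let position = 0; position < weights.length; position++) {
    sum += weights[position];
    if (sum > rn) return position;
  }
  return -1; // rn was not below the total weight
}

pickPosition([80, 10, 10], 21); // 0, because 80 > 21 on the first pass
pickPosition([80, 10, 10], 83); // 1, because 90 > 83 on the second pass
```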
I have a slot machine and I used the code below to generate random numbers. In probabilitiesSlotMachine the keys are the outputs of the slot machine, and the values represent the weights.
const probabilitiesSlotMachine = [{0 : 1000}, {1 : 100}, {2 : 50}, {3 : 30}, {4 : 20}, {5 : 10}, {6 : 5}, {7 : 4}, {8 : 2}, {9 : 1}]
var allSlotMachineResults = []
probabilitiesSlotMachine.forEach(function(obj, index){
for (var key in obj){
for (var loop = 0; loop < obj[key]; loop ++){
allSlotMachineResults.push(key)
}
}
});
Now to generate a random output, I use this code:
const random = allSlotMachineResults[Math.floor(Math.random() * allSlotMachineResults.length)]
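Because each key is pushed once per unit of weight, the chance of any output is its weight divided by the total weight (1,222 for the values above). A small sketch to check this; slotWeights and chances are names assumed here, and the weights are flattened into one object for brevity.

```javascript
// Sketch: derive each output's probability from its weight.
const slotWeights = { 0: 1000, 1: 100, 2: 50, 3: 30, 4: 20, 5: 10, 6: 5, 7: 4, 8: 2, 9: 1 };

function chances(weights) {
  const total = Object.values(weights).reduce((a, b) => a + b, 0);
  const result = {};
  for (const key of Object.keys(weights)) {
    result[key] = weights[key] / total;
  }
  return result;
}

console.log(chances(slotWeights)); // e.g. output 0 has a chance of 1000/1222
```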
Enjoy the O(1) (constant time) solution for your problem.
If the input array is small, it can be easily implemented.
const number = Math.floor(Math.random() * 100); // Generate a random integer from 0 to 99
let element;
if (number >= 0 && number <= 79) {
/*
In the range of 0 to 99, every number has equal probability
of occurring. Therefore, if you gather 80 numbers (0 to 79) and
make a "sub-group" of them, then their probabilities will get added.
Hence, what you get is an 80% chance that the number will fall in this
range.
So, quite naturally, there is 80% probability that this code will run.
Now, manually choose / assign element of your array to this variable.
*/
element = 0;
}
else if (number >= 80 && number <= 89) {
// 10% chance that this code runs.
element = 1;
}
else if (number >= 90 && number <= 99) {
// 10% chance that this code runs.
element = 2;
}
I'm trying to make it to where when a user does a certain thing, they get between 2 and 100 units. But for every 1,000 requests I want it to add up to 3,500 units given collectively.
Here's the code I have for adding different amounts randomly to a user:
if (Math.floor(Math.random() * 1000) + 1 === 900) {
//db call adding 100
}
else if (Math.floor(Math.random() * 100) + 1 === 90) {
//db call adding 40
}
else if (Math.floor(Math.random() * 30) + 1 === 20) {
//db call adding 10
}
else if (Math.floor(Math.random() * 5) + 1 === 4) {
//db call adding 5
}
else {
//db call adding 2
}
If my math is correct, this should average around 4,332 units per 1,000 calls. But obviously it would vary and I don't want that. I'd also like it to add random amounts instead, as the units added in my example are arbitrary.
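The average can be checked by treating the if/else chain as a sequence of independent draws, where each branch is only reached if every earlier branch failed. The sketch below assumes exactly that and uses hypothetical names (branches, expectedUnits).

```javascript
// Sketch: expected units per call for an if/else-if chain of independent draws.
const branches = [
  { chance: 1 / 1000, units: 100 },
  { chance: 1 / 100, units: 40 },
  { chance: 1 / 30, units: 10 },
  { chance: 1 / 5, units: 5 },
  { chance: 1, units: 2 } // the final else always fires if reached
];

function expectedUnits(branches) {
  let reach = 1; // probability of getting as far as this branch
  let expected = 0;
  for (const b of branches) {
    expected += reach * b.chance * b.units;
    reach *= 1 - b.chance;
  }
  return expected;
}

console.log(expectedUnits(branches) * 1000); // units per 1,000 calls
```

Under these assumptions the result is roughly 3,315 units per 1,000 calls, which is lower than the estimate quoted above, so the figures are worth double-checking.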
EDIT: Guys, Gildor is right that I simply want to have 3,500 units, and give them away within 1,000 requests. It isn't even entirely necessary that it always reaches that maximum of 3,500 either (I could have specified that). The important thing is that I'm not giving users too much, while creating a chance for them to win a bigger amount.
Here's what I have set up now, and it's working well, and will work even better with some tweaking:
Outside of call:
var remaining = 150;
var count = 0;
Inside of call:
count += 1;
if (count === 100) {
remaining = 150;
count = 0;
}
if (Math.floor(Math.random() * 30) + 1 === 20) {
var addAmount = Math.floor(Math.random() * 85) + 15;
if (addAmount <= remaining) {
remaining -= addAmount;
//db call adding addAmount + 2
}
else {
//db call adding 2
}
}
else if (Math.floor(Math.random() * 5) + 1 === 4) {
var addAmount1 = Math.floor(Math.random() * 10) + 1;
if (addAmount1 <= remaining) {
remaining -= addAmount1;
//db call adding addAmount1 + 2
}
else {
//db call adding 2
}
}
else {
//db call adding 2
}
I guess I should have clarified, I want a "random" number with a high likelihood of being small. That's kind of part of the gimmick, where you have low probability of getting a larger amount.
As I've commented, 1,000 random numbers between 2 and 100 that add up to 3,500 means an average of 3.5, which is not consistent with random choices between 2 and 100. You'd have to have nearly all 2 and 3 values in order to achieve that and, in fact, couldn't have more than a couple of large numbers. Nothing even close to random. So, for this to be even remotely random and feasible, you'd have to pick a total much larger than 3,500. A random total of 1,000 numbers between 2 and 100 would be more like 51,000.
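The arithmetic behind those two figures, as a short sketch (expectedTotal is a hypothetical helper, assuming a uniform choice over the range):

```javascript
// Sketch: a uniform pick between low and high averages (low + high) / 2.
function expectedTotal(low, high, qty) {
  return qty * (low + high) / 2;
}

expectedTotal(2, 100, 1000); // 51000
3500 / 1000;                 // 3.5, the average the question asks for
```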
Furthermore, you can't dynamically generate each number in a truly random fashion and guarantee a particular total. The main way to guarantee that outcome is to pre-allocate random numbers that are known to add up to the total, then randomly select each number from the pre-allocated set and remove it from the choices for future selections.
You could also try to keep a running total and bias your randomness if you get skewed away from your total, but doing it that way, the last set of numbers may have to be not even close to random in order to hit your total consistently.
A scheme that could work, if you reset the total to what it should be for actual randomness (e.g. to 51,000), would be to preallocate an array of 500 random numbers between 2 and 100 and then add another 500 numbers that are the complements of those. This guarantees the average of 51. You can then select each number randomly from the pre-allocated array and remove it from the array so it won't be selected again. I can add code to do this in a second.
function RandResults(low, high, qty) {
var results = new Array(qty);
var limit = qty/2;
var avg = (low + high) / 2;
for (var i = 0; i < limit; i++) {
results[i] = Math.floor((Math.random() * (high - low)) + low);
// mirror each value around the average so every pair sums to 2 * avg
results[qty - i - 1] = (2 * avg) - results[i];
}
this.results = results;
}
RandResults.prototype.getRand = function() {
if (!this.results.length) {
throw new Error("getRand() called, but results are empty");
}
var randIndex = Math.floor(Math.random() * this.results.length);
var value = this.results[randIndex];
this.results.splice(randIndex, 1);
return value;
}
RandResults.prototype.getRemaining = function() {
return this.results.length;
}
var randObj = new RandResults(2, 100, 1000);
// get next single random value
if (randObj.getRemaining()) {
var randomValue = randObj.getRand();
}
Working demo for a truly random selection of numbers that add up to 51,000 (which is what 1,000 random values between 2 and 100 should add up to): http://jsfiddle.net/jfriend00/wga26n7p/
If what you want is the following: 1,000 numbers that add up to 3,500 and are selected from between the range 2 to 100 (inclusive) where most numbers will be 2 or 3, but occasionally something could be up to 100, then that's a different problem. I wouldn't really use the word random to describe it because it's a highly biased selection.
Here's a way to do that. It generates 1,000 random numbers between 2 and 100, keeping track of the total. Afterwards, it corrects the random numbers to hit the right total by randomly selecting values and decrementing them until the total is down to 3,500. You can see it work here: http://jsfiddle.net/jfriend00/m4ouonj4/
The main part of the code is this:
function RandResults(low, high, qty, total) {
var results = new Array(qty);
var runningTotal = 0, correction, index, trial;
for (var i = 0; i < qty; i++) {
runningTotal += results[i] = Math.floor((Math.random() * (high - low)) + low);
}
// now, correct to hit the total
if (runningTotal > total) {
correction = -1;
} else if (runningTotal < total) {
correction = 1;
}
// loop until we've hit the total
// randomly select a value to apply the correction to
while (runningTotal !== total) {
index = Math.floor(Math.random() * qty);
trial = results[index] + correction;
if (trial >= low && trial <= high) {
results[index] = trial;
runningTotal += correction;
}
}
this.results = results;
}
This meets the objective of a biased total of 3,500 with all numbers between 2 and 100, though the probability of a 2 in this scheme is very high and the probability of a 100 is almost non-existent.
And here's a weighted random generator that adds up to a precise total. It uses a cubic weighting scheme to favor the lower numbers (the probability of a number goes down with the cube of the number), and after the random numbers are generated, a correction algorithm applies random corrections to the numbers to make the total come out exactly as specified. The code for a working demo is here: http://jsfiddle.net/jfriend00/g6mds8rr/
function RandResults(low, high, numPicks, total) {
var avg = total / numPicks;
var i, j;
// calculate probabilities for each value
// by trial and error, we found that a cubic weighting
// gives an approximately correct sub-total that can then
// be corrected to the exact total
var numBuckets = high - low + 1;
var item;
var probabilities = [];
for (i = 0; i < numBuckets; i++) {
item = low + i;
probabilities[i] = avg / (item * item * item);
}
// now using those probabilities, create a steps array
var sum = 0;
var steps = probabilities.map(function(item) {
sum += item;
return sum;
});
// now generate a random number and find what
// index it belongs to in the steps array
// and use that as our pick
var runningTotal = 0, rand;
var picks = [], pick, stepsLen = steps.length;
for (i = 0; i < numPicks; i++) {
rand = Math.random() * sum;
for (j = 0; j < stepsLen; j++) {
if (steps[j] >= rand) {
pick = j + low;
picks.push(pick);
runningTotal += pick;
break;
}
}
}
var correction, index, trial; // declare index and trial so they are not implicit globals
// now run our correction algorithm to hit the total exactly
if (runningTotal > total) {
correction = -1;
} else if (runningTotal < total) {
correction = 1;
}
// loop until we've hit the total
// randomly select a value to apply the correction to
while (runningTotal !== total) {
index = Math.floor(Math.random() * numPicks);
trial = picks[index] + correction;
if (trial >= low && trial <= high) {
picks[index] = trial;
runningTotal += correction;
}
}
this.results = picks;
}
RandResults.prototype.getRand = function() {
if (!this.results.length) {
throw new Error("getRand() called, but results are empty");
}
return this.results.pop();
}
RandResults.prototype.getAllRand = function() {
if (!this.results.length) {
throw new Error("getAllRand() called, but results are empty");
}
var r = this.results;
this.results = [];
return r;
}
RandResults.prototype.getRemaining = function() {
return this.results.length;
}
As some comments pointed out, the numbers in the question do not quite make sense, but conceptually there are two approaches: calculate dynamically just in time, or calculate ahead of time.
To calculate just in time:
You can maintain a remaining variable which tracks how many of the 3,500 units are left. Each time you randomly give some units, subtract that number from remaining until it reaches 0.
In addition, to make sure at least 2 units are given each time, you can start with remaining = 1500 and give random + 2 units each time.
To avoid ending up with a balance left over after 1,000 gives, you may need to add some logic that gives units more aggressively towards the last few gives. However, that makes the results less random.
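A minimal sketch of the just-in-time idea, assuming a bonus budget of 1,500 on top of a guaranteed 2 units per give. The name createGiver, its parameters, and the uniform bonus draw are illustrative assumptions; the "give more aggressively near the end" logic is left out.

```javascript
// Sketch: hand out base + random bonus, never exceeding the bonus budget.
function createGiver(bonusBudget, base, maxBonus, rng = Math.random) {
  let remaining = bonusBudget;
  return function give() {
    // Random bonus in 0..maxBonus, capped by what is left in the budget.
    const bonus = Math.min(Math.floor(rng() * (maxBonus + 1)), remaining);
    remaining -= bonus;
    return base + bonus;
  };
}

const give = createGiver(1500, 2, 98);
give(); // between 2 and 100 while budget remains, then always 2
```

A uniform bonus like this one drains the budget within a few dozen gives; a draw skewed towards small bonuses would spread it over more requests.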
To calculate ahead of time:
Generate a random list of 1,000 values in [2, 100] that sum to 3,500, then shuffle the list. Each time you want to give some units, pick the next item in the list. After 1,000 gives, generate another list in the same way. This way you get much better randomized results.
Be aware that both approaches require some kind of shared state that needs to be handled carefully in a multi-threaded environment.
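The ahead-of-time idea could look like the sketch below. The name makeAllocation, the way the surplus is handed out in random chunks, and the optional rng argument are all assumptions; other ways of generating the list are equally valid.

```javascript
// Sketch: build a list of qty values in [low, high] summing to total, then shuffle.
function makeAllocation(qty, low, high, total, rng = Math.random) {
  const list = new Array(qty).fill(low);
  let extra = total - qty * low; // surplus above the guaranteed minimum
  if (extra < 0 || extra > qty * (high - low)) {
    throw new RangeError("total is not reachable with these bounds");
  }
  // Hand out the surplus in random chunks to random slots, respecting the cap.
  while (extra > 0) {
    const i = Math.floor(rng() * qty);
    const wanted = 1 + Math.floor(rng() * (high - low));
    const chunk = Math.min(extra, high - list[i], wanted);
    list[i] += chunk;
    extra -= chunk;
  }
  // Fisher-Yates shuffle so the order of gives is random as well.
  for (let i = list.length - 1; i > 0; i--) {
    const j = Math.floor(rng() * (i + 1));
    [list[i], list[j]] = [list[j], list[i]];
  }
  return list;
}

const allocation = makeAllocation(1000, 2, 100, 3500);
allocation.pop(); // next amount to give
```

With these numbers most entries stay at 2 and a few dozen receive a larger amount, which is one possible reading of "mostly small, occasionally big".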
Hope the ideas help.
I've got a little app that recalculates the apportionment of seats in Congress in each state as the user changes the population hypothetically by moving counties between states. There are functionally infinite combinations, so I need to compute this on the fly.
The method is fairly straightforward: You give each state 1 seat, then assign the remaining 385 iteratively, giving each state a priority of population / sqrt(seats * (seats + 1)) and awarding the next seat to the state with the highest priority.
I've got this working fine the obvious way:
function apportion(states) {
var totalReps = 435;
// assign one seat to each state
states.forEach(function(state) {
state.totalReps = 1;
totalReps -= 1;
state.priority = state.data.population / Math.sqrt(2); //Calculate default quota
});
// sort function
var topPriority = function(a, b) {
return b.priority - a.priority;
};
// assign the remaining 385
for (totalReps; totalReps > 0; totalReps -= 1) {
states.sort(topPriority);
states[0].totalReps += 1;
// recalculate the priority for this state
states[0].priority = states[0].data.population / Math.sqrt(states[0].totalReps * (states[0].totalReps + 1));
}
return states;
}
However, it drags a little when called several times a second. I'm wondering whether there's a better way to place the state that received the seat back in the sorted array other than by resorting the whole array. I don't know a ton about the Javascript sort() function and whether it will already do this with maximal efficiency without being told that all but the first element in the array is already sorted. Is there a more efficient way that I can implement by hand?
jsFiddle here: http://jsfiddle.net/raphaeljs/zoyLb9g6/1/
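The re-insertion idea raised above can be sketched as a binary search over the part of the array that is still sorted. This assumes the array is sorted by descending priority except for its first element; repositionFirst is a hypothetical helper, not something from the fiddle.

```javascript
// Sketch: move states[0] to its correct place in an otherwise sorted array.
function repositionFirst(states) {
  const moved = states.shift();
  let lo = 0;
  let hi = states.length;
  // Find the first index whose priority is not greater than the moved one.
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    if (states[mid].priority > moved.priority) {
      lo = mid + 1;
    } else {
      hi = mid;
    }
  }
  states.splice(lo, 0, moved);
  return states;
}

// Usage idea: sort once before the loop, then call repositionFirst(states)
// after recalculating states[0].priority instead of re-sorting every time.
```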
Using a strategy of avoiding sorts, the following keeps an array of priorities that is aligned with the states object and uses Math.max to find the highest priority value, then indexOf to find its position in the array, then updates the states object and priorities array.
As with all performance optimisations, it has very different results in different browsers (see http://jsperf.com/calc-reps), but is at least no slower (Chrome) and up to 4 times faster (Firefox).
function apportion1(states) {
var totalReps = 435;
var sqrt2 = Math.sqrt(2);
var priorities = [];
var max, idx, state, n;
// assign one seat to each state
states.forEach(function(state) {
state.totalReps = 1;
state.priority = state.data.population / sqrt2; //Calculate default quota
priorities.push(state.priority);
});
totalReps -= states.length;
while (totalReps--) {
max = Math.max.apply(Math, priorities);
idx = priorities.indexOf(max);
state = states[idx];
n = ++state.totalReps;
state.priority = state.data.population / Math.sqrt(n * (n + 1));
priorities[idx] = state.priority;
}
return states;
}
For testing I used an assumed states object with only 5 states, but real population data. Hopefully, with the full 50 states the benefit will be larger.
Another strategy is to sort on population since that's how the priorities are distributed, assign at least one rep to each state and calculate the priority, then run from 0 adding reps and recalculating priorities. There will be a threshold below which a state should not get any more reps.
Over to you. ;-)
Edit
Here's a really simple method that apportions based on population. It may allocate one too many or one too few. In the first case, find the state with the lowest priority and at least 2 reps (and recalc priority if you want) and take a rep away. In the second, find the state with the highest priority and add one rep (and recalc priority if required).
function simple(states) {
var totalPop = 0;
var totalReps = 435
states.forEach(function(state){totalPop += state.data.population});
var popperrep = totalPop/totalReps;
states.forEach(function(state){
state.totalReps = Math.round(state.data.population / popperrep);
state.priority = state.data.population / Math.sqrt(state.totalReps * (state.totalReps + 1));
});
return states;
}
Untested, but I'll bet it's very much faster than the others. ;-)
I've updated the test example for the simple function to adjust if the distribution results in an incorrect total number of reps. Tested across a variety of scenarios, it gives identical results to the original code even though it uses a very different algorithm. It's several hundred times faster than the original with the full 50 states.
Here's the final version of the simple function:
function simple(states) {
var count = 0;
var state, diff;
var totalPop = states.reduce(function(prev, curr){return prev + curr.data.population},0);
var totalReps = 435
var popperrep = totalPop/totalReps;
states.forEach(function(state){
state.totalReps = Math.round(state.data.population / popperrep) || 1;
state.priority = state.data.population / Math.sqrt(state.totalReps * (state.totalReps + 1));
count += state.totalReps;
});
// If too many reps distributed, trim from lowest priority with 2 or more
// If not enough reps distributed, add to highest priority
while ((diff = count - totalReps)) {
state = states[getPriority(diff < 0)];
state.totalReps += diff > 0? -1 : 1;
count += diff > 0? -1 : 1;
state.priority = state.data.population / Math.sqrt(state.totalReps * (state.totalReps + 1));
// console.log('Adjusted ' + state.data.name + ' ' + diff);
}
return states;
// Get lowest priority state with 2 or more reps,
// or highest priority state if high is true
function getPriority(high) {
var idx, p = high? 0 : +Infinity;
states.forEach(function(state, i){
if (( high && state.priority > p) || (!high && state.totalReps > 1 && state.priority < p)) {
p = state.priority;
idx = i;
}
});
return idx;
}
}
I need to come up with an algorithm that does the following:
Let's say you have an array of non-negative numbers (e.g. [1,3,7,0,0,9]) and you know beforehand that their sum is 20.
You want to subtract some average amount from each number such that the new sum is less by 7.
To do so, you must follow these rules:
you can only subtract integers
the resulting array must not have any negative values
you can not make any changes to the indices of the buckets.
The more uniformly the subtraction is distributed over the array the better.
Here is my attempt at an algorithm in JavaScript + underscore (which will probably make it n^2):
function distributeSubtraction(arr, goal){
var sum = _.reduce(arr, function(x, y) { return x + y; }, 0);
if(goal < sum){
while(goal < sum && goal > 0){
var less = ~~(goal / _.filter(arr, _.identity).length); //length of array without 0s
arr = _.map(arr, function(val){
if(less > 0){
return (less < val) ? val - less : val; //not ideal, im skipping some!
} else {
if(goal > 0){ //again not ideal. giving preference to start of array
if(val > 0) {
goal--;
return val - 1;
}
} else {
return val;
}
}
});
if(goal > 0){
var newSum = _.reduce(arr, function(x, y) { return x + y; }, 0);
goal -= sum - newSum;
sum = newSum;
} else {
return arr;
}
}
} else if(goal == sum) {
return _.map(arr, function(){ return 0; });
} else {
return arr;
}
}
var goal = 7;
var arr = [1,3,7,0,0,9];
var newArray = distributeSubtraction(arr, goal);
//returned: [0, 1, 5, 0, 0, 7];
Well, that works but there must be a better way! I imagine the run time of this thing will be terrible with bigger arrays and bigger numbers.
edit: I want to clarify that this question is purely academic. Think of it like an interview question where you whiteboard something and the interviewer asks you how your algorithm would behave on a different type of a dataset.
It sounds like you want to subtract a weighted amount from each number, i.e. you want to subtract X/sum * amount_to_subtract from each item. You would of course need to round the amount you're subtracting. The problem is then making sure that you've subtracted the correct total amount. Also, this depends on your input: are you guaranteeing that the amount you want to subtract can be subtracted? Here's a rough Python implementation (I think):
def uniform_array_reduction(inp, amount):
    total = sum(inp)
    if amount > total:
        raise RuntimeError('Can\'t remove more than there is')
    if amount == total:  # special case
        return [0] * len(inp)
    removed = 0
    output = []
    for i in inp:
        if removed < amount:
            to_remove = int(round(float(i) / float(total) * float(amount)))
            # rounding up can overshoot, so never take more than is still owed
            to_remove = min(to_remove, amount - removed)
            output.append(i - to_remove)
            removed += to_remove
        else:
            output.append(i)
    # if we didn't remove enough, just remove 1 from
    # each element until we've hit our mark.
    # shouldn't require more than one pass
    while removed < amount:
        for i in range(len(output)):
            if output[i] > 0:
                output[i] -= 1
                removed += 1
                if removed == amount:
                    break
    return output
EDIT: I've fixed a few bugs in the code.
s = Sum(x) - required_sum
do:
    a = ceil( s/number_of_non_zeros(x) )
    For i=1 to length(x):
        v = min(a, x[i], s)
        x[i]-=v
        s-=v
while s>0
This version needs no sorting.
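The pseudocode above translates to JavaScript roughly as follows. This is a sketch: subtractEvenly is a hypothetical name, it takes the amount to subtract directly (s in the pseudocode), and the range check and the use of a while loop in place of do/while are additions.

```javascript
// Sketch: subtract `amount` in total, spread as evenly as the values allow.
function subtractEvenly(values, amount) {
  const x = values.slice(); // work on a copy
  let s = amount;           // units still to subtract
  const total = x.reduce((a, b) => a + b, 0);
  if (s < 0 || s > total) {
    throw new RangeError("amount must be between 0 and the array sum");
  }
  while (s > 0) {
    const nonZeros = x.filter(v => v > 0).length;
    const a = Math.ceil(s / nonZeros); // per-element share for this pass
    for (let i = 0; i < x.length; i++) {
      const v = Math.min(a, x[i], s);
      x[i] -= v;
      s -= v;
    }
  }
  return x;
}

console.log(subtractEvenly([1, 3, 7, 0, 0, 9], 7)); // [0, 1, 5, 0, 0, 7]
```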