I'm creating mock data for my app with fakerJS.
I have created one array of 10 000 user IDs. And 1 array of 100 000 content IDs.
I want to distribute these content IDs over the user ID array where the first one will get the most content IDs and the last user will get the least (0).
Eg.
const userIds = ['a', 'b', 'c', ...] // 10_000 long
const contentIds = ['c1', 'c2', 'c3', ...] // 100_000 long
const result = distribute(userIds, contentIds)
result // { a: ['c1', 'c2', 'c3', ...], b: [...], z: [] }
The distribution should look something like this theoretically:
However, the 40 here for highest number is way too low, so imagine this being way higher for the first user ID and with 10_000 users, many of the last users could have 0 content IDs with this distribution basically.
I've been coding for 6+ years, but I think I need to start learning Mathematics to figure this one out, would appreciate any help on how to even start 😅
Maybe use a logarithmic decay and truncate it (you can play with the formula in value_data.push until you find what you want).
Start and End
0: 90
1: 83
2: 79
3: 76
4: 74
5: 72
...
9995: 0
9996: 0
9997: 0
9998: 0
9999: 0
<script>
var time_data = [];
var value_data = [];
const max = 10000;
//logarithmic decay array and truncate
for (let i=1;i<=max;++i) {
time_data.push(i);
value_data.push( Math.floor(100 - ( (9.71672 * (1 + Math.log(i)) ) )) );
}
//initializer
let guarda = 0;
let valueGuarda = 0;
for (let index = 0; index < time_data.length; index++) {
guarda = time_data[index] + guarda;
valueGuarda = value_data[index] + valueGuarda;
}
//arrays
console.log(time_data)
console.log(value_data)
//calculating the summation
let total = value_data.reduce((a, b) => a + b, 0); //100000
console.log("total: " + total);
</script>
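One way to turn these counts into the distribute(userIds, contentIds) function from the question is sketched below. It assumes the same decay formula as above (the constant 9.71672 looks tuned for 10,000 users and would need re-tuning for other sizes) and simply hands out content IDs front to back. The Math.max guard and the slicing strategy are my own assumptions, not part of the original script.

```javascript
// Hypothetical sketch: per-user counts come from the logarithmic decay
// formula used in value_data.push above; content IDs are handed out in order.
function distribute(userIds, contentIds) {
  const result = {};
  let offset = 0;
  userIds.forEach((userId, index) => {
    // Same formula as above with i = index + 1; never go below zero.
    const count = Math.max(
      0,
      Math.floor(100 - 9.71672 * (1 + Math.log(index + 1)))
    );
    // slice() clamps at the end of the array, so late users may get fewer IDs.
    result[userId] = contentIds.slice(offset, offset + count);
    offset += count;
  });
  return result;
}

// Small demo with made-up IDs.
const demoContentIds = Array.from({ length: 1000 }, (_, i) => 'c' + (i + 1));
const demoResult = distribute(['a', 'b', 'c'], demoContentIds);
console.log(demoResult.a.length, demoResult.b.length, demoResult.c.length); // 90 83 79
```

If the counts do not add up to exactly the number of content IDs, a few IDs are left over (or the last users get fewer); whether that matters depends on the mock data you need.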
I'm trying to devise a (good) way to choose a random number from a range of possible numbers where each number in the range is given a weight. To put it simply: given the range of numbers (0,1,2) choose a number where 0 has an 80% probability of being selected, 1 has a 10% chance and 2 has a 10% chance.
It's been about 8 years since my college stats class, so you can imagine the proper formula for this escapes me at the moment.
Here's the 'cheap and dirty' method that I came up with. This solution uses ColdFusion. Yours may use whatever language you'd like. I'm a programmer, I think I can handle porting it. Ultimately my solution needs to be in Groovy - I wrote this one in ColdFusion because it's easy to quickly write/test in CF.
public function weightedRandom( Struct options ) {
var tempArr = [];
for( var o in arguments.options )
{
var weight = arguments.options[ o ] * 10;
for ( var i = 1; i<= weight; i++ )
{
arrayAppend( tempArr, o );
}
}
return tempArr[ randRange( 1, arrayLen( tempArr ) ) ];
}
// test it
opts = { 0=.8, 1=.1, 2=.1 };
for( x = 1; x<=10; x++ )
{
writeDump( weightedRandom( opts ) );
}
I'm looking for better solutions, please suggest improvements or alternatives.
A lookup table (as in your solution) is the first thing that comes to mind: build a table whose elements are repeated in proportion to their weights, then pick a random location in the table and return its value. As an implementation choice, I would write a higher-order function that takes a spec and returns a function that draws values according to the distribution in the spec; this way you avoid rebuilding the table on each call. The downsides are that building the table takes time linear in the number of table entries, and that memory usage can be large for big specs or for specs with very small or precise weights, e.g. {0:0.99999, 1:0.00001}. The upside is that picking a value takes constant time, which might be desirable if performance is critical. In JavaScript:
function weightedRand(spec) {
var i, j, table=[];
for (i in spec) {
// The constant 10 below should be computed based on the
// weights in the spec for a correct and optimal table size.
// E.g. the spec {0:0.999, 1:0.001} will break this impl.
for (j=0; j<spec[i]*10; j++) {
table.push(i);
}
}
return function() {
return table[Math.floor(Math.random() * table.length)];
}
}
var rand012 = weightedRand({0:0.8, 1:0.1, 2:0.1});
rand012(); // random in distribution...
Another strategy is to pick a random number in [0,1) and iterate over the weight specification, summing the weights; as soon as the random number is less than the running sum, return the associated value. Of course, this assumes that the weights sum to one. This solution has no up-front cost, but its average running time is linear in the number of entries in the spec. For example, in JavaScript:
function weightedRand2(spec) {
var i, sum=0, r=Math.random();
for (i in spec) {
sum += spec[i];
if (r <= sum) return i;
}
}
weightedRand2({0:0.8, 1:0.1, 2:0.1}); // random in distribution...
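The two strategies can also be combined: precompute the cumulative weights once, then binary-search them on each draw. This is a different technique from either snippet above, sketched here only to illustrate the trade-off. makeWeightedPicker and its injectable random parameter are hypothetical names of mine.

```javascript
// Hypothetical helper: linear set-up cost, logarithmic cost per draw.
function makeWeightedPicker(spec, random = Math.random) {
  const keys = Object.keys(spec);
  const cumulative = [];
  let total = 0;
  for (const key of keys) {
    total += spec[key];
    cumulative.push(total);
  }
  return function pick() {
    // Scaling by total means the weights do not have to sum to one.
    const r = random() * total;
    let lo = 0;
    let hi = cumulative.length - 1;
    // Find the first index whose cumulative weight is greater than r.
    while (lo < hi) {
      const mid = (lo + hi) >> 1;
      if (r < cumulative[mid]) hi = mid;
      else lo = mid + 1;
    }
    return keys[lo];
  };
}

const pickPrize = makeWeightedPicker({ a: 8, b: 1, c: 1 });
console.log(pickPrize()); // "a" about 80% of the time
```

Like the for...in versions above, this returns the key as a string.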
Generate a random number R between 0 and 1.
If R in [0, 0.1) -> 1
If R in [0.1, 0.2) -> 2
If R in [0.2, 1] -> 3
If you can't directly get a number between 0 and 1, generate a number in a range that will produce as much precision as you want. For example, if you have the weights for
(1, 83.7%) and (2, 16.3%), roll a number from 1 to 1000. 1-837 is a 1. 838-1000 is 2.
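A minimal sketch of the integer-roll idea, assuming the weights are expressed as whole numbers (here tenths of a percent). pickByRoll is a hypothetical helper name, not something from the answer above.

```javascript
// entries: array of [value, weight] pairs whose weights are whole numbers.
// roll: an integer from 1 to the sum of all weights.
function pickByRoll(entries, roll) {
  let upper = 0;
  for (const [value, weight] of entries) {
    upper += weight;
    if (roll <= upper) return value;
  }
  throw new RangeError('roll is larger than the total weight');
}

// 83.7% -> 837 and 16.3% -> 163, so rolls 1-837 give 1 and 838-1000 give 2.
const rollEntries = [[1, 837], [2, 163]];
const sampleRoll = 1 + Math.floor(Math.random() * 1000); // integer in 1..1000
console.log(pickByRoll(rollEntries, sampleRoll));
```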
I use the following
function weightedRandom(min, max) {
return Math.round(max / (Math.random() * max + min));
}
This is my go-to "weighted" random, where I use an inverse function of "x" (where x is a random between min and max) to generate a weighted result, where the minimum is the most heavy element, and the maximum the lightest (least chances of getting the result)
So basically, using weightedRandom(1, 5) means the chances of getting a 1 are higher than a 2 which are higher than a 3, which are higher than a 4, which are higher than a 5.
Might not be useful for your use case but probably useful for people googling this same question.
A run of 100 iterations gave me:
==================
| Result | Times |
==================
| 1 | 55 |
| 2 | 28 |
| 3 | 8 |
| 4 | 7 |
| 5 | 2 |
==================
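A table like this can be reproduced with a small tally loop. The sketch below repeats the weightedRandom function from above so that it runs on its own; tally is a hypothetical helper, and the exact counts will differ on every run.

```javascript
function weightedRandom(min, max) {
  return Math.round(max / (Math.random() * max + min));
}

// Count how often each result shows up over a number of iterations.
function tally(min, max, iterations) {
  const counts = {};
  for (let i = 0; i < iterations; i++) {
    const value = weightedRandom(min, max);
    counts[value] = (counts[value] || 0) + 1;
  }
  return counts;
}

console.log(tally(1, 5, 100));
```

For weightedRandom(1, 5) the result is 1 whenever 5 / (5r + 1) < 1.5, that is r > 7/15, so 1 should come up roughly 53% of the time, which is in line with the 55 in the table.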
Here are 3 solutions in JavaScript, since I'm not sure which language you want it in. Depending on your needs one of the first two might work, but the third one is probably the easiest to implement with large sets of numbers.
function randomSimple(){
return [0,0,0,0,0,0,0,0,1,2][Math.floor(Math.random()*10)];
}
function randomCase(){
var n=Math.floor(Math.random()*100)
switch(true){ // each case is a boolean test, so switch on true rather than on n
case n<80:
return 0;
case n<90:
return 1;
case n<100:
return 2;
}
}
function randomLoop(weight,num){
var n=Math.floor(Math.random()*100),amt=0;
for(var i=0;i<weight.length;i++){
//amt+=weight[i]; *alternative method
//if(n<amt){
if(n<weight[i]){
return num[i];
}
}
}
weight=[80,90,100];
//weight=[80,10,10]; *alternative method
num=[0,1,2]
8 years late but here's my solution in 4 lines.
Prepare an array of probability mass function such that
pmf[array_index] = P(X=array_index):
var pmf = [0.8, 0.1, 0.1]
Prepare an array for the corresponding cumulative distribution function such that
cdf[array_index] = P(X<=array_index):
var cdf = pmf.map((sum => value => sum += value)(0))
// [0.8, 0.9, 1]
3a) Generate a random number.
3b) Get an array of the cdf elements that are less than or equal to this number.
3c) Return its length.
var r = Math.random()
cdf.filter(el => r >= el).length
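Put together, the steps above might look like the following function. sampleIndex is a hypothetical name, and the optional r parameter is my own addition so that the result can be checked with fixed inputs.

```javascript
// Returns an index drawn according to pmf; r defaults to a fresh random number.
function sampleIndex(pmf, r = Math.random()) {
  let sum = 0;
  const cdf = pmf.map((p) => (sum += p)); // running totals, e.g. [0.8, 0.9, 1]
  // The number of cdf entries at or below r is the chosen index.
  return cdf.filter((el) => r >= el).length;
}

console.log(sampleIndex([0.8, 0.1, 0.1], 0.5));  // 0
console.log(sampleIndex([0.8, 0.1, 0.1], 0.85)); // 1
console.log(sampleIndex([0.8, 0.1, 0.1], 0.95)); // 2
```

If floating-point rounding leaves the last cumulative value slightly below 1, a draw very close to 1 could return pmf.length; clamping the result with Math.min(pmf.length - 1, ...) is one way to guard against that.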
This is more or less a generalized version of what @trinithis wrote, in Java. I did it with ints rather than floats to avoid messy rounding errors.
static class Weighting {
int value;
int weighting;
public Weighting(int v, int w) {
this.value = v;
this.weighting = w;
}
}
public static int weightedRandom(List<Weighting> weightingOptions) {
//determine sum of all weightings
int total = 0;
for (Weighting w : weightingOptions) {
total += w.weighting;
}
//select a random value between 0 and our total
int random = new Random().nextInt(total);
//loop thru our weightings until we arrive at the correct one
int current = 0;
for (Weighting w : weightingOptions) {
current += w.weighting;
if (random < current)
return w.value;
}
//shouldn't happen.
return -1;
}
public static void main(String[] args) {
List<Weighting> weightings = new ArrayList<Weighting>();
weightings.add(new Weighting(0, 8));
weightings.add(new Weighting(1, 1));
weightings.add(new Weighting(2, 1));
for (int i = 0; i < 100; i++) {
System.out.println(weightedRandom(weightings));
}
}
How about
int [ ] numbers = { 0 , 0 , 0 , 0 , 0 , 0 , 0 , 0 , 1 , 2 } ;
then you can randomly select from numbers and 0 will have an 80% chance, 1 10%, and 2 10%
This one is in Mathematica, but it's easy to copy to another language, I use it in my games and it can handle decimal weights:
weights = {0.5, 1, 2}; (* The weights *)
weights = N@weights/Total@weights; (* Normalize weights so that the list's sum is always 1. *)
min = 0; (* First min value should be 0 *)
max = weights[[1]]; (* First max value should be the first element of the newly created weights list. Note that in Mathematica the first element has index of 1, not 0. *)
random = RandomReal[]; (* Generate a random float from 0 to 1 *)
For[i = 1, i <= Length@weights, i++,
If[random >= min && random < max,
Print["Chosen index number: " <> ToString@i]
];
min += weights[[i]];
If[i == Length@weights,
max = 1,
max += weights[[i + 1]]
]
]
(From here on, indexes are zero-based.) The idea is that, with a normalized list weights, index n is returned with probability weights[n], so the distance between min and max at step n should be weights[n]. The total distance from the smallest min (which we set to 0) to the largest max is the sum of the list weights.
The good thing about this is that you don't append to any array or nest for loops, which substantially reduces execution time.
Here is the same idea in C#, without needing to normalize the weights list and with some code removed:
int WeightedRandom(List<float> weights) {
float total = 0f;
foreach (float weight in weights) {
total += weight;
}
float max = weights [0],
random = Random.Range(0f, total);
for (int index = 0; index < weights.Count; index++) {
if (random < max) {
return index;
} else if (index == weights.Count - 1) {
return weights.Count-1;
}
max += weights[index+1];
}
return -1;
}
I suggest to use a continuous check of the probability and the rest of the random number.
This function sets first the return value to the last possible index and iterates until the rest of the random value is smaller than the actual probability.
The probabilities have to sum to one.
function getRandomIndexByProbability(probabilities) {
var r = Math.random(),
index = probabilities.length - 1;
probabilities.some(function (probability, i) {
if (r < probability) {
index = i;
return true;
}
r -= probability;
});
return index;
}
var i,
probabilities = [0.8, 0.1, 0.1],
count = probabilities.map(function () { return 0; });
for (i = 0; i < 1e6; i++) {
count[getRandomIndexByProbability(probabilities)]++;
}
console.log(count);
Thanks all, this was a helpful thread. I encapsulated it into a convenience function (Typescript). Tests below (sinon, jest). Could definitely be a bit tighter, but hopefully it's readable.
export type WeightedOptions = {
[option: string]: number;
};
// Pass in an object like { a: 10, b: 4, c: 400 } and it'll return either "a", "b", or "c", factoring in their respective
// weight. So in this example, "c" is likely to be returned 400 times out of 414
export const getRandomWeightedValue = (options: WeightedOptions) => {
const keys = Object.keys(options);
const totalSum = keys.reduce((acc, item) => acc + options[item], 0);
let runningTotal = 0;
const cumulativeValues = keys.map((key) => {
const relativeValue = options[key]/totalSum;
const cv = {
key,
value: relativeValue + runningTotal
};
runningTotal += relativeValue;
return cv;
});
const r = Math.random();
return cumulativeValues.find(({ key, value }) => r <= value)!.key;
};
Tests:
describe('getRandomWeightedValue', () => {
// Out of 1, the relative and cumulative values for these are:
// a: 0.1666 -> 0.16666
// b: 0.3333 -> 0.5
// c: 0.5 -> 1
const values = { a: 10, b: 20, c: 30 };
it('returns appropriate values for particular random value', () => {
// any random number under 0.166666 should return "a"
const stub1 = sinon.stub(Math, 'random').returns(0);
const result1 = randomUtils.getRandomWeightedValue(values);
expect(result1).toEqual('a');
stub1.restore();
const stub2 = sinon.stub(Math, 'random').returns(0.1666);
const result2 = randomUtils.getRandomWeightedValue(values);
expect(result2).toEqual('a');
stub2.restore();
// any random number between 0.166666 and 0.5 should return "b"
const stub3 = sinon.stub(Math, 'random').returns(0.17);
const result3 = randomUtils.getRandomWeightedValue(values);
expect(result3).toEqual('b');
stub3.restore();
const stub4 = sinon.stub(Math, 'random').returns(0.3333);
const result4 = randomUtils.getRandomWeightedValue(values);
expect(result4).toEqual('b');
stub4.restore();
const stub5 = sinon.stub(Math, 'random').returns(0.5);
const result5 = randomUtils.getRandomWeightedValue(values);
expect(result5).toEqual('b');
stub5.restore();
// any random number above 0.5 should return "c"
const stub6 = sinon.stub(Math, 'random').returns(0.500001);
const result6 = randomUtils.getRandomWeightedValue(values);
expect(result6).toEqual('c');
stub6.restore();
const stub7 = sinon.stub(Math, 'random').returns(1);
const result7 = randomUtils.getRandomWeightedValue(values);
expect(result7).toEqual('c');
stub7.restore();
});
});
Shortest solution in modern JavaScript
Note: all weights need to be integers
function weightedRandom(items){
let table = Object.entries(items)
.flatMap(([item, weight]) => Array(weight).fill(item)) // repeat each key "weight" times
return table[Math.floor(Math.random() * table.length)]
}
const key = weightedRandom({
"key1": 1,
"key2": 4,
"key3": 8
}) // returns e.g. "key1"
Here are the inputs and their ratios: 0 (80%), 1 (10%), 2 (10%).
Let's draw them out so they're easy to visualize.
0 1 2
-------------------------------------________+++++++++
Let's add up the total weight and call it TR, for total ratio; in this case, 100.
Let's pick a random number between 0 and TR (0 to 100 in this case) and call it RN, for random number.
So now we have TR as the total weight and RN as a random number between 0 and TR.
Let's imagine we picked 21, which is effectively 21%.
We must match this to one of our input numbers, but how?
Let's loop over each weight (80, 10, 10) and keep a running sum of the weights visited so far.
The moment the running sum is greater than the random number RN (21 in this case), we stop the loop and return that element's position.
double sum = 0;
int position = -1;
for(double weight : weights){ // weights holds {80, 10, 10}
position ++;
sum = sum + weight;
if(sum > 21) //(80 > 21) so break on first pass
break;
}
//position will be 0 so we return array[0]--> 0
Let's say the random number (between 0 and 100) is 83. Let's do it again:
double sum = 0;
int position = -1;
for(double weight : weights){ // weights holds {80, 10, 10}
position ++;
sum = sum + weight;
if(sum > 83) //(90 > 83) so break
break;
}
//we did two passes in the loop so position is 1 so we return array[1]---> 1
I have a slotmachine and I used the code below to generate random numbers. In probabilitiesSlotMachine the keys are the output in the slotmachine, and the values represent the weight.
const probabilitiesSlotMachine = [{0 : 1000}, {1 : 100}, {2 : 50}, {3 : 30}, {4 : 20}, {5 : 10}, {6 : 5}, {7 : 4}, {8 : 2}, {9 : 1}]
var allSlotMachineResults = []
probabilitiesSlotMachine.forEach(function(obj, index){
for (var key in obj){
for (var loop = 0; loop < obj[key]; loop ++){
allSlotMachineResults.push(key)
}
}
});
Now to generate a random output, I use this code:
const random = allSlotMachineResults[Math.floor(Math.random() * allSlotMachineResults.length)]
Enjoy the O(1) (constant time) solution for your problem.
If the input array is small, it can be easily implemented.
const number = Math.floor(Math.random() * 100); // Generate a random integer from 0 to 99
let element;
if (number >= 0 && number <= 79) {
/*
In the range of 0 to 99, every number has equal probability
of occurring. Therefore, if you gather 80 numbers (0 to 79) and
make a "sub-group" of them, then their probabilities will get added.
Hence, what you get is an 80% chance that the number will fall in this
range.
So, quite naturally, there is 80% probability that this code will run.
Now, manually choose / assign element of your array to this variable.
*/
element = 0;
}
else if (number >= 80 && number <= 89) {
// 10% chance that this code runs.
element = 1;
}
else if (number >= 90 && number <= 99) {
// 10% chance that this code runs.
element = 2;
}
The snippet below pretty much says it all, but in short I need to distribute activities evenly over a given number of months. Since there may be a remainder, any leftover activities should be added to the first month.
const selectedMonth = 5
const project = {
duration: 2, // in months
activities: [{
number: 1,
title: 'game 1'
},
{
number: 2,
title: 'game 2'
},
{
number: 3,
title: 'game 3'
},
]
}
// 1 Add a "plannedInMonth" property to each activity
// 2 Start planning from the selected month and onwards (the month number can be > 12)
// 3 Spread the activities evenly based on the duration
function planActivitiesInMonths() {}
planActivitiesInMonths()
// So this function should, since the remainder of 3 / 2 = 1, return as follows:
activities: [{
number: 1,
title: 'game 1',
plannedInMonth: 5
},
{
number: 2,
title: 'game 2',
plannedInMonth: 5
},
{
number: 3,
title: 'game 3',
plannedInMonth: 6
},
]
// However, it should also work when e.g. 24 activities need to be distributed across 5 months
If you're just looking to copy paste an implementation of the algorithm, this should do it:
function planActivitiesInMonths(project, selectedMonth) {
const remainder = project.activities.length % project.duration
const activitesPerMonth = Math.floor(project.activities.length / project.duration)
return project.activities.map((activity, i) => {
let index = Math.floor((i - remainder) / activitesPerMonth)
if (index < 0) {
index = 0
}
activity.plannedInMonth = index + selectedMonth
return activity
})
}
Just keep in mind that my function returns the activities array, but it also sets plannedInMonth directly on the original activity objects.
I am shifting the index by the remainder to handle the requirement that the remainder activities are added to the first month, but there are tons of ways of implementing this algorithm.
However, this algorithm behaves strangely if the number of activities is slightly below a multiple of the project duration. In that case the remainder is large and a lot of activities are added to the first month.
For example, if you want to distribute 9 activities across 5 months, the remainder would be 9 % 5 = 4, so the first month would have a total of 5 activities!
Maybe it's better to distribute the remainder evenly too. That algorithm also has a cleaner and simpler implementation:
function planActivitiesInMonths(project, selectedMonth) {
const count = project.activities.length
return project.activities.map((activity, i) => {
// i * duration / count maps activity i onto months 0 .. duration - 1
const index = Math.floor(i * project.duration / count)
activity.plannedInMonth = index + selectedMonth
return activity
})
}
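To see how an even spread behaves for the cases mentioned in this thread (3 activities over 2 months, 24 over 5, 9 over 5), here is a small self-contained sketch. spreadEvenly and countPerMonth are hypothetical helpers that only compute month numbers; the i * duration / count mapping is an assumption and just one of several reasonable ways to spread the remainder.

```javascript
// Month number for each of activityCount activities, starting at selectedMonth.
function spreadEvenly(activityCount, duration, selectedMonth) {
  return Array.from({ length: activityCount }, (_, i) =>
    selectedMonth + Math.floor((i * duration) / activityCount));
}

// How many activities landed in each month.
function countPerMonth(months) {
  const counts = {};
  for (const month of months) counts[month] = (counts[month] || 0) + 1;
  return counts;
}

console.log(spreadEvenly(3, 2, 5)); // [5, 5, 6]
// 24 activities over 5 months: months 5 to 8 get 5 each, month 9 gets 4.
console.log(countPerMonth(spreadEvenly(24, 5, 5)));
// 9 activities over 5 months: months 5 to 8 get 2 each, month 9 gets 1.
console.log(countPerMonth(spreadEvenly(9, 5, 5)));
```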
I want to give the user a prize when they sign in,
but some prizes need to be rare, so I want prizes to appear with different chances, given as percentages.
I want to display one of these:
[50 : 'flower'], [30 : 'book'], [20 : 'mobile'];
according to their percentages.
If there is any way to do this using Node.js or plain JavaScript functions, that would be great.
You can create a function to get weighted random results, something like this:
const weightedSample = (items) => {
// cache if necessary; in Chrome, seems to make little difference
const total = Object.values(items).reduce((sum, weight) => sum + weight, 0)
const rnd = Math.random() * total
let accumulator = 0
for (const [item, weight] of Object.entries(items)) {
accumulator += weight
if (rnd < accumulator) {
return item
}
}
}
// check frequencies of each result
const prizes = { flower: 50, book: 30, mobile: 20 }
const results = Object.fromEntries(Object.keys(prizes).map(k => [k, 0]))
for (let i = 0; i < 1e6; ++i) {
const prize = weightedSample(prizes)
++results[prize]
}
// sample results: { flower: 500287, book: 299478, mobile: 200235 }
console.log(results)
This will work regardless of whether the weights add up to 100, whether they're integers, and so on.
'Right off the top of my head'-approach would be to prepare an array where each source item occurs the number of times that corresponds to respective probability and pick random item out of that array (assuming probability value has no more than 2 decimal places):
// main function
const getPseudoRandom = items => {
const {random} = Math,
commonMultiplier = 100,
itemBox = []
for(const item in items){
for(let i = 0; i < items[item]*commonMultiplier; i++){
const randomPosition = 0|random()*itemBox.length
itemBox.splice(randomPosition, 0, item)
}
}
return itemBox[0|random()*itemBox.length]
}
// test of random outcomes distribution
const outcomes = Array(1000)
.fill()
.map(_ => getPseudoRandom({'flower': 0.5, 'book': 0.3, 'mobile': 0.2})),
distribution = outcomes.reduce((acc, item, _, s) =>
(acc[item] = (acc[item]||0)+100/s.length, acc), {})
console.log(distribution)
While above approach may seem easy to comprehend and deploy, you may consider another one - build up the sort of probability ranges of respective width and have your random value falling into one of those - the wider the range, the greater probability:
const items = {'flower': 0.5, 'book': 0.2, 'mobile': 0.2, '1mUSD': 0.1},
// main function
getPseudoRandom = items => {
let totalWeight = 0,
ranges = [],
rnd = Math.random()
for(const itemName in items){
ranges.push({
itemName,
max: totalWeight += items[itemName]
})
}
return ranges
.find(({max}) => max > rnd*totalWeight)
.itemName
},
// test of random outcomes distribution
outcomes = Array(1000)
.fill()
.map(_ => getPseudoRandom(items)),
distribution = outcomes.reduce((acc, item, _, s) =>
(acc[item] = (acc[item]||0)+100/s.length, acc), {})
console.log(distribution)
"Certain probability" and "random" could lead to different approaches!
If you want random each time, something like:
let chances = [[0.2,'mobile'],[0.5,'book'],[1.0,'flower']]
let val = Math.random() // floating number from 0 to 1.0
let result = chances.find( c => val < c[0] )[1] // first cumulative threshold above val
This will give a random result each time. It is possible to get 'mobile' 100 times in a row! Rare, of course, but a good random number generator will let that happen.
But perhaps you want to ensure that, in 100 results, you only hand out 20 mobiles, 30 books, and 50 flowers. Then you might want a "random array" for each user. Pre-fill all the slots and remove them as they are used. Something like:
// when setting up a new user
let userArray = []
let chances = [[20,'mobile'],[30,'book'],[50,'flower']]
chances.forEach( c => {
for(let i = 0; i < c[0]; i++) userArray.push(c[1])
})
// save userArray, which has exactly 100 values
// then, when picking a random value for a user, find an index in the current length
let index = Math.floor(Math.random() * userArray.length)
let result = userArray[index]
userArray.splice(index,1) // modify and save userArray for next login
if(userArray.length === 0) reinitializeUserArray()
There are different approaches to this, but just some ideas to get you started.
I have the following array with two objects:
var myArr = [{
id: 3,
licences: 100,
new_value_pr_licence: 40
}, {
id: 4,
licences: 200,
new_value_pr_licence: 25
}]
A user wishes to buy 150 licences. This puts them in the 100 category, because they are above 100 licences but below 200, which means they pay $40 per licence.
Note that the values in the array objects vary.
Order your plans by the price per licence:
myArr.sort(function (a, b) {
return a.new_value_pr_licence - b.new_value_pr_licence;
})
then starting from the start of the array, take as many of that plan as you can without going over the number the user wants to buy:
var numUserWants = 150;
var purchases = {};
var cheapestAvailableProduct = myArr.shift();
while (numUserWants > 0 && cheapestAvailableProduct) {
if (numUserWants >= cheapestAvailableProduct.licences) {
purchases[cheapestAvailableProduct.id] = Math.floor(numUserWants / cheapestAvailableProduct.licences);
numUserWants = numUserWants % cheapestAvailableProduct.licences;
}
cheapestAvailableProduct = myArr.shift();
}
At this point, purchases will now be a map of plan id to number:
purchases => {
3: 3
4: 1
}
This doesn't handle the case where over-purchasing is the cheapest option (e.g. it's cheaper to buy 160 at 4x40, instead of 150 at 3x40 + 1x25 + 1x5), but it's probably a good starting point for you to tweak.
Just a simple forEach here. Take the number requested and recalculate the total for each option limit it reaches; once the number requested is less than an option limit, the total is no longer changed and is returned from the function.
function calculateDiscountedTotal(numberRequested, myArr){
var total;
// loop, compare, calculate
myArr.forEach(function(option) {
if(numberRequested >= option.licences){
total = numberRequested * option.new_value_pr_licence
}
});
if(total != undefined){
return total;
} else {
// user never had enough for initial discount
return "no discount price";
}
}
Sort the array first in terms of number of licenses and then get the object in which number of licenses is less than number of licenses to be bought (just less than the next item in the array which is greater than number of licenses to be bought)
var myArr = [
{
id: 3,
licences: 100,
new_value_pr_licence: 40,
},
{
id: 4,
licences: 200,
new_value_pr_licence: 25
},
];
var numOfLic = 150;
myArr.sort( function(a,b){ return a.licences - b.licences } );
var selectedObj = myArr.reduce( function(prev,current){
// keep the last tier whose threshold does not exceed the requested amount
if ( current.licences > numOfLic )
{
return prev;
}
return current;
});
console.log ( "pricing should be " + ( selectedObj.new_value_pr_licence * numOfLic ) );