Find numbers that have specified data about people - javascript

There is an array of binary numbers whose bits are laid out in this order:
the first bit means whether the person is on vacation: 1/0
the second set of 8 bits means the worker's age: maximum 255 (11111111)
the third set of 4 bits means how many vacation days are left: maximum 15 (1111)
How to find:
all workers on vacation, aged 20-30 inclusive
all workers not on vacation, with 10 or more vacation days left
An array of worker data:
let workers = [
  0b1000101001001,
  0b1000101111011,
  0b1000111101011,
  0b0000101101010,
  0b0000111011111,
  0b0000110011110,
  0b1001000011001,
  0b1001000011001,
  0b0000101101000,
  0b0000101100100,
];

To know whether the number i encodes "on vacation" or not, evaluate:
i >= 0x1000
If this is true, that worker is on vacation
To know the age that the number i encodes, evaluate:
(i >> 4) & 0xFF
To know the number of vacation days that is encoded, evaluate:
i & 0xF
(the vacation-days field is only 4 bits wide, so the mask is 0xF, not 0xFF)
The rest should be simple to do.
all workers on vacation, aged 20-30 inclusive
let workers = [
  0b1000101001001, 0b1000101111011, 0b1000111101011, 0b0000101101010,
  0b0000111011111, 0b0000110011110, 0b1001000011001, 0b1001000011001,
  0b0000101101000, 0b0000101100100,
];

let result = workers.filter(i => {
  let onVac = i >= 0x1000;
  let age = (i >> 4) & 0xFF;
  return onVac && age >= 20 && age <= 30;
});

console.log(result.map(num => num.toString(2)));
all workers not on vacation, with 10 or more vacation days left
let workers = [
  0b1000101001001, 0b1000101111011, 0b1000111101011, 0b0000101101010,
  0b0000111011111, 0b0000110011110, 0b1001000011001, 0b1001000011001,
  0b0000101101000, 0b0000101100100,
];

let result = workers.filter(i => {
  let onVac = i >= 0x1000;
  let numVac = i & 0xF; // 4-bit field, so mask with 0xF (not 0xFF)
  return !onVac && numVac >= 10;
});

console.log(result.map(num => num.toString(2)));
Sometimes you can filter with fewer operations, but this at least shows clearly the final expression that is evaluated in the filter callback.
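If you want the decoded fields themselves rather than just a filtered list, the three expressions can be bundled into a small helper. A minimal sketch; the function and field names here are my own, not part of the question:

function decodeWorker(i) {
  return {
    onVacation: i >= 0x1000,    // bit 12
    age: (i >> 4) & 0xFF,       // bits 4-11
    vacationDaysLeft: i & 0xF,  // bits 0-3
  };
}

console.log(decodeWorker(0b1000101001001));
// { onVacation: true, age: 20, vacationDaysLeft: 9 }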
Other patterns
In general, to extract a part of a bitpattern, you first shift the input and then AND it. Here the part to extract is indicated with ^:
input = 0b1111111111111111111111111
                    ^^^^^^^^
                    ←  8   →←  7  →
First count the number of bits (digits) that are at the right of the section you need. In this case there are 7 bits to the right of the ^-marked section. This determines how many times you need to shift with the >> operator.
Then you count the number of bits that are in the section of interest. In this case the ^ section has a width of 8 bits. Now make a number that has 8 bits, and all of them 1-bits: 0b11111111. You can of course choose to write it in hexadecimal or in decimal, ... it doesn't matter, as long as the value is the same. This number will be the number you perform the AND (&) operation with.
So in the above example, you combine as follows:
(input >> 7) & 0b11111111
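Wrapped up as a reusable helper, the pattern looks like this (a sketch; the name extractBits and its signature are my own):

// Shift right past the bits below the field, then mask off the field width.
function extractBits(input, bitsToTheRight, width) {
  const mask = (1 << width) - 1; // `width` one-bits, e.g. 0b11111111 for width 8
  return (input >> bitsToTheRight) & mask;
}

console.log(extractBits(0b101011001100110, 7, 8).toString(2)); // "10101100"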
There are of course a few cases where you can shorten this expression:
Extract left most bits
If the section you need, is the left-most section in the bit pattern, then the AND-operation is not needed:
input = 0b1111111111111111111111111
          ^^^^^^^^
          ←  8   →←       17      →
Here both of the following give the same result:
(input >> 17) & 0b11111111
input >> 17
Extract right most bits
If the section you need, is the right-most section in the bit pattern, then the AND-operation is not needed:
input = 0b1111111111111111111111111
                           ^^^^^^^^
                           ←  8   →
Here both of the following give the same result:
(input >> 0) & 0b11111111
input & 0b11111111
Extract left most bit -- just one bit
If the section you need, is the left-most single bit in the bit pattern, then you can also use a greater-or-equal comparison to return a boolean:
input = 0b1111111111111111111111111
          ^
           ←          24          →
You can tell whether this bit is set with a comparison. Here, all of the following give a usable result:
(input >> 24) & 0b1
input >> 24
input >= 0b1000000000000000000000000
...but with one difference: the third evaluates to a boolean, while the other two evaluate to a 0 or a 1.
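All three variants, runnable on the 25-bit example:

const input = 0b1111111111111111111111111; // 25 one-bits

console.log((input >> 24) & 0b1); // 1
console.log(input >> 24);         // 1
console.log(input >= 0b1000000000000000000000000); // true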

Decrypt:
function decrypt(bin, encriptDigitsArray) {
  var result = [];
  // Consume the bit string from the left, one field width at a time.
  while (bin != "" && encriptDigitsArray.length) {
    result.push(parseInt(bin.slice(0, encriptDigitsArray[0]), 2));
    bin = bin.slice(encriptDigitsArray[0]);
    encriptDigitsArray.shift();
  }
  return result;
}
console.log(decrypt("0001111011011",[8,4,1]));
console.log(decrypt("000111101101",[8,4,1]));
console.log(decrypt("00011110",[8,4,1]));
I've answered your other question here as well.


Bitwise OR ( | ) and Bitwise Operator ( ~ ) - Javascript

The challenge: Given integers N and K (where 2 <= K <= N), find the highest value of A & B (i.e. A bitwise-AND B) such that 0 <= A < B <= N and (A & B) < K.
I've written my code and it runs perfectly with the visible cases, but it fails with the hidden cases.
My Code:
function bitwiseAnd(N, K) {
  // Write your code here
  let arr = []
  N = parseInt(N, 10)
  K = parseInt(K, 10)
  for (let a = 1; a < N; a++) {
    for (let b = a + 1; b <= N; b++) {
      if (K >= 0 && parseInt(a & b, 10) < K && parseInt(a & b, 10) >= 0) arr.push(parseInt(a & b, 10))
    }
  }
  return Math.max(...arr)
}
I searched for a solution and found this one.
Solution:
function bitwiseAnd(N, K) {
  // Write your code here
  var n = parseInt(N);
  var k = parseInt(K);
  var a = k - 1;
  var b = (~a) & -(~a);
  if ((a | b) > n)
    return (a - 1);
  else
    return (a);
}
However, the solution's code passes all the hidden tests. I cannot understand how it works, as I'm not familiar with bitwise operators.
Is there any chance of getting an explanation of these operators and how to use them?
Thanks in advance.
Here is a variant of your code that works:
function bitwiseAnd(N, K) {
  let r = 0;
  for (let a = 1; a < N; a++) {
    for (let b = a + 1; b <= N; b++) {
      if ((a & b) < K) r = Math.max(r, a & b)
    }
  }
  return r
}
console.log(bitwiseAnd(5,2)) // 1
console.log(bitwiseAnd(8,5)) // 4
console.log(bitwiseAnd(2,2)) // 0
I've removed unnecessary parseInts and result checks. However, the only thing that was actually wrong with your code was the line Math.max(...arr).
The problem with that line is that it is not passing a reference to arr to the Math.max method. Instead, it is actually including all members of arr as arguments which will appear on the Javascript call stack. Normally, this would be fine. However, if you have hundreds of thousands of values in the array, the maximum call stack size will be exceeded and your code will fail to execute. This is why a hidden test case, which must have resulted in huge numbers of array members, caused hackerrank to reject your solution.
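You can reproduce the failure directly; a sketch (the exact threshold and error message are engine-dependent):

const huge = new Array(10000000).fill(1);

// Math.max(...huge); // RangeError in most engines: every element becomes a call argument

// A reduce-based maximum uses constant argument space regardless of array length:
console.log(huge.reduce((m, v) => Math.max(m, v), -Infinity)); // 1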
You've also posted a very interesting solution which does not use loops at all.
I've cleaned it up a little, and included it below:
function bitwiseAnd(n, k) {
  let a = k - 1
  return (a | ~a & a + 1) <= n ? a : a - 1
}
console.log(bitwiseAnd(5,2)) // 1
console.log(bitwiseAnd(8,5)) // 4
console.log(bitwiseAnd(2,2)) // 0
Note that the original code said & -(~a) instead of & a+1. The Two's complement method of negating a number is to flip all bits and add one to the result. Since the bitwise not ~ also means to flip all bits, the original code flips all bits, flips all bits again, and then adds one. This is therefore an obfuscation of simply doing a+1.
Here is an explanation of the thinking that must be behind that code:
Since the result of A & B must be less than K, the maximum number that might be the result will be K-1.
When we do the bitwise A & B, the result can only switch certain bits off, and cannot switch any bits on. Since A must be a smaller number than B, and since we can only switch bits of A off, the result of A & B thus cannot exceed the value of A.
Combining the two conclusions above, we can conclude:
The choice of A cannot exceed K-1 if our choice of B will preserve all bits of A.
So, let's see if we can set A to K-1 and find a value of B which will preserve all bits in A.
The first value greater than A that will preserve all bits of A will be the result of setting the least significant bit that is currently unset.
For example, if A = 101, the next value of B that will preserve all bits of A after doing A & B is 111.
In order to locate the position of the least significant unset bit, we can do ~A & A+1. For the example where A is 101, ~A & A+1 is 010. Essentially, ~A & A+1, no matter what the value of A, will always return a binary sequence containing exactly one set bit.
The reason this works is that ~A acts as a mask that will only allow the result of ~A & A+1 to contain bits which were unset in A. When we do A+1, the first bit that will be set for the first time in A will be the least significant unset bit, and no other bits will be set for the first time.
For example: if A = 101, ~A = 010, A+1 = 110, and so ~A & A+1 = 010 & 110 = 010. That 1 in the answer is the position of the least significant unset bit in A.
If we do a bitwise OR, we can use this result to calculate the first higher value of B which will preserve all bits of A. Thus B = A | ~A & A+1. Therefore, we can see in the runnable code snippet above that if we find that (a | ~a & a+1) <= n, we know that the value of a = k-1 will work and we can return this as the answer.
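A quick runnable trace of those two expressions (no extra parentheses are needed: unary ~ binds tightest, + binds tighter than &, and & binds tighter than |):

const a = 0b101;

console.log((~a & a + 1).toString(2));     // "10"  -- the lowest unset bit of a
console.log((a | ~a & a + 1).toString(2)); // "111" -- a with that bit switched on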
But, what if that value of B exceeds N? We will need to find the next-best answer.
Since we tried A = K-1, and since the question states K <= N, this means A must have been an odd number (i.e. the least significant bit was set). This is because if A was even, we'd have attempted the value B = A + 1, and B could not have exceeded N since K <= N and A = K-1.
So, since our earlier attempt failed when we tried A = K-1, we know we can't go higher than A to find a value of B that works when A = K-1.
But, if we set A = K-2, and since we know K-1 was odd, this means K-2 is even. This means that we can have A = K-2 and B = K-1, and A & B = K-2. The result of the function is therefore K-2.
The code could therefore perhaps be more clearly written as:
function bitwiseAnd(n, k) {
  let a = k - 1
  return (a | ~a & a + 1) <= n ? k - 1 : k - 2
}
console.log(bitwiseAnd(5,2)) // 1
console.log(bitwiseAnd(8,5)) // 4
console.log(bitwiseAnd(2,2)) // 0
The bottom line is that we don't need any loops, because there is always an answer where A = K-1 and B is more than one higher than A, or where A = K-2 and B = K-1.
But, there is an improvement we can make...
So far, we've explained the thinking behind the solution you posted. That solution, however, is more complicated than it needs to be.
If we want to take A and set the least-significant unset bit, we can simply do A | A+1.
This means a simpler solution would be:
function bitwiseAnd(n, k) {
  let a = k - 1
  return (a | a + 1) <= n ? k - 1 : k - 2
}
console.log(bitwiseAnd(5,2)) // 1
console.log(bitwiseAnd(8,5)) // 4
console.log(bitwiseAnd(2,2)) // 0
And, if we really want to code-golf the solution, we can substitute out the variable a, and take advantage of Javascript coercing true/false to 1/0 and do:
function bitwiseAnd(n, k) {
  return --k - ((k++ | k) > n)
}
console.log(bitwiseAnd(5,2)) // 1
console.log(bitwiseAnd(8,5)) // 4
console.log(bitwiseAnd(2,2)) // 0

is nanoid's random algorithm really better than random % alphabet? [duplicate]

I have seen this question asked a lot but never seen a true concrete answer to it. So I am going to post one here which will hopefully help people understand why exactly there is "modulo bias" when using a random number generator, like rand() in C++.
So rand() is a pseudo-random number generator which chooses a natural number between 0 and RAND_MAX, which is a constant defined in cstdlib (see this article for a general overview on rand()).
Now what happens if you want to generate a random number between say 0 and 2? For the sake of explanation, let's say RAND_MAX is 10 and I decide to generate a random number between 0 and 2 by calling rand()%3. However, rand()%3 does not produce the numbers between 0 and 2 with equal probability!
When rand() returns 0, 3, 6, or 9, rand()%3 == 0. Therefore, P(0) = 4/11
When rand() returns 1, 4, 7, or 10, rand()%3 == 1. Therefore, P(1) = 4/11
When rand() returns 2, 5, or 8, rand()%3 == 2. Therefore, P(2) = 3/11
This does not generate the numbers between 0 and 2 with equal probability. Of course for small ranges this might not be the biggest issue but for a larger range this could skew the distribution, biasing the smaller numbers.
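You can verify those probabilities by brute force; a sketch in JavaScript (the language is incidental, the arithmetic is the same everywhere):

// Tally r % 3 for every value r that a hypothetical rand() with RAND_MAX = 10 could return.
const counts = [0, 0, 0];
for (let r = 0; r <= 10; r++) counts[r % 3]++;
console.log(counts); // [4, 4, 3] -- P(0) = P(1) = 4/11, P(2) = 3/11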
So when does rand() % n return numbers from 0 to n-1 with equal probability? When RAND_MAX % n == n - 1, i.e. when RAND_MAX + 1 is a multiple of n. In that case, given our earlier assumption that rand() returns each number between 0 and RAND_MAX with equal probability, the residue classes modulo n are also equally distributed.
So how do we solve this problem? A crude way is to keep generating random numbers until you get a number in your desired range:
int x;
do {
    x = rand();
} while (x >= n);
but that's inefficient for low values of n, since you only have an n/RAND_MAX chance of getting a value in your range, and so you'll need to perform RAND_MAX/n calls to rand() on average.
A more efficient approach is to take a large range whose length is divisible by n, like [0, RAND_MAX - RAND_MAX % n), keep generating random numbers until you get one that lies in that range, and then take the modulus:
int x;
do {
    x = rand();
} while (x >= (RAND_MAX - RAND_MAX % n));
x %= n;
For small values of n, this will rarely require more than one call to rand().
Works cited and further reading:
CPlusPlus Reference
Eternally Confuzzled
Repeatedly selecting a random number until one falls in the desired range is a good way to remove the bias.
Update
We can make the code fast by searching for an x in a range whose length is divisible by n.
// Assumptions
// rand() in [0, RAND_MAX]
// n in (0, RAND_MAX]
int x;

// Keep searching for an x in a range divisible by n
do {
    x = rand();
} while (x >= RAND_MAX - (RAND_MAX % n));

x %= n;
The above loop should be very fast, say 1 iteration on average.
@user1413793 is correct about the problem. I'm not going to discuss that further, except to make one point: yes, for small values of n and large values of RAND_MAX, the modulo bias can be very small. But using a bias-inducing pattern means that you must consider the bias every time you calculate a random number and choose different patterns for different cases. And if you make the wrong choice, the bugs it introduces are subtle and almost impossible to unit test. Compared to just using the proper tool (such as arc4random_uniform), that's extra work, not less work. Doing more work and getting a worse solution is terrible engineering, especially when doing it right every time is easy on most platforms.
Unfortunately, the implementations of the solution are all incorrect or less efficient than they should be. (Each solution has various comments explaining the problems, but none of the solutions have been fixed to address them.) This is likely to confuse the casual answer-seeker, so I'm providing a known-good implementation here.
Again, the best solution is just to use arc4random_uniform on platforms that provide it, or a similar ranged solution for your platform (such as Random.nextInt on Java). It will do the right thing at no code cost to you. This is almost always the correct call to make.
If you don't have arc4random_uniform, then you can use the power of open source to see exactly how it is implemented on top of a wider-range RNG (arc4random in this case, but a similar approach could also work on top of other RNGs).
Here is the OpenBSD implementation:
/*
* Calculate a uniformly distributed random number less than upper_bound
* avoiding "modulo bias".
*
* Uniformity is achieved by generating new random numbers until the one
* returned is outside the range [0, 2**32 % upper_bound). This
* guarantees the selected random number will be inside
* [2**32 % upper_bound, 2**32) which maps back to [0, upper_bound)
* after reduction modulo upper_bound.
*/
u_int32_t
arc4random_uniform(u_int32_t upper_bound)
{
u_int32_t r, min;
if (upper_bound < 2)
return 0;
/* 2**32 % x == (2**32 - x) % x */
min = -upper_bound % upper_bound;
/*
* This could theoretically loop forever but each retry has
* p > 0.5 (worst case, usually far better) of selecting a
* number inside the range we need, so it should rarely need
* to re-roll.
*/
for (;;) {
r = arc4random();
if (r >= min)
break;
}
return r % upper_bound;
}
It is worth noting the latest commit comment on this code for those who need to implement similar things:
Change arc4random_uniform() to calculate 2**32 % upper_bound as
-upper_bound % upper_bound. Simplifies the code and makes it the
same on both ILP32 and LP64 architectures, and also slightly faster on
LP64 architectures by using a 32-bit remainder instead of a 64-bit
remainder.
Pointed out by Jorden Verwer on tech#
ok deraadt; no objections from djm or otto
The Java implementation is also easily findable (see previous link):
public int nextInt(int n) {
    if (n <= 0)
        throw new IllegalArgumentException("n must be positive");

    if ((n & -n) == n)  // i.e., n is a power of 2
        return (int)((n * (long)next(31)) >> 31);

    int bits, val;
    do {
        bits = next(31);
        val = bits % n;
    } while (bits - val + (n-1) < 0);
    return val;
}
Definition
Modulo Bias is the inherent bias in using modulo arithmetic to reduce an output set to a subset of the input set. In general, a bias exists whenever the mapping between the input and output set is not equally distributed, as in the case of using modulo arithmetic when the size of the output set is not a divisor of the size of the input set.
This bias is particularly hard to avoid in computing, where numbers are represented as strings of bits: 0s and 1s. Finding truly random sources of entropy is also extremely difficult, but that is beyond the scope of this discussion. For the remainder of this answer, assume that there exists an unlimited source of truly random bits.
Problem Example
Let's consider simulating a die roll (0 to 5) using these random bits. There are 6 possibilities, so we need enough bits to represent the number 6, which is 3 bits. Unfortunately, 3 random bits yields 8 possible outcomes:
000 = 0, 001 = 1, 010 = 2, 011 = 3
100 = 4, 101 = 5, 110 = 6, 111 = 7
We can reduce the size of the outcome set to exactly 6 by taking the value modulo 6, however this presents the modulo bias problem: 110 yields a 0, and 111 yields a 1. This die is loaded.
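Counting the outcomes makes the loading visible; a sketch:

// Tally v % 6 over all 8 outcomes of 3 random bits.
const counts = [0, 0, 0, 0, 0, 0];
for (let v = 0; v < 8; v++) counts[v % 6]++;
console.log(counts); // [2, 2, 1, 1, 1, 1] -- 0 and 1 land twice as often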
Potential Solutions
Approach 0:
Rather than rely on random bits, in theory one could hire a small army to roll dice all day and record the results in a database, and then use each result only once. This is about as practical as it sounds, and more than likely would not yield truly random results anyway (pun intended).
Approach 1:
Instead of using the modulus, a naive but mathematically correct solution is to discard results that yield 110 and 111 and simply try again with 3 new bits. Unfortunately, this means that there is a 25% chance on each roll that a re-roll will be required, including each of the re-rolls themselves. This is clearly impractical for all but the most trivial of uses.
Approach 2:
Use more bits: instead of 3 bits, use 4. This yields 16 possible outcomes. Of course, re-rolling anytime the result is greater than 5 makes things worse (10/16 = 62.5%), so that alone won't help.
Notice that 2 * 6 = 12 < 16, so we can safely take any outcome less than 12 and reduce that modulo 6 to evenly distribute the outcomes. The other 4 outcomes must be discarded, and then re-rolled as in the previous approach.
Sounds good at first, but let's check the math:
4 discarded results / 16 possibilities = 25%
In this case, 1 extra bit didn't help at all!
That result is unfortunate, but let's try again with 5 bits:
32 % 6 = 2 discarded results; and
2 discarded results / 32 possibilities = 6.25%
A definite improvement, but not good enough in many practical cases. The good news is, adding more bits will never increase the chances of needing to discard and re-roll. This holds not just for dice, but in all cases.
As demonstrated, however, adding 1 extra bit may not change anything. In fact, if we increase our roll to 6 bits, the probability remains 6.25%.
This raises 2 additional questions:
If we add enough bits, is there a guarantee that the probability of a discard will diminish?
How many bits are enough in the general case?
General Solution
Thankfully the answer to the first question is yes. The problem with 6 is that 2^x mod 6 alternates between 2 and 4, which happen to differ by a factor of 2 from each other (just as consecutive powers of 2 do), so that for odd x > 0,
[2^x mod 6] / 2^x == [2^(x+1) mod 6] / 2^(x+1)
Thus 6 is an exception rather than the rule. It is possible to find larger moduli that yield consecutive powers of 2 in the same way, but eventually this must wrap around, and the probability of a discard will be reduced.
Without offering further proof, in general using double the number of bits required will provide a smaller, usually insignificant, chance of a discard.
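As a concrete sketch of that discard-and-retry approach, here is the 5-bit die roll from Approach 2 in JavaScript (randomBit is a stand-in for the assumed source of truly random bits):

function randomBit() {
  return Math.random() < 0.5 ? 0 : 1; // stand-in; not a true random source
}

function rollDie() { // uniform over 0..5
  for (;;) {
    let v = 0;
    for (let i = 0; i < 5; i++) v = (v << 1) | randomBit(); // 5 bits: 0..31
    if (v < 30) return v % 6; // 30 is the largest multiple of 6 <= 32; discard 30 and 31
  }
}

console.log(rollDie());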
Proof of Concept
Here is an example program that uses OpenSSL's libcrypto to supply random bytes. When compiling, be sure to link to the library with -lcrypto, which most everyone should have available.
#include <iostream>
#include <cstdint>
#include <assert.h>
#include <limits>
#include <openssl/rand.h>

volatile uint32_t dummy;
uint64_t discardCount;
uint64_t randomPool; // this declaration was missing from the original listing

uint32_t uniformRandomUint32(uint32_t upperBound)
{
    assert(RAND_status() == 1);
    // Number of values to discard from the top of the 64-bit range so that
    // the count of accepted values is an exact multiple of upperBound
    // (the same -upper_bound % upper_bound trick as in arc4random_uniform).
    uint64_t discard = (0 - (uint64_t)upperBound) % upperBound;
    RAND_bytes((uint8_t*)(&randomPool), sizeof(randomPool));
    while (randomPool > (std::numeric_limits<uint64_t>::max() - discard)) {
        RAND_bytes((uint8_t*)(&randomPool), sizeof(randomPool));
        ++discardCount;
    }
    return randomPool % upperBound;
}

int main() {
    discardCount = 0;
    const uint32_t MODULUS = (1ul << 31) - 1;
    const uint32_t ROLLS = 10000000;
    for (uint32_t i = 0; i < ROLLS; ++i) {
        dummy = uniformRandomUint32(MODULUS);
    }
    std::cout << "Discard count = " << discardCount << std::endl;
}
I encourage playing with the MODULUS and ROLLS values to see how many re-rolls actually happen under most conditions. A sceptical person may also wish to save the computed values to a file and verify the distribution appears uniform.
Mark's Solution (The accepted solution) is Nearly Perfect.
int x;
do {
    x = rand();
} while (x >= (RAND_MAX - RAND_MAX % n));
x %= n;
However, it has a caveat: it needlessly discards one complete set of valid outcomes in any scenario where RAND_MAX (RM) is one less than a multiple of N (where N = the number of possible valid outcomes).
That is, when the 'count of values discarded' (D) is equal to N, the discarded values actually form a valid set (V), not an invalid set (I).
What causes this is that the formula loses sight of the difference between N and RAND_MAX.
N is a count: the number of valid outcomes (e.g. the set {0, 1, ..., n-1} has N members).
RAND_MAX, however, is not a count but the largest value rand() can return. The set of possible responses therefore contains RAND_MAX + 1 values, one more than the value of RAND_MAX itself, and this off-by-one is what the formula above mishandles.
Using Mark's solution, values are discarded when: X >= RM - RM % N

EG:

Ran Max Value (RM) = 255
Valid Outcomes (N) = 4

When X >= 252, the discarded values for X are: 252, 253, 254, 255

So, if the random value selected (X) is in {252, 253, 254, 255},
the number of discarded values (I) = RM % N + 1 == N

IE:

I = RM % N + 1
I = 255 % 4 + 1
I = 3 + 1
I = 4

X >= (RM - RM % N)
255 >= (255 - 255 % 4)
255 >= (255 - 3)
255 >= (252)
-> the discard test returns true

As you can see in the example above, when the value of X (the random number we get from the initial function) is 252, 253, 254, or 255, we would discard it even though these four values comprise a valid set of returned values (they map to 0, 1, 2, 3 modulo 4).
IE: when the count of the discarded values (I) equals N (the number of valid outcomes), a valid set of return values is discarded by the original function.
If we describe the difference between the values N and RM as D, i.e.:
D = (RM - N)
then as the value of D becomes smaller, the percentage of unneeded re-rolls due to this method increases. (Whenever RAND_MAX + 1 is a multiple of N, this is a valid concern.)
EG:

RM = 255, N = 2   Then: D = 253, Lost percentage = 0.78125%
RM = 255, N = 4   Then: D = 251, Lost percentage = 1.5625%
RM = 255, N = 8   Then: D = 247, Lost percentage = 3.125%
RM = 255, N = 16  Then: D = 239, Lost percentage = 6.25%
RM = 255, N = 32  Then: D = 223, Lost percentage = 12.5%
RM = 255, N = 64  Then: D = 191, Lost percentage = 25%
RM = 255, N = 128 Then: D = 127, Lost percentage = 50%
Since the percentage of re-rolls needed increases the closer N comes to RM, this can be a valid concern at many different values depending on the constraints of the system running the code and the values being looked for.
To negate this we can make a simple amendment, as shown here:

int x;
do {
    x = rand();
} while (x > (RAND_MAX - (((RAND_MAX % n) + 1) % n)));
x %= n;
This provides a more general version of the formula which accounts for the additional peculiarities of using modulus to define your max values.
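A quick numeric check of both thresholds, using the RM = 255, N = 4 example from above (JavaScript for brevity):

const RM = 255, n = 4;

console.log(RM - RM % n);               // 252 -- original: discards 252..255, a valid set
console.log(RM - (((RM % n) + 1) % n)); // 255 -- amended: x > 255 never holds, nothing is discarded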
Examples of using a small value for RAND_MAX where RAND_MAX + 1 is a multiple of N.

Mark's original version:
RAND_MAX = 3, n = 2, values rand() can return = 0,1,2,3, valid sets = {0,1} and {2,3}.
When X >= (RAND_MAX - (RAND_MAX % n)),
i.e. when X >= 2, the value will be discarded, even though the set is valid.

Generalized version 1:
RAND_MAX = 3, n = 2, values rand() can return = 0,1,2,3, valid sets = {0,1} and {2,3}.
When X > (RAND_MAX - ((RAND_MAX % n) + 1) % n),
i.e. when X > 3, the value would be discarded; but 3 is the largest value rand() can return, so nothing is ever discarded.

Additionally, there is the case where N should be the total number of values rand() can return. You could set N = RAND_MAX + 1, unless RAND_MAX = INT_MAX, in which case RAND_MAX + 1 overflows. You could instead use N = 1 (so any value of X is accepted) and handle the final reduction with an IF statement, but perhaps you have code with a valid reason to return 1 when the function is called with n = 1. So it may be better to use n = 0, which would normally produce a divide-by-zero error, as the signal that you want n = RAND_MAX + 1:
Generalized Version 2:

int x;

if (n != 0) {
    do {
        x = rand();
    } while (x > (RAND_MAX - (((RAND_MAX % n) + 1) % n)));
    x %= n;
} else {
    x = rand();
}
Both of these solutions resolve the issue of needlessly discarded valid results, which occurs when RM + 1 is a multiple of n.
The second version also covers the edge case scenario when you need n to equal the total possible set of values contained in RAND_MAX.
The modified approach in both is the same and allows for a more general solution to the need of providing valid random numbers and minimizing discarded values.
To reiterate:

The basic general solution which extends Mark's example:

// Assumes:
// RAND_MAX is a globally defined constant, returned from the environment.
// int n;  // User input, or externally defined, number of valid choices.

int x;
do {
    x = rand();
} while (x > (RAND_MAX - (((RAND_MAX % n) + 1) % n)));
x %= n;
The extended general solution which allows one additional scenario, n = RAND_MAX + 1:

// Assumes:
// RAND_MAX is a globally defined constant, returned from the environment.
// int n;  // User input, or externally defined, number of valid choices.

int x;

if (n != 0) {
    do {
        x = rand();
    } while (x > (RAND_MAX - (((RAND_MAX % n) + 1) % n)));
    x %= n;
} else {
    x = rand();
}
In some languages ( particularly interpreted languages ) doing the calculations of the compare-operation outside of the while condition may lead to faster results as this is a one-time calculation no matter how many re-tries are required. YMMV!
// Assumes:
// RAND_MAX is a globally defined constant, returned from the environment.
// int n;  // User input, or externally defined, number of valid choices.

int x;  // Resulting random number
int y;  // One-time calculation of the compare value for x

y = RAND_MAX - (((RAND_MAX % n) + 1) % n);

if (n != 0) {
    do {
        x = rand();
    } while (x > y);
    x %= n;
} else {
    x = rand();
}
There are two usual complaints with the use of modulo.
One is valid for all generators. It is easier to see in a limit case. If your generator has a RAND_MAX of 2 (which isn't compliant with the C standard) and you want only 0 or 1 as the value, using modulo will generate 0 twice as often (when the generator produces 0 or 2) as it will generate 1 (when the generator produces 1). Note that this is true as soon as you don't drop values: whatever mapping you use from the generator's values to the wanted ones, one value will occur twice as often as the other.
Some kinds of generators have their least significant bits less random than the others, at least for some of their parameters, but sadly those parameters have other interesting characteristics (such as being able to have RAND_MAX one less than a power of 2). The problem is well known, and for a long time library implementations have probably avoided it (for instance, the sample rand() implementation in the C standard uses this kind of generator but drops the 16 least significant bits), but some like to complain about it and you may have bad luck.
Using something like
int alea(int n) {
    assert(0 < n && n <= RAND_MAX);
    int partSize =
        n == RAND_MAX ? 1 : 1 + (RAND_MAX - n) / (n + 1);
    int maxUsefull = partSize * n + (partSize - 1);
    int draw;
    do {
        draw = rand();
    } while (draw > maxUsefull);
    return draw / partSize;
}
to generate a random number between 0 and n will avoid both problems (and it avoids overflow with RAND_MAX == INT_MAX)
BTW, C++11 introduced standard ways to do the reduction, and generators other than rand().
With a RAND_MAX value of 3 (in reality it should be much higher than that but the bias would still exist) it makes sense from these calculations that there is a bias:
1 % 2 = 1
2 % 2 = 0
3 % 2 = 1
random_between(1, 3) % 2 = more likely a 1
In this case, the % 2 is what you shouldn't do when you want a random number between 0 and 1. You could get a random number between 0 and 2 by doing % 3, though, because in this case the number of generator values is a multiple of 3.
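A sketch that tallies both reductions over this answer's generator values 1..3:

const mod2 = [0, 0], mod3 = [0, 0, 0];
for (let r = 1; r <= 3; r++) { mod2[r % 2]++; mod3[r % 3]++; }
console.log(mod2); // [1, 2] -- 1 occurs twice as often as 0
console.log(mod3); // [1, 1, 1] -- uniform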
Another method
There are much simpler approaches, but to add to the other answers, here is my solution to get a random number between 0 and n - 1, so n different possibilities, without bias.
the number of bits (not bytes) needed to encode the number of possibilities is the number of bits of random data you'll need
encode the number from random bits
if this number is >= n, restart (no modulo).
Really random data is not easy to obtain, so why use more bits than needed.
Below is an example in Smalltalk, using a cache of bits from a pseudo-random number generator. I'm no security expert so use at your own risk.
next: n
    | bitSize r from to |
    n < 0 ifTrue: [^0 - (self next: 0 - n)].
    n = 0 ifTrue: [^nil].
    n = 1 ifTrue: [^0].
    cache isNil ifTrue: [cache := OrderedCollection new].
    cache size < (self randmax highBit) ifTrue: [
        Security.DSSRandom default next asByteArray do: [ :byte |
            (1 to: 8) do: [ :i | cache add: (byte bitAt: i)]
        ]
    ].
    r := 0.
    bitSize := n highBit.
    to := cache size.
    from := to - bitSize + 1.
    (from to: to) do: [ :i |
        r := r bitAt: i - from + 1 put: (cache at: i)
    ].
    cache removeFrom: from to: to.
    r >= n ifTrue: [^self next: n].
    ^r
Modulo reduction is a commonly seen way to make a random integer generator avoid the worst case of running forever.
When the range of possible integers is unknown, however, there is no way in general to "fix" this worst case of running forever without introducing bias. It's not just modulo reduction (rand() % n, discussed in the accepted answer) that will introduce bias this way, but also the "multiply-and-shift" reduction of Daniel Lemire, or if you stop rejecting an outcome after a set number of iterations. (To be clear, this doesn't mean there is no way to fix the bias issues present in pseudorandom generators. For example, even though modulo and other reductions are biased in general, they will have no issues with bias if the range of possible integers is a power of 2 and if the random generator produces unbiased random bits or blocks of them.)
The following answer of mine discusses the relationship between running time and bias in random generators, assuming we have a "true" random generator that can produce unbiased and independent random bits. The answer doesn't even involve the rand() function in C because it has many issues. Perhaps the most serious here is the fact that the C standard does not explicitly specify a particular distribution for the numbers returned by rand(), not even a uniform distribution.
How to generate a random integer in the range [0,n] from a stream of random bits without wasting bits?
As the accepted answer indicates, "modulo bias" has its roots in the low value of RAND_MAX. He uses an extremely small value of RAND_MAX (10) to show that if RAND_MAX were 10 and you tried to generate a number between 0 and 2 using %, the following outcomes would result:
rand() % 3 // if RAND_MAX were only 10, gives:

output of rand() | rand() % 3
        0        |     0
        1        |     1
        2        |     2
        3        |     0
        4        |     1
        5        |     2
        6        |     0
        7        |     1
        8        |     2
        9        |     0
So there are 4 outputs of 0's (4/10 chance) and only 3 outputs of 1 and 2 (3/10 chances each).
So it's biased. The lower numbers have a better chance of coming out.
But that only shows up so obviously when RAND_MAX is small, or more specifically, when the number you are modding by is large compared to RAND_MAX.
A much better solution than looping (which is insanely inefficient and shouldn't even be suggested) is to use a PRNG with a much larger output range. The Mersenne Twister algorithm has a maximum output of 4,294,967,295. As such, doing MersenneTwister::genrand_int32() % 10 will, for all intents and purposes, be equally distributed, and the modulo bias effect will all but disappear.
I just wrote code for Von Neumann's Unbiased Coin Flip Method, which should theoretically eliminate any bias in the random number generation process. More info can be found at http://en.wikipedia.org/wiki/Fair_coin
int unbiased_random_bit() {
    int x1, x2, prev;
    prev = 2;
    x1 = rand() % 2;
    x2 = rand() % 2;
    for (;; x1 = rand() % 2, x2 = rand() % 2)
    {
        if (x1 ^ x2)       // 01 -> 1, or 10 -> 0.
        {
            return x2;
        }
        else if (x1 & x2)
        {
            if (!prev)     // 0011
                return 1;
            else
                prev = 1;  // 1111 -> continue, bias unresolved
        }
        else
        {
            if (prev == 1) // 1100
                return 0;
            else           // 0000 -> continue, bias unresolved
                prev = 0;
        }
    }
}

Print all possible strings that can be made by placing spaces

I came across this problem.
Print all possible strings that can be made by placing spaces.
I also came across this solution.
var spacer = function (input) {
  var result = '';
  var inputArray = input.split('');
  var length = inputArray.length;
  var resultSize = Math.pow(2, length - 1); // how this works
  for (var i = 0; i < resultSize; i++) {
    for (var j = 0; j < length; j++) {
      result += inputArray[j];
      if ((i & (1 << j)) > 0) { // how this works
        result += ' ';
      }
    }
    result += '\n';
  }
  return result;
}

var main = function() {
  var input = 'abcd';
  var result = spacer(input);
  console.log(result);
}

main();
I am not getting how the marked lines work.
Can you clarify what technique is being used, and what the basic logic behind it is? What are some other areas where we can use this?
Thanks.
Let's take string abcd as an example.
There are 3 possible places where space can be put:
Between "a" and "b"
Between "b" and "c"
Between "c" and "d"
So generally if length of your string is length then you have length - 1 places for spaces.
Assume that each such place is represented with a separate digit in a binary number. This digit is 0 when we don't put space there, and 1 when we do. E.g.:
a b c d
0 0 0 means that we don't put any spaces - abcd
0 0 1 means that we put space between "c" and "d" only - abc d
0 1 0 means that we put space between "b" and "c" only - ab cd
0 1 1 means ab c d
1 0 0 means a bcd
1 0 1 means a bc d
1 1 0 means a b cd
1 1 1 means a b c d
Converting 000, 001, 010, ..., 111 from binary to decimal will give us values 0, 1, 2, ..., 7.
With 3 places for spaces we have 8 options to put them. It's exactly 2^3 or 2^(length - 1).
Thus we need to iterate all numbers between 0 (inclusive) and 2^(length - 1) (exclusive).
Condition (i & (1 << j)) > 0 in the code you provided just checks whether the digit at position j (starting from the end, 0-based) is 0 (and we don't need to insert a space) or 1 (and a space should be added).
Let's take value 6 (110 in binary) for example.
(6 & (1 << 0)) = 110 & 001 = 000 = 0 (condition > 0 is not met)
(6 & (1 << 1)) = 110 & 010 = 010 = 2 (condition > 0 is met)
(6 & (1 << 2)) = 110 & 100 = 100 = 4 (condition > 0 is met)
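The same three checks, runnable:

const i = 0b110; // 6
for (let j = 0; j < 3; j++) {
  console.log(j, (i & (1 << j)) > 0); // 0 false, 1 true, 2 true
}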
A simple solution using JavaScript:

String.prototype.splice = function(idx, rem, str) {
  return this.slice(0, idx) + str + this.slice(idx + Math.abs(rem));
};

function printPattern(str, i, n) {
  if (i == n) {
    return;
  }
  var buff = str;
  var j = str.length - n + i;
  buff = str.splice(j, 0, " ");
  console.log(buff);
  printPattern(str, i + 1, n);
  printPattern(buff, i + 1, n);
}

var str = "ABCD";
printPattern(str, 1, str.length);
console.log(str);
There are 2 possibilities between each pair of characters: either there is a space or there is not. If spaces are allowed only between characters, then the number of possibilities for 4 characters is 2 * 2 * 2, or 2 ^ (length - 1).
resultSize = Math.pow(2, length - 1) says there are 2^(n-1) possible ways to print a string given the problem definition. To see why, start with a string that has 2 characters and work your way up. For the string "ab" there are two solutions: you can put a space between a and b, or not. Add a character to get "abc": we already know there are two solutions for "ab", and we multiply that by the number of ways to put a space between b and c, which is 2, giving 2 * 2 solutions. You can extend that to a string of size n, so the number of solutions is 2 * 2 * 2 * ... repeated n - 1 times, in other words 2^(n-1).
The next part is a little trickier, but it's a clever way of determining how many spaces there are and where they go in any given solution. The bitwise & takes each bit of two numbers, compares them, then produces a new number where each bit is 1 if both corresponding bits were 1, and 0 otherwise. For example (binary numbers):
01 & 01 = 01
10 & 01 = 00
Or a bigger example:
10010010 & 10100010 = 10000010
The << operator just moves all bits n positions to the left, where n is the right-hand operand (multiplying by 2, n times).
For example:
1 << 1 = 2
2 << 1 = 4
2 << 2 = 8
1 << 4 = 16
So back to your code now. Let's break down the if statement:
if ((i & (1 << j)) > 0) { ... }
In English this says: if the index of the solution we are looking at shares any 1 bits with 1 shifted left by the index of the character we are looking at, then put a space after that character. Olexiy Sadovnikov's answer has some good examples of what this would look like for some of the iterations.
It's also worth noting that this is not the only way to do it. You could pretty easily determine that the max number of spaces is n - 1, then just linearly find all of the solutions that have 0 spaces in them, then all the solutions that have 1 space, then 2 spaces ... up to n - 1 spaces, though the solution you posted would be faster. Of course, when you're talking about algorithms of exponential complexity it ultimately won't matter, because strings bigger than about 60 chars will take longer than you probably care to wait for, even with a strictly 2^n algorithm.
In answer to the question of how these techniques can be used in other places. Bit shifting is used very frequently in encryption algorithms as well as the bitwise & and | operators.
'''
The idea was to fix each character from the beginning and print the rest
of the string, space separated.

Like for "ABCD":
A BCD   # Fix A and print the rest of the string
AB CD   # Add B to the previous value A and print the rest of the string
ABC D   # Add C to the previous value AB and print the rest of the string

Similarly we can add a space to produce all permutations.
Like:
In the second step above we got "AB CD" by having "A" as the prefix,
so now we can get "A B CD" by having "A " as the prefix.
'''
def printPermute(arr, s, app):
    if len(arr) <= 1:
        return
    else:
        print(app + '' + arr[0:s] + ' ' + arr[s:len(arr)])
        prefix = app + '' + arr[0:s]
        suffix = arr[s:len(arr)]
        printPermute(suffix, 1, prefix)
        printPermute(suffix, 1, prefix + ' ')  # Appending space

printPermute("ABCDE", 1, '')  # Empty string
.pow is a method from Math which stands for "power". It takes two arguments: the base (here 2) and the exponent. Read here for more information.
& is the bitwise AND operator, it takes the two binary representation of the numbers and performs a logical AND, here's a thread on bitwise operators
EDIT: why does Math.pow(2, length-1) give us the number of possible strings?
I remember doing it in an exercise last year in math class, but I'll try to explain it without sums.
Essentially we want to determine the number of strings you can make by adding one or no space in between letters. Your initial string has n letters. Here's the reason why:
Starting from the left of the word, after each letter you have two choices:
1 - to put a space
2 - not to put a space
You will have to choose between the two options exactly n-1 times.
This means you will have a total of 2^(n-1) possible solutions.

What is the correct way to set bit in bitmask?

I'm a little bit confused about the proper way to set bits in a bitmask. I have the following function and flags:
var userBmask = 0;

const EMAIL_CONFIRMED = 1;
const EMAIL_UNSUBSCRIBED = 2;

setBit: function (bit) {
  userBmask |= 1 << bit; // 10
}
Let's say I want to set the bit for email confirmation:
setBit(EMAIL_CONFIRMED);
After the line above my userBmask is: 10. But I'm not sure this is correct, because I actually set the second bit instead of the first. Should I rewrite the setBit function as follows, to set bits starting from the rightmost bit?

setBit: function (bit) {
  userBmask |= 1 << bit - 1; // 01
}

Now after setBit(EMAIL_CONFIRMED) I get the result 01.
EDIT:
Thanks for your answers. Please look at the following: because JavaScript bitwise operations work on 32-bit integers, I can have a bitmask with bits 0...31. But if I try to set the last available bit (which is 31) I get a negative number:
const NEXT_BIT = 31;
setBit(NEXT_BIT); // userBmask now is -10000000000000000000000000000000
Is this expected behavior, and could it be a source of bugs?
Set your constants to powers of two and use OR without bit shifting:

const EMAIL_CONFIRMED = 1 << 0;    // 1
const EMAIL_UNSUBSCRIBED = 1 << 1; // 2
const NEXT_BIT1 = 1 << 2;          // 4
const NEXT_BIT2 = 1 << 3;          // 8

setBit: function (bit) {
  userBmask |= bit;
}

unsetBit: function (bit) {
  userBmask &= ~bit; // bitwise inverse
}
I think you are assuming that the lowest-value bit is "bit one"; this is actually not correct.
Consider "powers of two" which is the basis of binary numbers...
2^0 == 1 //first bit is "bit zero"
2^1 == 2 //second bit is "bit one"
2^2 == 4 //third bit is "bit two"
2^3 == 8 //fourth bit is "bit three"
//and so on
If you shift 1 left by 1 then you go from 2^0 to 2^1. So binary 10 is decimal 2.
If you want to set the lowest bit in userBmask when you pass 1 to your function then yes you'll have to subtract 1 from bit before you do the shift.
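As for the EDIT about bit 31: yes, that behavior is expected. JavaScript bitwise operators work on signed 32-bit integers, so a mask with bit 31 set prints as a negative number. The bit pattern is still correct and flag tests still work; use the unsigned right shift >>> 0 when you want to display or compare the value as unsigned. A short sketch:

let mask = 0;
mask |= 1 << 31;

console.log(mask);                     // -2147483648 (signed 32-bit view)
console.log((mask >>> 0).toString(2)); // "10000000000000000000000000000000"
console.log((mask & (1 << 31)) !== 0); // true -- the flag test is unaffected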

How can I check that a bit is set (without bitwise operation)?

Looking at the int 44: I need Math.ceil(Math.log2(44)) binary places to represent 44 (the answer is 6 places):

___ ___ ___ ___ ___ ___
 32  16   8   4   2   1

But how can I check whether (for example) the bit for 8 is set or not?
A simple solution would be to evaluate:
((1 << 3) & 44) > 0
which checks whether the bit is set. But please notice that behind the scenes the computer translates 44 to its binary representation and just checks whether the bit is set via a bitwise operation.
Another solution is just to build the binary myself via toString(2), or % 2 in a loop.
Question
Mathematically, via which formula can I test whether the n-th bit is set?
(I would prefer a single non-loop, pure math expression.)
Divide by the value of the bit that you want to check, and test whether the first bit is set (this can be tested with x mod 2 == 1).
Math expression:
floor(value/(2^bitPos)) mod 2 = 1
As a JS function:

function isSet(value, bitPos) {
  var result = Math.floor(value / Math.pow(2, bitPos)) % 2;
  return result == 1;
}
Note: bitPos starts at 0 (the bit representing the number 1).
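Usage on the question's example (is the bit for 8, i.e. bit position 3, set in 44?):

console.log(isSet(44, 3)); // true  -- 44 = 0b101100
console.log(isSet(44, 0)); // false -- the lowest bit of 44 is 0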
The digit at a given index of a value val written in base base (a 'bit' is just the base-2 case) can in general be calculated as:
val = 1966;
index = 2;
base = 10;
alert(Math.floor(val / Math.pow(base, index)) % base); // result: 9

val = 44;
index = 3;
base = 2;
alert(Math.floor(val / Math.pow(base, index)) % base); // result: 1

(Only 0 and 1 are possible in base 2 – the result will always be in the range 0..base-1.)
The combination of Math.floor (to coerce to an integer in Javascript) and Math.pow is kind of iffy here. Even in integer range, Math.pow may generate a floating point number slightly below the expected 'whole' number. Perhaps it is safer to always add a small constant:
alert (Math.floor(0.1+val/Math.pow(base,index)) % base);
You can simply check if the bit at the position is set to 1.
function isBitSet(no, index) {
  var bin = no.toString(2);   // Convert to binary
  index = bin.length - index; // Reverse the index, start from right to left
  return bin[index] == 1;
}

isBitSet(44, 2); // Check whether the second bit (from the right, 1-based) is set
