I'd like to generate unique random numbers between 0 and 1000 that never repeat (i.e. 6 doesn't show up twice), without resorting to something like an O(N) search of previously generated values each time. Is this possible?
Initialize an array of 1001 integers with the values 0-1000 and set a variable, max, to the current max index of the array (starting with 1000). Pick a random number, r, between 0 and max, swap the number at the position r with the number at position max and return the number now at position max. Decrement max by 1 and continue. When max is 0, set max back to the size of the array - 1 and start again without the need to reinitialize the array.
Update:
Although I came up with this method on my own when I answered the question, after some research I realize this is a modified version of Fisher-Yates known as Durstenfeld-Fisher-Yates or Knuth-Fisher-Yates. Since the description may be a little difficult to follow, I have provided an example below (using 11 elements instead of 1001):
Array starts off with 11 elements initialized to array[n] = n, max starts off at 10:
+--+--+--+--+--+--+--+--+--+--+--+
| 0| 1| 2| 3| 4| 5| 6| 7| 8| 9|10|
+--+--+--+--+--+--+--+--+--+--+--+
                               ^
                              max
At each iteration, a random number r is selected between 0 and max, array[r] and array[max] are swapped, the new array[max] is returned, and max is decremented:
max = 10, r = 3
          +--------------------+
          v                    v
+--+--+--+--+--+--+--+--+--+--+--+
| 0| 1| 2|10| 4| 5| 6| 7| 8| 9| 3|
+--+--+--+--+--+--+--+--+--+--+--+
max = 9, r = 7
                      +-----+
                      v     v
+--+--+--+--+--+--+--+--+--+--+--+
| 0| 1| 2|10| 4| 5| 6| 9| 8| 7: 3|
+--+--+--+--+--+--+--+--+--+--+--+
max = 8, r = 1
    +--------------------+
    v                    v
+--+--+--+--+--+--+--+--+--+--+--+
| 0| 8| 2|10| 4| 5| 6| 9| 1: 7| 3|
+--+--+--+--+--+--+--+--+--+--+--+
max = 7, r = 5
                +-----+
                v     v
+--+--+--+--+--+--+--+--+--+--+--+
| 0| 8| 2|10| 4| 9| 6| 5: 1| 7| 3|
+--+--+--+--+--+--+--+--+--+--+--+
...
After 11 iterations, all numbers in the array have been selected, max == 0, and the array elements are shuffled:
+--+--+--+--+--+--+--+--+--+--+--+
| 4|10| 8| 6| 2| 0| 9| 5| 1| 7| 3|
+--+--+--+--+--+--+--+--+--+--+--+
At this point, max can be reset to 10 and the process can continue.
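For concreteness, here is a minimal sketch of that draw-and-swap generator in JavaScript (the function and variable names are mine, not from the description above):

function makeGenerator(n) {
    const arr = Array.from({ length: n }, (_, i) => i); // arr[i] = i
    let max = n - 1;
    return function draw() {
        if (max < 0) max = n - 1; // start a new pass; no re-initialization needed
        const r = Math.floor(Math.random() * (max + 1)); // r in 0..max
        [arr[r], arr[max]] = [arr[max], arr[r]]; // swap positions r and max
        return arr[max--]; // return the value now at position max
    };
}

// Usage: each of 0..1000 appears exactly once per pass of 1001 draws.
const draw = makeGenerator(1001);
console.log(draw(), draw(), draw());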
You can do this:
Create a list, 0..1000.
Shuffle the list. (See Fisher-Yates shuffle for a good way to do this.)
Return numbers in order from the shuffled list.
So this doesn't require a search of old values each time, but it still requires O(N) for the initial shuffle. But as Nils pointed out in comments, this is amortised O(1).
Use a Maximal Linear Feedback Shift Register.
It's implementable in a few lines of C and at runtime does little more than a couple of tests/branches, a little addition, and bit shifting. It's not truly random, but it fools most people.
You could use Format-Preserving Encryption to encrypt a counter. Your counter just goes from 0 upwards, and the encryption uses a key of your choice to turn it into a seemingly random value of whatever radix and width you want. E.g. for the example in this question: radix 10, width 3.
Block ciphers normally have a fixed block size of e.g. 64 or 128 bits. But Format-Preserving Encryption allows you to take a standard cipher like AES and make a smaller-width cipher, of whatever radix and width you want, with an algorithm which is still cryptographically robust.
It is guaranteed to never have collisions (because cryptographic algorithms create a 1:1 mapping). It is also reversible (a 2-way mapping), so you can take the resulting number and get back to the counter value you started with.
This technique doesn't need memory to store a shuffled array etc, which can be an advantage on systems with limited memory.
AES-FFX is one proposed standard method to achieve this. I've experimented with some basic Python code which is based on the AES-FFX idea, although not fully conformant--see Python code here. It can e.g. encrypt a counter to a random-looking 7-digit decimal number, or a 16-bit number. Here is an example of radix 10, width 3 (to give a number between 0 and 999 inclusive) as the question stated:
000 733
001 374
002 882
003 684
004 593
005 578
006 233
007 811
008 072
009 337
010 119
011 103
012 797
013 257
014 932
015 433
... ...
To get different non-repeating pseudo-random sequences, change the encryption key. Each encryption key produces a different non-repeating pseudo-random sequence.
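To make the counter-encryption idea concrete, here is a toy sketch in JavaScript. It is not AES-FFX and not cryptographically strong; it only shows the two ingredients: a small keyed Feistel permutation (a bijection, so no collisions) plus "cycle walking" to stay inside the 0..999 domain. The mixing function is an arbitrary one I made up for illustration:

function makePermutation(key) { // key: any 32-bit integer
    // Arbitrary 5-bit round function (an assumption, not part of any standard).
    const roundFn = (half, round) =>
        (Math.imul(half ^ (key + round), 2654435761) >>> 27) & 0x1f;

    const encrypt10 = x => { // a bijection on 0..1023 (4-round Feistel)
        let left = (x >> 5) & 0x1f, right = x & 0x1f;
        for (let round = 0; round < 4; round++) {
            [left, right] = [right, left ^ roundFn(right, round)];
        }
        return (left << 5) | right;
    };

    return function (counter) { // maps 0..999 one-to-one onto 0..999
        let v = encrypt10(counter);
        while (v >= 1000) v = encrypt10(v); // cycle-walk back into range
        return v;
    };
}

Because encrypt10 is a permutation of 0..1023, repeatedly applying it until the value lands below 1000 yields a permutation of 0..999, so distinct counters always give distinct outputs.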
You could use a Linear Congruential Generator, where m (the modulus) is the nearest prime bigger than 1000 (that is, 1009). When you get a number outside the range, just take the next one. The sequence will only repeat once all elements have occurred, and you don't have to use a table. Be aware of the disadvantages of this generator, though, including lack of randomness.
For low numbers like 0...1000, creating a list that contains all the numbers and shuffling it is straight forward. But if the set of numbers to draw from is very large there's another elegant way: You can build a pseudorandom permutation using a key and a cryptographic hash function. See the following C++-ish example pseudo code:
unsigned randperm(string key, unsigned bits, unsigned index) {
    unsigned half1 = bits / 2;
    unsigned half2 = (bits + 1) / 2;
    unsigned mask1 = (1 << half1) - 1;
    unsigned mask2 = (1 << half2) - 1;
    for (int round = 0; round < 5; ++round) {
        // Derive a round-dependent value from the upper half of index.
        unsigned temp = (index >> half1);
        temp = (temp << 4) + round;
        // XOR a keyed hash into the lower half (reversible).
        index ^= hash(key + "/" + int2str(temp)) & mask1;
        // Swap the two halves (also reversible).
        index = ((index & mask2) << half1) | ((index >> half2) & mask1);
    }
    return index;
}
Here, hash is just some arbitrary pseudo-random function that maps a character string to a possibly huge unsigned integer. The function randperm is a permutation of all numbers within 0...pow(2,bits)-1 for a fixed key. This follows from the construction, because every step that changes the variable index is reversible. It is inspired by a Feistel cipher.
I think that a Linear Congruential Generator would be the simplest solution, and there are only 3 restrictions on the a, c and m values:
m and c are relatively prime,
a-1 is divisible by all prime factors of m,
a-1 is divisible by 4 if m is divisible by 4.
PS: the method was mentioned already, but that post makes wrong assumptions about the constant values. The constants below should work fine for your case.
In your case you may use a = 1002, c = 757, m = 1001:
X = (1002 * X + 757) mod 1001
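A quick check of those constants in JavaScript. Note that 1002 ≡ 1 (mod 1001), so this particular recurrence reduces to stepping by 757: it does have full period, but the output is a fixed stride rather than anything random-looking.

function* lcgSequence(seed) {
    let x = seed;
    for (let i = 0; i < 1001; i++) {
        x = (1002 * x + 757) % 1001;
        yield x;
    }
}

// Every value 0..1000 appears exactly once before the sequence repeats:
console.log(new Set(lcgSequence(0)).size); // 1001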
You don't even need an array to solve this one.
You need a bitmask and a counter.
Initialize the counter to zero and increment it on successive calls. XOR the counter with the bitmask (randomly selected at startup, or fixed) to generate a pseudorandom number. If you can't have numbers that exceed 1000, don't use a bitmask wider than 9 bits. (In other words, the bitmask is an integer not above 511.) Beware, though: for counter values between 512 and 1000, XORing the low 9 bits can still produce results above 1000, so out-of-range outputs must be skipped; see the sketch below.
Make sure that when the counter passes 1000, you reset it to zero. At this time you can select another random bitmask — if you like — to produce the same set of numbers in a different order.
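As a sketch of one way to handle that edge case: let the counter run over the full 10-bit cycle (0..1023) and skip the XORed values that land above 1000. XOR with a fixed mask is a bijection on 0..1023, so each of 0..1000 still appears exactly once per cycle. (The helper below is my illustration, not code from the answer above.)

function makeXorCounter(mask) { // mask: any value in 0..1023
    let counter = 0;
    return function next() {
        let v;
        do {
            v = counter ^ mask;              // bijection on 0..1023
            counter = (counter + 1) & 0x3ff; // wrap at 1024
        } while (v > 1000);                  // skip the 23 out-of-range values
        return v;
    };
}

const next = makeXorCounter(0x155);
console.log(next(), next(), next());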
You may use my Xincrol algorithm described here:
http://openpatent.blogspot.co.il/2013/04/xincrol-unique-and-random-number.html
This is a pure algorithmic method of generating random but unique numbers without arrays, lists, permutations or heavy CPU load.
The latest version also lets you set the range of numbers; for example, unique random numbers in the range 0-1073741821.
I've used it in practice for:
MP3 player which plays every song randomly, but only once per album/directory
Pixel wise video frames dissolving effect (fast and smooth)
Creating a secret "noise" fog over image for signatures and markers (steganography)
Data Object IDs for serialization of huge amount of Java objects via Databases
Triple Majority memory bits protection
Address+value encryption (every byte is not just encrypted but also moved to a new, encrypted location in the buffer). This really made the cryptanalysis folks mad at me :-)
Plain-text to plain-looking encrypted text, for SMS, emails etc.
My Texas Hold'em Poker Calculator (THC)
Several of my games for simulations, "shuffling", ranking
and more
It is open, free. Give it a try...
Here's some code I typed up that uses the logic of the first solution. I know this is "language agnostic" but just wanted to present this as an example in C# in case anyone is looking for a quick practical solution.
// Initialize variables
Random RandomClass = new Random();
int RandArrayNum;
int MaxNumber = 10;
int LastNumInArray;
int PickedNumInArray;
int[] OrderedArray = new int[MaxNumber];  // Ordered array - initialized below
int[] ShuffledArray = new int[MaxNumber]; // Shuffled array - filled during the shuffle

// Populate the ordered array
for (int i = 0; i < MaxNumber; i++)
{
    OrderedArray[i] = i;
    listBox1.Items.Add(OrderedArray[i]);
}

// Execute the shuffle
for (int i = MaxNumber - 1; i > 0; i--)
{
    RandArrayNum = RandomClass.Next(i + 1);        // Pick a random index in 0..i
    ShuffledArray[i] = OrderedArray[RandArrayNum]; // Populating the shuffled array in reverse
    LastNumInArray = OrderedArray[i];              // Save the last number in the ordered array
    PickedNumInArray = OrderedArray[RandArrayNum]; // Save the picked number
    OrderedArray[i] = PickedNumInArray;            // The picked number moves to the back
    OrderedArray[RandArrayNum] = LastNumInArray;   // The last number moves into the picked slot
}

for (int i = 0; i < MaxNumber; i++)
{
    listBox2.Items.Add(ShuffledArray[i]);
}
This method is appropriate when the limit is high and you only want to generate a few random numbers.
#!/usr/bin/perl
($top, $n) = @ARGV;  # generate $n integer numbers in [0, $top)
$last = -1;
for $i (0 .. $n-1) {
    $range = $top - $n + $i - $last;
    $r = 1 - rand(1.0)**(1 / ($n - $i));
    $last += int($r * $range + 1);
    print "$last ($r)\n";
}
Note that the numbers are generated in ascending order, but you can shuffle them afterwards.
The question How do you efficiently generate a list of K non-repeating integers between 0 and an upper bound N is linked as a duplicate - and if you want something that is O(1) per generated random number (with no O(n) startup cost) there is a simple tweak of the accepted answer.
Create an empty unordered map (an empty ordered map will take O(log k) per element) from integer to integer, instead of using an initialized array.
Set max to 1000 if that is the maximum.
Pick a random number, r, between 0 and max.
Ensure that both map elements r and max exist in the unordered map. If they don't exist, create them with a value equal to their index.
Swap elements r and max.
Return element max and decrement max by 1 (if max goes negative you are done).
Go back to the step that picks a random number r.
The only difference compared with using an initialized array is that the initialization of elements is postponed/skipped - but it will generate the exact same numbers from the same PRNG.
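Here is a sketch of that tweak in JavaScript, using a Map so that untouched slots implicitly hold their own index (the helper names are mine):

function makeDrawer(n) {
    const slots = new Map(); // index -> value, only for slots we have touched
    let max = n - 1;
    const at = i => (slots.has(i) ? slots.get(i) : i); // default: slot i holds i
    return function draw() {
        if (max < 0) return undefined; // all n values have been drawn
        const r = Math.floor(Math.random() * (max + 1)); // r in 0..max
        const picked = at(r);
        slots.set(r, at(max)); // swap slots r and max
        slots.set(max, picked);
        max--;
        return picked;
    };
}

// Usage: five distinct values from 0..1000, with no O(N) initialization.
const draw = makeDrawer(1001);
console.log(draw(), draw(), draw(), draw(), draw());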
You could use a good pseudo-random number generator with 10 bits and throw away the out-of-range values, leaving 0 to 1000.
From here we get the design for a 10-bit PRNG:
10 bits, feedback polynomial x^10 + x^7 + 1 (period 1023)
use a Galois LFSR to get fast code
Note that a maximal-length 10-bit LFSR cycles through the values 1 to 1023 and never emits 0, so subtract 1 from each output and discard anything above 1000.
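A sketch of that Galois LFSR in JavaScript (the tap mask 0x240 encodes x^10 + x^7 + 1; the subtract-and-skip wrapper is my addition to cover 0..1000):

function* lfsr10(seed = 1) { // seed must be a nonzero 10-bit value
    let s = seed & 0x3ff;
    do {
        const lsb = s & 1;
        s >>= 1;
        if (lsb) s ^= 0x240; // apply the taps for x^10 + x^7 + 1
        const v = s - 1;     // shift the 1..1023 cycle down to 0..1022
        if (v <= 1000) yield v;
    } while (s !== seed);
}

// One full period yields each of 0..1000 exactly once:
console.log(new Set(lfsr10()).size); // 1001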
public static int[] randN(int n, int min, int max)
{
    if (max <= min)
        throw new ArgumentException("Max needs to be greater than Min");
    if (max - min < n)
        throw new ArgumentException("Range needs to be longer than N");

    var r = new Random();
    var set = new HashSet<int>();
    while (set.Count < n)
    {
        var i = r.Next(max - min) + min;
        set.Add(i); // HashSet<int>.Add is a no-op if the value is already present
    }
    return set.ToArray();
}
Generating the N non-repeating random numbers is O(N) in expectation (as long as N is small relative to the range), as required.
Note: the Random instance should be static, with thread safety applied.
Here is some sample COBOL code you can play around with.
I can send you the RANDGEN.exe file so you can play with it and see if it does what you want.
IDENTIFICATION DIVISION.
PROGRAM-ID. RANDGEN as "ConsoleApplication2.RANDGEN".
AUTHOR. Myron D Denson.
DATE-COMPILED.
* **************************************************************
* SUBROUTINE TO GENERATE RANDOM NUMBERS THAT ARE GREATER THAN
* ZERO AND LESS OR EQUAL TO THE RANDOM NUMBERS NEEDED WITH NO
* DUPLICATIONS. (CALL "RANDGEN" USING RANDGEN-AREA.)
*
* CALLING PROGRAM MUST HAVE A COMPARABLE LINKAGE SECTION
* AND SET 3 VARIABLES PRIOR TO THE FIRST CALL IN RANDGEN-AREA
*
* FORMULA CYCLES THROUGH EVERY NUMBER OF 2X2 ONLY ONCE.
* RANDOM-NUMBERS FROM 1 TO RANDOM-NUMBERS-NEEDED ARE CREATED
* AND PASSED BACK TO YOU.
*
* RULES TO USE RANDGEN:
*
* RANDOM-NUMBERS-NEEDED > ZERO
*
* COUNT-OF-ACCESSES MUST = ZERO FIRST TIME CALLED.
*
* RANDOM-NUMBER = ZERO, WILL BUILD A SEED FOR YOU
* WHEN COUNT-OF-ACCESSES IS ALSO = 0
*
* RANDOM-NUMBER NOT = ZERO, WILL BE NEXT SEED FOR RANDGEN
* (RANDOM-NUMBER MUST BE <= RANDOM-NUMBERS-NEEDED)
*
* YOU CAN PASS RANDGEN YOUR OWN RANDOM-NUMBER SEED
* THE FIRST TIME YOU USE RANDGEN.
*
* BY PLACING A NUMBER IN RANDOM-NUMBER FIELD
* THAT FOLLOWS THESE SIMPLE RULES:
* IF COUNT-OF-ACCESSES = ZERO AND
* RANDOM-NUMBER > ZERO AND
* RANDOM-NUMBER <= RANDOM-NUMBERS-NEEDED
*
* YOU CAN LET RANDGEN BUILD A SEED FOR YOU
*
* THAT FOLLOWS THESE SIMPLE RULES:
* IF COUNT-OF-ACCESSES = ZERO AND
* RANDOM-NUMBER = ZERO AND
* RANDOM-NUMBER-NEEDED > ZERO
*
* TO ENSURE A DIFFERENT PATTERN OF RANDOM NUMBERS
* A LOW-RANGE AND HIGH-RANGE IS USED TO BUILD
* RANDOM NUMBERS.
* COMPUTE LOW-RANGE =
* ((SECONDS * HOURS * MINUTES * MS) / 3).
* A HIGH-RANGE = RANDOM-NUMBERS-NEEDED + LOW-RANGE
* AFTER RANDOM-NUMBER-BUILT IS CREATED
* AND IS BETWEEN LOW AND HIGH RANGE
* RANDOM-NUMBER = RANDOM-NUMBER-BUILT - LOW-RANGE
*
* **************************************************************
ENVIRONMENT DIVISION.
INPUT-OUTPUT SECTION.
FILE-CONTROL.
DATA DIVISION.
FILE SECTION.
WORKING-STORAGE SECTION.
01 WORK-AREA.
05 X2-POWER PIC 9 VALUE 2.
05 2X2 PIC 9(12) VALUE 2 COMP-3.
05 RANDOM-NUMBER-BUILT PIC 9(12) COMP.
05 FIRST-PART PIC 9(12) COMP.
05 WORKING-NUMBER PIC 9(12) COMP.
05 LOW-RANGE PIC 9(12) VALUE ZERO.
05 HIGH-RANGE PIC 9(12) VALUE ZERO.
05 YOU-PROVIDE-SEED PIC X VALUE SPACE.
05 RUN-AGAIN PIC X VALUE SPACE.
05 PAUSE-FOR-A-SECOND PIC X VALUE SPACE.
01 SEED-TIME.
05 HOURS PIC 99.
05 MINUTES PIC 99.
05 SECONDS PIC 99.
05 MS PIC 99.
*
* LINKAGE SECTION.
* Not used during testing
01 RANDGEN-AREA.
05 COUNT-OF-ACCESSES PIC 9(12) VALUE ZERO.
05 RANDOM-NUMBERS-NEEDED PIC 9(12) VALUE ZERO.
05 RANDOM-NUMBER PIC 9(12) VALUE ZERO.
05 RANDOM-MSG PIC X(60) VALUE SPACE.
*
* PROCEDURE DIVISION USING RANDGEN-AREA.
* Not used during testing
*
PROCEDURE DIVISION.
100-RANDGEN-EDIT-HOUSEKEEPING.
MOVE SPACE TO RANDOM-MSG.
IF RANDOM-NUMBERS-NEEDED = ZERO
DISPLAY 'RANDOM-NUMBERS-NEEDED ' NO ADVANCING
ACCEPT RANDOM-NUMBERS-NEEDED.
IF RANDOM-NUMBERS-NEEDED NOT NUMERIC
MOVE 'RANDOM-NUMBERS-NEEDED NOT NUMERIC' TO RANDOM-MSG
GO TO 900-EXIT-RANDGEN.
IF RANDOM-NUMBERS-NEEDED = ZERO
MOVE 'RANDOM-NUMBERS-NEEDED = ZERO' TO RANDOM-MSG
GO TO 900-EXIT-RANDGEN.
IF COUNT-OF-ACCESSES NOT NUMERIC
MOVE 'COUNT-OF-ACCESSES NOT NUMERIC' TO RANDOM-MSG
GO TO 900-EXIT-RANDGEN.
IF COUNT-OF-ACCESSES GREATER THAN RANDOM-NUMBERS-NEEDED
MOVE 'COUNT-OF-ACCESSES > RANDOM-NUMBERS-NEEDED'
TO RANDOM-MSG
GO TO 900-EXIT-RANDGEN.
IF YOU-PROVIDE-SEED = SPACE AND RANDOM-NUMBER = ZERO
DISPLAY 'DO YOU WANT TO PROVIDE SEED Y OR N: '
NO ADVANCING
ACCEPT YOU-PROVIDE-SEED.
IF RANDOM-NUMBER = ZERO AND
(YOU-PROVIDE-SEED = 'Y' OR 'y')
DISPLAY 'ENTER SEED ' NO ADVANCING
ACCEPT RANDOM-NUMBER.
IF RANDOM-NUMBER NOT NUMERIC
MOVE 'RANDOM-NUMBER NOT NUMERIC' TO RANDOM-MSG
GO TO 900-EXIT-RANDGEN.
200-RANDGEN-DATA-HOUSEKEEPING.
MOVE FUNCTION CURRENT-DATE (9:8) TO SEED-TIME.
IF COUNT-OF-ACCESSES = ZERO
COMPUTE LOW-RANGE =
((SECONDS * HOURS * MINUTES * MS) / 3).
COMPUTE RANDOM-NUMBER-BUILT = RANDOM-NUMBER + LOW-RANGE.
COMPUTE HIGH-RANGE = RANDOM-NUMBERS-NEEDED + LOW-RANGE.
MOVE X2-POWER TO 2X2.
300-SET-2X2-DIVISOR.
IF 2X2 < (HIGH-RANGE + 1)
COMPUTE 2X2 = 2X2 * X2-POWER
GO TO 300-SET-2X2-DIVISOR.
* *********************************************************
* IF FIRST TIME THROUGH AND YOU WANT TO BUILD A SEED. *
* *********************************************************
IF COUNT-OF-ACCESSES = ZERO AND RANDOM-NUMBER = ZERO
COMPUTE RANDOM-NUMBER-BUILT =
((SECONDS * HOURS * MINUTES * MS) + HIGH-RANGE).
IF COUNT-OF-ACCESSES = ZERO
DISPLAY 'SEED TIME ' SEED-TIME
' RANDOM-NUMBER-BUILT ' RANDOM-NUMBER-BUILT
' LOW-RANGE ' LOW-RANGE.
* *********************************************
* END OF BUILDING A SEED IF YOU WANTED TO *
* *********************************************
* ***************************************************
* THIS PROCESS IS WHERE THE RANDOM-NUMBER IS BUILT *
* ***************************************************
400-RANDGEN-FORMULA.
COMPUTE FIRST-PART = (5 * RANDOM-NUMBER-BUILT) + 7.
DIVIDE FIRST-PART BY 2X2 GIVING WORKING-NUMBER
REMAINDER RANDOM-NUMBER-BUILT.
IF RANDOM-NUMBER-BUILT > LOW-RANGE AND
RANDOM-NUMBER-BUILT < (HIGH-RANGE + 1)
GO TO 600-RANDGEN-CLEANUP.
GO TO 400-RANDGEN-FORMULA.
* *********************************************
* GOOD RANDOM NUMBER HAS BEEN BUILT *
* *********************************************
600-RANDGEN-CLEANUP.
ADD 1 TO COUNT-OF-ACCESSES.
COMPUTE RANDOM-NUMBER =
RANDOM-NUMBER-BUILT - LOW-RANGE.
* *******************************************************
* THE NEXT 3 LINES OF CODE ARE FOR TESTING ON CONSOLE *
* *******************************************************
DISPLAY RANDOM-NUMBER.
IF COUNT-OF-ACCESSES < RANDOM-NUMBERS-NEEDED
GO TO 100-RANDGEN-EDIT-HOUSEKEEPING.
900-EXIT-RANDGEN.
IF RANDOM-MSG NOT = SPACE
DISPLAY 'RANDOM-MSG: ' RANDOM-MSG.
MOVE ZERO TO COUNT-OF-ACCESSES RANDOM-NUMBERS-NEEDED RANDOM-NUMBER.
MOVE SPACE TO YOU-PROVIDE-SEED RUN-AGAIN.
DISPLAY 'RUN AGAIN Y OR N '
NO ADVANCING.
ACCEPT RUN-AGAIN.
IF (RUN-AGAIN = 'Y' OR 'y')
GO TO 100-RANDGEN-EDIT-HOUSEKEEPING.
ACCEPT PAUSE-FOR-A-SECOND.
GOBACK.
Let's say you want to go over shuffled lists again and again, without the O(n) delay of reshuffling each time you start over. In that case we can do this:
Create 2 lists, A and B, with the numbers 0 to 1000; this takes 2n space.
Shuffle list A using Fisher-Yates; this takes n time.
When drawing a number, perform a 1-step Fisher-Yates shuffle on the other list.
When the cursor is at the list end, switch to the other list.
Preprocess:
    cursor = 0
    selector = A
    other = B
    shuffle(A)

Draw:
    temp = selector[cursor]
    swap(other[cursor], other[random index in cursor..N])
    if cursor == N
        then swap(selector, other); cursor = 0
        else cursor = cursor + 1
    return temp
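A JavaScript sketch of this scheme with 0-based indexing (the incremental step is the forward variant of Fisher-Yates; names are mine):

function shuffle(arr) { // ordinary Fisher-Yates, paid once up front
    for (let i = arr.length - 1; i > 0; i--) {
        const j = Math.floor(Math.random() * (i + 1));
        [arr[i], arr[j]] = [arr[j], arr[i]];
    }
}

function makeCycler(n) {
    let selector = Array.from({ length: n }, (_, i) => i);
    let other = Array.from({ length: n }, (_, i) => i);
    shuffle(selector);
    let cursor = 0;
    return function draw() {
        const value = selector[cursor];
        // One forward Fisher-Yates step on the other list:
        const j = cursor + Math.floor(Math.random() * (n - cursor));
        [other[cursor], other[j]] = [other[j], other[cursor]];
        if (cursor === n - 1) {
            [selector, other] = [other, selector]; // other is now fully shuffled
            cursor = 0;
        } else {
            cursor++;
        }
        return value;
    };
}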
Another possibility:
You can use an array of flags, and take the next number when the drawn one has already been chosen.
But beware: after 1000 calls the function will never end, so you must add a safeguard.
Most of the answers here fail to guarantee that they won't return the same number twice. Here's a correct solution:
#include <stdlib.h> /* for abort() */

int nrrand(void) {
    static int s = 1;
    static int start = -1;
    do {
        s = (s * 1103515245 + 12345) & 1023;
    } while (s >= 1001);

    if (start < 0) start = s;
    else if (s == start) abort();

    return s;
}
I'm not sure that the constraint is well specified. One assumes that after 1000 other outputs a value is allowed to repeat, but that naively allows 0 to follow immediately after 0 so long as they both appear at the end and start of sets of 1000. Conversely, while it's possible to keep a distance of 1000 other values between repetitions, doing so forces a situation where the sequence replays itself in exactly the same way every time because there's no other value that has occurred outside of that limit.
Here's a method that always guarantees at least 500 other values before a value can be repeated:
#include <stdlib.h> /* for rand() */

int nrrand(void) {
    static int h[1001];
    static int n = -1;

    if (n < 0) {
        int s = 1;
        for (int i = 0; i < 1001; i++) {
            do {
                s = (s * 1103515245 + 12345) & 1023;
            } while (s >= 1001);
            /* If we used `i` rather than `s` then our early results would be
               poorly distributed. */
            h[i] = s;
        }
        n = 0;
    }

    int i = rand() % 500;
    if (i != 0) {
        i = (n + i) % 1001;
        int t = h[i];
        h[i] = h[n];
        h[n] = t;
    }
    i = h[n];
    n = (n + 1) % 1001;
    return i;
}
When N is greater than 1000 and you need to draw K random samples you could use a set that contains the samples so far. For each draw you use rejection sampling, which will be an "almost" O(1) operation, so the total running time is nearly O(K) with O(N) storage.
This algorithm runs into collisions when K is "near" N. This means that running time will be a lot worse than O(K). A simple fix is to reverse the logic so that, for K > N/2, you keep a record of all the samples that have not been drawn yet. Each draw removes a sample from the rejection set.
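A quick sketch of the set-based rejection sampling just described, for K well below N (the helper name is mine):

function sampleK(n, k) { // k distinct values from 0..n-1
    const picked = new Set();
    while (picked.size < k) {
        picked.add(Math.floor(Math.random() * n)); // rejected duplicates are no-ops
    }
    return [...picked];
}

console.log(sampleK(1000000, 5));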
The other obvious problem with rejection sampling is that it is O(N) storage, which is bad news if N is in the billions or more. However, there is an algorithm that solves that problem, called Vitter's algorithm after its inventor. The algorithm is described here. The gist of Vitter's algorithm is that after each draw, you compute a random skip using a certain distribution, which guarantees uniform sampling.
Fisher-Yates:
    for i from n−1 downto 1 do
        j ← random integer such that 0 ≤ j ≤ i
        exchange a[j] and a[i]
It actually takes only n−1 swaps, since the last two elements need just one exchange between them.
This is C#
public static List<int> FisherYates(int n)
{
    List<int> list = new List<int>(Enumerable.Range(0, n));
    Random rand = new Random();
    int swap;
    int temp;

    for (int i = n - 1; i > 0; i--)
    {
        swap = rand.Next(i + 1); // .NET's rand.Next(x) upper bound is exclusive
        if (swap != i) // it can stay in place - if you force a move it is not a uniform shuffle
        {
            temp = list[i];
            list[i] = list[swap];
            list[swap] = temp;
        }
    }
    return list;
}
Please see my answer at https://stackoverflow.com/a/46807110/8794687
It is one of the simplest algorithms, with average time complexity O(s log s), where s denotes the sample size. There are also some links there to hash table algorithms whose complexity is claimed to be O(s).
Someone posted "creating random numbers in Excel". I am using that idea.
Create a structure with 2 parts, str.index and str.ran.
For 10 random numbers, create an array of 10 structures.
Set str.index from 0 to 9 and str.ran to different random numbers:
for (i = 0; i < 10; ++i) {
    arr[i].index = i;
    arr[i].ran = rand();
}
Sort the array on the values in arr[i].ran.
The str.index is now in a random order.
Below is C code:
#include <stdio.h>
#include <stdlib.h>

struct RanStr { int index; int ran; };
struct RanStr arr[10];

int sort_function(const void *a, const void *b);

int main(int argc, char *argv[])
{
    int i;

    // srand(125);
    for (i = 0; i < 10; ++i)
    {
        arr[i].ran = rand();
        arr[i].index = i;
        printf("arr[%d] Initial Order=%2d, random=%d\n", i, arr[i].index, arr[i].ran);
    }

    qsort((void *)arr, 10, sizeof(arr[0]), sort_function);

    printf("\n===================\n");
    for (i = 0; i < 10; ++i)
    {
        printf("arr[%d] Random Order=%2d, random=%d\n", i, arr[i].index, arr[i].ran);
    }
    return 0;
}

int sort_function(const void *a, const void *b)
{
    struct RanStr *a1, *b1;
    a1 = (struct RanStr *) a;
    b1 = (struct RanStr *) b;
    return (a1->ran - b1->ran);
}
I have found a small algorithm that determines whether a number is a power of 2, but no explanation of how it works. What really happens?
var potence = n => n && !(n & (n - 1));

for (var i = 2; i <= 16; ++i) {
    if (potence(i)) console.log(i + " is potence of 2");
}
I'll explain how it works for non-negative n. The first condition in n && !(n & (n - 1)) simply checks that n is not zero. If n is not zero, then it has a least significant 1-bit at some position p. Now, if you subtract 1 from n, all bits below position p change to 1, and the bit at p flips to 0.
Something like this:
n:   1010100010100111110010101000000
n-1: 1010100010100111110010100111111
                             ^ position p
Now, if you & these two bit patterns, all the bits above position p remain unchanged, and everything below (and including) p is zeroed out:
after &: 1010100010100111110010100000000
                                 ^ position p
If the result after taking & happens to be zero, then it means that there was nothing above position p, thus the number must have been 2^p, which looked like this:
n:       0000000000000000000000001000000
n - 1:   0000000000000000000000000111111
n&(n-1): 0000000000000000000000000000000
                                 ^ position p
thus n is a power of 2. If the result of & is not zero (as in the first example), then it means that there was some junk in the more significant bits after the p-th position, and therefore n is not a power of 2.
I'm too lazy to play this through for the two's-complement representation of negative numbers.
If a number is a power of 2, its binary representation must be 10...0. Subtracting 1 turns that leading 1 into 0 (and the trailing 0s into 1s), so n & (n-1) is 0. Otherwise, it is not a power of 2.
Kinka's answer is essentially correct, but perhaps needs a bit more detail on the "otherwise" case. If the number isn't a power of two, then it must have the form n = 2^a + 2^b + y, where a > b and y < 2^b. Then n - 1 is still at least 2^a, so both n and n-1 have the 2^a bit set, (n & (n-1)) is at least 2^a, and therefore non-zero.
var ShortURL = new function() {
    var _alphabet = '23456789bcdfghjkmnpqrstvwxyzBCDFGHJKLMNPQRSTVWXYZ-_',
        _base = _alphabet.length;

    this.encode = function(num) {
        var str = '';
        while (num > 0) {
            str = _alphabet.charAt(num % _base) + str;
            num = Math.floor(num / _base);
        }
        return str;
    };

    this.decode = function(str) {
        var num = 0;
        for (var i = 0; i < str.length; i++) {
            num = num * _base + _alphabet.indexOf(str.charAt(i));
        }
        return num;
    };
};
I understand encode works by converting from decimal to custom base (custom alphabet/numbers in this case)
I am not quite sure how decode works.
Why do we multiply the current number by the base and then add the position number of the alphabet? I know that to convert 010 base 2 to decimal, we would do
(0 * 2^2) + (1 * 2^1) + (0 * 2^0) = 2
Not sure how that is represented in the decode algorithm.
EDIT:
My own decode version
this.decode2 = function (str) {
    var result = 0;
    var position = str.length - 1;
    var value;
    for (var i = 0; i < str.length; i++) {
        value = _alphabet.indexOf(str[i]);
        result += value * Math.pow(_base, position--);
    }
    return result;
}
This is how I wrote my own decode version (just the way I would convert it on paper). I would like someone to explain in more detail how the first version of decode works. I still don't get why we multiply num by the base and start num at 0.
OK, so what does 376 mean as a base-10 output of your encode() function? It means:
1 * 100 +
5 * 10 +
4 * 1
Why? Because in encode(), you divide by the base on every iteration. That means that, implicitly, the characters pushed onto the string on the earlier iterations gain in significance by a factor of the base each time through the loop.
The decode() function, therefore, multiplies by the base each time it sees a new character. That way, the first digit is multiplied by the base once for every digit position past the first that it represents, and so on for the rest of the digits.
Note that in the explanation above, the 1, 5, and 4 come from the positions of the characters 3, 7, and 6 in the "alphabet" list. That's how your encoding/decoding mechanism works. If you feed your decode() function a numeric string encoded by something trying to produce normal base-10 numbers, then of course you'll get a weird result; that's probably obvious.
edit To further elaborate on the decode() function: forget (for now) about the special base and encoding alphabet. The process is basically the same regardless of the base involved. So, let's look at a function that interprets a base-10 string of numeric digits as a number:
function decode10(str) {
    var num = 0, zero = '0'.charCodeAt(0);
    for (var i = 0; i < str.length; ++i) {
        num = (num * 10) + (str.charCodeAt(i) - zero);
    }
    return num;
}
The accumulator variable num is initialized to 0 first, because before examining any characters of the input numeric string the only value that makes sense to start with is 0.
The function then iterates through each character of the input string from left to right. On each iteration, the accumulator is multiplied by the base, and the digit value at the current string position is added.
If the input string is "214", then, the iteration will proceed as follows:
num is set to 0
First iteration: str[i] is 2, so (num * 10) + 2 is 2
Second iteration: str[i] is 1, so (num * 10) + 1 is 21
Third iteration: str[i] is 4, so (num * 10) + 4 is 214
The successive multiplications by 10 achieve what the call to Math.pow() does in your code. Note that 2 is multiplied by 10 twice, which effectively multiplies it by 100.
The decode() routine in your original code does the same thing, only instead of a simple character code computation to get the numeric value of a digit, it performs a lookup in the alphabet string.
Both the original and your own version of the decode function achieve the same thing, but the original version does it more efficiently.
In the following assignment:
num = num * _base + _alphabet.indexOf(str.charAt(i));
... there are two parts:
_alphabet.indexOf(str.charAt(i))
The indexOf returns the value of a digit in base _base. You have this part in your own algorithm, so that should be clear.
num * _base
This multiplies the so-far accumulated result. The rest of my answer is about that part:
In the first iteration this has no effect, as num is still 0 at that point. But at the end of the first iteration, num contains the value as if the str only had its left most character. It is the base-51 digit value of the left most digit.
From the next iteration onwards, the result is multiplied by the base, which makes room for the next value to be added to it. It functions like a digit shift.
Take this example input to decode:
bd35
The individual characters represent the values 8, 10, 1 and 3. As there are 51 characters in the alphabet, we're in base 51. So bd35 represents the value:
8*51³ + 10*51² + 1*51 + 3
Here is a table with the value of num after each iteration:
8
8*51 + 10
8*51² + 10*51 + 1
8*51³ + 10*51² + 1*51 + 3
Just to make the visualisation cleaner, let's put the power of 51 in a column header, and remove that from the rows:
    3    2    1    0
--------------------
                   8
              8   10
         8   10    1
    8   10    1    3
Note how the 8 shifts to the left at each iteration and gets multiplied with the base (51). The same happens with 10, as soon as it is shifted in from the right, and the same with the 1, and 3, although that is the last one and doesn't shift any more.
The multiplication num * _base represents thus a shift of base-digits to the left, making room for a new digit to shift in from the right (through simple addition).
At the last iteration all digits have shifted in their correct position, i.e. they have been multiplied by the base just enough times.
Putting your own algorithm in the same scheme, you'd have this table:
    3    2    1    0
--------------------
    8
    8   10
    8   10    1
    8   10    1    3
Here, there is no shifting: the digits are immediately put in the right position, i.e. they are multiplied with the correct power of 51 immediately.
You ask
I would like to understand how the decode function works from logical perspective. Why are we using num * base and starting with num = 0.
and write that
I am not quite sure how decode works. Why do we multiply base by a current number and then add the position number of the alphabet? I know that to convert 010 base 2 to decimal, we would do
(0 * 2^2) + (1 * 2^1) + (0 * 2^0) = 2
The decode function uses an approach to base conversion known as Horner's rule, used because it is computationally efficient:
start with a variable set to 0, num = 0
multiply the variable num by the base
take the value of the most significant digit (the leftmost digit) and add it to num,
repeat step 2 and 3 for as long as there are digits left to convert,
the variable num now contains the converted value (in base 10)
Using an example of a hexadecimal number A5D:
start with a variable set to 0, num = 0
multiply by the base (16), num is now still 0
take the value of the most significant digit (the A has a digit value of 10) and add it to num, num is now 10
repeat step 2, multiply the variable num by the base (16), num is now 160
repeat step 3, add the hexadecimal digit 5 to num, num is now 165
repeat step 2, multiply the variable num by the base (16), num is now 2640
repeat step 3, add the hexadecimal digit D to num (add 13)
there are no digits left to convert, the variable num now contains the converted value (in base 10), which is 2653
Compare the expression of the standard approach:
(10 × 16^2) + (5 × 16^1) + (13 × 16^0) = 2653
to the use of Horner's rule:
(((10 × 16) + 5) × 16) + 13 = 2653
which is exactly the same computation, but rearranged in a form making it easier to compute. This is how the decode function works.
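The same computation, written out as a small loop over the digit values of A5D (this snippet is mine, for illustration):

const digits = [10, 5, 13]; // the hex digits A, 5, D
let num = 0;
for (const d of digits) {
    num = num * 16 + d; // multiply by the base, then add the next digit
}
console.log(num); // 2653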
Why are we using num * base and starting with num = 0?
The conversion algorithm needs a start value, therefore num is set to 0. For each repetition (each loop iteration), num is multiplied by the base. This has no effect in the first iteration, since num is still 0 at that point, but writing it this way makes the conversion fit a simple for loop.
I'm trying to get a handle on the interaction between different types of typed arrays.
Example 1.
var buf = new ArrayBuffer(2);
var arr8 = new Int8Array(buf);
var arr8u = new Uint8Array(buf);
arr8[0] = 7.2;
arr8[1] = -45.3;
console.log(arr8[0]); // 7
console.log(arr8[1]); // -45
console.log(arr8u[0]); // 7
console.log(arr8u[1]); // 211
I have no problem with the first three readouts, but where does 211 come from in the last one? Does this have something to do with bit-shifting because of the minus sign?
Example 2
var buf = new ArrayBuffer(4);
var arr8 = new Int8Array(buf);
var arr32 = new Int32Array(buf);
for (var i = 0; i < buf.byteLength; i++) {
    arr8[i] = i;
}
console.log(arr8); // [0, 1, 2, 3]
console.log(arr32); // [50462976]
So where does the 50462976 come from?
Example #1
Examine positive 45 as a binary number:
> (45).toString(2)
"101101"
Binary values are negated using a two's complement:
00101101 => 45 signed 8-bit value
11010011 => -45 signed 8-bit value
When we read 11010011 as an unsigned 8-bit value, it comes out to 211:
> parseInt("11010011", 2);
211
Example #2
If you print 50462976 in base 2:
> (50462976).toString(2);
"11000000100000000100000000"
We can add leading zeros and rewrite this as:
00000011000000100000000100000000
And we can break it into octets:
00000011 00000010 00000001 00000000
This shows binary 3, 2, 1, and 0. Typed arrays use the platform's native byte order, which on your machine is little-endian: the byte at the lowest index is the least significant, so the 8-bit values 0 to 3 are read in order of increasing significance when constructing the 32-bit value.
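If you want to see the byte order explicitly, DataView (a standard companion API to typed arrays) lets you choose the endianness per read:

var buf = new ArrayBuffer(4);
var bytes = new Uint8Array(buf);
bytes.set([0, 1, 2, 3]);
var view = new DataView(buf);
console.log(view.getInt32(0, true));  // 50462976 (little-endian, matches Int32Array here)
console.log(view.getInt32(0, false)); // 66051    (big-endian)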
First question.
Signed 8-bit integers range from -128 to 127. The positive part (0 to 127) maps to binary values from 00000000 to 01111111, and the other half (-128 to -1) maps to 10000000 through 11111111.
If the first bit is 1, the number is negative, and the remaining 7 bits give the offset from -128. In your case, the binary representation is 11010011. The first bit is 1, so the number is negative. The last 7 bits are 1010011, which gives us the value 83. Add it to the boundary: -128 + 83 = -45. That's it.
Second question.
32-bit integers are represented by four bytes in memory. You are storing four 8-bit integers in the buffer. When they are read through the Int32Array, those values are combined to form one value.
If this were the decimal system, you could think of it as combining "1" and "2" to give "12". It's similar in this case, except the multipliers are different. For the first value it's 2^24, then 2^16, then 2^8 and finally 2^0. Let's do the math:
2^24 * 3 + 2^16 * 2 + 2^8 * 1 + 2^0 * 0 =
16777216 * 3 + 65536 * 2 + 256 * 1 + 1 * 0 =
50331648 + 131072 + 256 + 0 =
50462976
That’s why you’re seeing such a large number.
Is there a way to correctly multiply two 32 bit integers in Javascript?
When I try this from C using long long I get this:
printf("0x%llx * %d = %llx\n", 0x4d98ee96ULL, 1812433253,
0x4d98ee96ULL * 1812433253);
==> 0x4d98ee96 * 1812433253 = 20becd7b431e672e
But from Javascript the result is different:
x = 0x4d98ee97 * 1812433253;
print("0x4d98ee97 * 1812433253 = " + x.toString(16));
==> 0x4d98ee97 * 1812433253 = 20becd7baf25f000
The trailing zeros lead me to suspect that Javascript has an oddly limited integer resolution somewhere between 32 and 64 bits.
Is there a way to get a correct answer? (I'm using Mozilla js-1.8.5 on x86_64 Fedora 15 in case that matters.)
This seems to do what I wanted without an external dependency:
function multiply_uint32(a, b) {
    var ah = (a >> 16) & 0xffff, al = a & 0xffff;
    var bh = (b >> 16) & 0xffff, bl = b & 0xffff;
    var high = ((ah * bl) + (al * bh)) & 0xffff;
    return ((high << 16) >>> 0) + (al * bl);
}
This performs a 32-bit multiply modulo 2^32, which is the correct bottom half of the computation. A similar function could be used to calculate a correct top half and store it in a separate integer (ah * bh seems right), but I don't happen to need that.
Note the zero-shift. Without that the function generates negative values whenever the high bit is set.
From a forum post:
there's no need to make numbers small, it only matters to keep the number of significant digits below 53
function mult32s(n, m) //signed version
{
    n |= 0;
    m |= 0;
    var nlo = n & 0xffff;
    var nhi = n - nlo;
    return ((nhi * m | 0) + (nlo * m)) | 0;
}

function mult32u(n, m) //unsigned version
{
    n >>>= 0;
    m >>>= 0;
    var nlo = n & 0xffff;
    var nhi = n - nlo;
    return ((nhi * m >>> 0) + (nlo * m)) >>> 0;
}
Both | and >>> operators cause the result to be converted to 32-bit integer. In the first case it is converted to a signed integer, in the second case it is converted to an unsigned integer.
In the line of multiplication the first | / >>> operator causes the 64-bit intermediate result with 48-bit significand (in the form 0x NNNN NNNN NNNN 0000) to drop its higher bits, so the intermediate result is in the form 0x NNNN 0000.
The second | / >>> operator causes the result of second-multiplication-and-addition to be limited to 32 bits.
In case one of the multiplicands is a constant you can simplify the multiplication further:
function mult32s_with_constant(m) //signed version
{
    m |= 0;
    //var n = 0x12345678;
    var nlo = 0x00005678;
    var nhi = 0x12340000;
    return ((nhi * m | 0) + (nlo * m)) | 0;
}
Or, if you know that the result is going to be less than 53 bits, then you can do just:
function mult32s(n, m) //signed version
{
    n |= 0;
    m |= 0;
    return (n * m) | 0;
}
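For what it's worth, modern engines also provide Math.imul, which performs exactly this signed 32-bit multiplication, so it agrees with the two-part mult32s above and can serve as a cross-check:

var a = 0x4d98ee96 | 0, b = 1812433253;
console.log((Math.imul(a, b) >>> 0).toString(16)); // "431e672e", the low half of the C example's result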
You'll likely need to make use of a third-party Javascript library to handle large-number precision.
For example, BigInt.js: http://www.leemon.com/crypto/BigInt.js
You are correct. JavaScript numbers are stored as double-precision floats, which lose precision on large integers.
In JavaScript, it is the case that 10000000000000001 % 2 == 0.
A friend also mentions that 10000000000000001 == 10000000000000000, and that this is indeed due to the spec (though ints are used for optimization, the spec still requires float-like behavior).
Once you're in this territory, you're already past the 53 bits of exact integer precision that a double provides.
GWT has an emulation of the Java (64-bit) signed long integer type. I made a JavaScript interface for it; here's a demo. Using the default numbers, you can see that the value is the same as the one you'd get in C.
The hex value in the "emulation" column should correspond with what you see in your debugger, but there may be problems with the hex representation since I'm using native JavaScript to produce it. It could be done in GWT too, of course, which would probably make it more correct. If the JavaScript Number can represent all the string representations that GWT generates, the hex representation should be correct too. Look in the source for usage. The reason the functions ({sub,mod,div,mul}ss) take strings is that I don't know how to make a GWT Long object from JavaScript.