Given a five-dimensional space, I would like to generate 100 vectors, all with a fixed magnitude M, where the component values are randomly distributed.
I was originally thinking of starting with a unit vector and then applying a rotation matrix with random parameters for the 10 degrees of freedom ... Would this work? And how?
Any nice way of doing this in Javascript...?
cheers for any pointers!
Here is the Monte Carlo algorithm that I would use (I do not know Javascript well enough to code in it off the top of my head):
1. Generate random values in the range from -1 to 1 for each of the five dimensions.
2. Calculate the magnitude M; if M = 0 or M > 1, reject these values and return to step #1.
3. Normalize the vector to have a magnitude of 1 (divide each dimension by M).
That should give you random unit vectors evenly distributed over the surface of the 5-dimensional hypersphere.
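In JavaScript, those steps might look something like this (an untested sketch; the function name randomVector5D and the example magnitude are just illustrative):

function randomVector5D(M) {
  // Repeat until a candidate falls inside the unit 5-sphere.
  while (true) {
    // Step 1: five components, each uniform in [-1, 1)
    const v = Array.from({ length: 5 }, () => Math.random() * 2 - 1);
    // Step 2: magnitude of the candidate vector
    const mag = Math.sqrt(v.reduce((sum, x) => sum + x * x, 0));
    // Reject zero vectors and anything outside the unit 5-sphere
    if (mag === 0 || mag > 1) continue;
    // Step 3: normalize to magnitude 1, then scale to the desired magnitude M
    return v.map(x => (x / mag) * M);
  }
}

// Example: 100 vectors of magnitude M (example value)
const M = 2.5;
const vectors = Array.from({ length: 100 }, () => randomVector5D(M));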
The question has been asked: "Why reject the vector if M>1?"
Answer: So that the final vectors will be uniformly distributed across the surface of the unit 5-sphere.
Reasoning: What we are generating in the first step is a set of random vectors that are uniformly distributed within the volume of the unit 5-cube. Some of those vectors are also within the volume of the unit 5-sphere and some of them are outside of that volume. If normalized, the vectors within the 5-sphere are evenly distributed across its surface, however, the ones outside it are not at all evenly distributed.
Think about it like this: just as with a normal 3-dimensional Unit Cube and Unit Sphere, or even the Unit Square and the Unit Circle, the Unit 5-Sphere is wholly contained within the Unit 5-Cube, touching it only at the five positive unit axis points:
(1,0,0,0,0)
(0,1,0,0,0)
(0,0,1,0,0)
(0,0,0,1,0)
(0,0,0,0,1)
and their corresponding negative unit axis points. This is because these are the only points on the surface of the cube whose magnitude (distance from the origin) is 1; at every other point, the 5-cube's surface is farther than 1 from the origin.
And this means that there are many more points between (0,0,0,0,0) and (1,1,1,1,1) than there are between (0,0,0,0,0) and (1,0,0,0,0). In fact, about SQRT(5), or approximately 2.24, times more.
And that means that if you included all of the vectors in the unit 5-cube, you would end up with more than twice as many results "randomly" mapping to about (0.44,0.44,0.44,0.44,0.44) than to (1,0,0,0,0).
For those who are challenging (without foundation, IMHO) that this results in a uniform distribution across the surface of the 5-D Sphere, please see the alternative method in this Wikipedia article section: https://en.wikipedia.org/wiki/N-sphere#Uniformly_at_random_on_the_(n_%E2%88%92_1)-sphere
The problem with sampling from a unit hypercube in 5 dimensions and then re-scaling is that points in some directions (towards the corners of the hypercube) will be over-sampled.
But if you use a rejection scheme, then you lose too many samples. That is, the volume of a unit hypersphere (5-ball) in 5-d is pi^2*(8/15) = 5.26378901391432. Compare that to the volume of the hypercube in 5 dimensions that just contains the sphere. That hypercube has side 2 and volume 2^5 = 32. So if you reject points falling outside the sphere, you will reject
1 - 5.26378901391432/32 = 0.835506593315178
or roughly 83.5% of the points get rejected. That means you will need to sample roughly 6 points on average before you do find a sample that is inside the 5-sphere.
A far better idea is to sample each component from a standard normal distribution, then rescale the resulting vector to have unit norm. Since the multivariate normal distribution is spherically symmetric, there is no need for rejection at all.
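A sketch of that idea in JavaScript, using the Box-Muller transform for the normal samples (the function names are illustrative):

// Standard normal sample via the Box-Muller transform
function randomNormal() {
  let u = 0;
  while (u === 0) u = Math.random(); // avoid log(0)
  return Math.sqrt(-2 * Math.log(u)) * Math.cos(2 * Math.PI * Math.random());
}

// Random vector of magnitude M, uniformly distributed in direction
function randomVectorOnSphere(M, dims = 5) {
  const v = Array.from({ length: dims }, randomNormal);
  const norm = Math.sqrt(v.reduce((s, x) => s + x * x, 0));
  return v.map(x => (x / norm) * M); // norm is zero only with vanishing probability
}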
Here are some approaches; they are for unit vectors, but you can just multiply by M:
http://burtleburtle.net/bob/rand/unitvec.html
I'd recommend assigning random numbers between -1 and +1 for each element. Once all elements for a vector have been assigned, then you should normalize the vector. To normalize, simply divide each element by the magnitude of the overall vector. Once you've done that, you've got a random vector with a magnitude of 1. If you want to scale the vectors to magnitude M, you just need to multiply each element by M.
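As a sketch of that recipe (no rejection step; the function name is illustrative):

// Uniform components in [-1, 1), normalized and then scaled to magnitude M
function randomScaledVector(M, dims = 5) {
  const v = Array.from({ length: dims }, () => Math.random() * 2 - 1);
  const mag = Math.sqrt(v.reduce((s, x) => s + x * x, 0));
  return v.map(x => (x / mag) * M);
}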
Picking without rejections (6 times faster)
From Mathworld and a link posted by Bitwise:
If you choose n numbers with a normal distribution, treat those as coordinates, and normalize the resulting vector into a unit vector, this chooses a uniformly distributed unit vector in n dimensions. - Burtleburtle (ht Bitwise)
As Bitwise points out, that should be multiplied by the desired magnitude.
Note that there is no rejection step here; the distributions themselves have dealt with the necessary bias towards zero. Since the normal distribution and a semi-circle are not the same shape, I wonder if RBarryYoung's answer really gives a uniform distribution on the surface of the sphere or not.
Regarding uniform hypercube picking with hypersphere rejection (RBarryYoung's answer)
The Mathworld article includes a description of using 4 random numbers picked from uniform distributions to calculate random vectors on the surface of a 3d sphere. It performs rejection on the 4d vector of random numbers of anything outside the unit hypersphere, as RBarryYoung does, but it differs in that it (a) uses an extra random number and (b) performs a non-linear transform on the numbers to extract the 3d unit vector.
To me, that implies that uniform distributions on each axis with hypersphere rejection will not achieve a uniform distribution over the surface of the sphere.
In my situation I need to compare the lengths of 2 Bezier curves. I do not need to compute the actual length of either curve. I merely want to cheaply compare which of the 2 is longer. My assumptions for this method are as follows:
Both Bezier curves to compare have the same dimension (number of control points)
The dimension of the curves could be any number greater than 2
I need to output which of the 2 curves is longer (either one if they are equal)
My original thought was to just add up the distances between control points, i.e.:
distance(p0, p1) + distance(p1, p2) + distance(p2, p3)...
And it seems to work decently for lower-order Bezier curves. However, I'm sure that this would not scale well to higher-order curves.
I ended up with a solution that adds the distances between consecutive points sampled on the curve (basically taking the point's index divided by the number of control points and using that value as T), and it seems to work on some higher-dimension curves, roughly as in the sketch below.
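For reference, that sampling idea looks roughly like this (a sketch only; the helper names are illustrative, and more samples give a better comparison):

// Evaluate a Bezier curve at parameter t using De Casteljau's algorithm.
// Control points are arrays of coordinates, e.g. [x, y] or [x, y, z].
function bezierPoint(controlPoints, t) {
  let pts = controlPoints.map(p => p.slice());
  while (pts.length > 1) {
    const next = [];
    for (let i = 0; i < pts.length - 1; i++) {
      next.push(pts[i].map((c, d) => c + (pts[i + 1][d] - c) * t));
    }
    pts = next;
  }
  return pts[0];
}

function distance(a, b) {
  return Math.sqrt(a.reduce((s, c, d) => s + (c - b[d]) * (c - b[d]), 0));
}

// Approximate length by summing distances between points sampled at evenly spaced t values.
function approxLength(controlPoints, samples = controlPoints.length) {
  let len = 0;
  let prev = bezierPoint(controlPoints, 0);
  for (let i = 1; i < samples; i++) {
    const p = bezierPoint(controlPoints, i / (samples - 1));
    len += distance(prev, p);
    prev = p;
  }
  return len;
}

// Returns the longer of the two curves (the first one if they compare equal).
function longerCurve(curveA, curveB) {
  return approxLength(curveA) >= approxLength(curveB) ? curveA : curveB;
}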
I can't imagine I am the first person to want to do this, so to reiterate: does anyone know the right way to do this?
The algorithm should find the best point to meet, such that the total distance travelled by all the people is minimum.
To elaborate -
Consider the line below: an x-axis with each person at a different point relative to position 0 (imagine an x-y axis). Each number denotes that person's distance from position 0.
|
---30-------15-----10--------5----0----6-----------20-----------40-----50--
|
Now come up with an algorithm to find the point where everyone has to travel to and meet, such that the total distance travelled is minimum.
Note - I thought of finding the median/average; it does not always work.
How about choosing the point nearest to position 0? Again, not always.
Any ideas guys?
Assuming that all positions are in one dimension (i.e. along one axis only), the optimal solution is the median of all positions.
The median is defined such that half the values are larger than, and half are smaller than, the median. If elements in the sample data increase arithmetically, when placed in some order, then the median and arithmetic average are equal. For example, consider the data sample {1,2,3,4}. The average is 2.5, as is the median. However, when we consider a sample that cannot be arranged so as to increase arithmetically, such as {1,2,4,8,16}, the median and arithmetic average can differ significantly. In this case, the arithmetic average is 6.2 and the median is 4. In general, the average value can vary significantly from most values in the sample, and can be larger or smaller than most of them.
Source: https://en.wikipedia.org/wiki/Arithmetic_mean
In the above example:
total distance with the average value (6.2) as the solution = 23.2
total distance with the median value (4) as the solution = 21, which of course is the lower distance, hence the optimal solution
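A small JavaScript sketch of this (function names are illustrative):

// Optimal meeting point in one dimension: the (lower) median position
function bestMeetingPoint(positions) {
  const sorted = [...positions].sort((a, b) => a - b);
  // For an even count, any point between the two middle values is optimal; pick the lower one.
  return sorted[Math.floor((sorted.length - 1) / 2)];
}

function totalDistance(positions, point) {
  return positions.reduce((sum, p) => sum + Math.abs(p - point), 0);
}

// Using the sample above:
const positions = [1, 2, 4, 8, 16];
console.log(bestMeetingPoint(positions));   // 4
console.log(totalDistance(positions, 4));   // 21
console.log(totalDistance(positions, 6.2)); // about 23.2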
Hi guys :) Could someone explain this code to me? I am trying to understand it, but I'm getting nowhere. Why this line of code?
Math.sqrt(x_dist*x_dist+y_dist*y_dist)/interval;
Isn't this sufficient?
x_dist+y_dist/interval;
I don't understand the concept of this code...
https://jsfiddle.net/vnodkumar1987/ER8qE/
The first example calculates the hypotenuse, and in so doing achieves an absolute velocity value of the mouse vector.
The second example will give a bad result unless both x_dist and y_dist are positive. In other words, if you were moving down-and-left or up-and-right, the second example would have a subtractive effect and would not represent the true overall velocity. In the case of moving up and left, the velocity would not only be proportionately incorrect (only useful for comparison purposes), but would also come out with a negative sign that you would have to account for. (I am assuming 0,0 represents the upper left of the mouse-able area and x_max,y_max the lower right.)
The Math.sqrt may not be necessary if you are just scaling proportionate velocity, but it certainly is if you want to know true pixels/interval. You would also have to take into account how big a variable container you are working with, but I'm sure it would all fit into a double... unless you were looking for extreme precision.
Imagine you travel in a straight line so that you end up at a point 3 miles West, and 4 miles South in exactly 1 hour. The velocity answer is not 3+4=7 miles per hour, nor is it -3+4=1 miles per hour. The correct answer of absolute velocity is the hypotenuse, which would be 5 mph. sqrt(west^2+south^2)
Example #1 would be the proper code. Example #2 could be roughly used if you can ignore the sign, and you needed the code to execute very quickly.
The velocity is distance_travelled/time_taken.
Say the pointer moves from (x1,y1) to (x2,y2). The distance travelled is not the sum of the x and y distances.
Summing up x and y assumes that the pointer went from (x1,y1) to (x2,y1) and then from (x2,y1) to (x2,y2), i.e. along the two legs of a right triangle. But what you need is the length of the straight line joining the two points.
The actual distance travelled is the hypotenuse d of that triangle. Using the Pythagorean theorem, d^2 = x_dist^2 + y_dist^2.
Which leaves you with the line of code you have in the question for the speed:
Math.sqrt(x_dist*x_dist+y_dist*y_dist)/interval;
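For context, here is a minimal sketch of how that line is typically wired up to mouse events (the event handling and variable names are illustrative, not taken from the jsfiddle):

let lastX = 0, lastY = 0, lastTime = performance.now();

document.addEventListener('mousemove', (e) => {
  const now = performance.now();
  const interval = now - lastTime;
  if (interval === 0) return; // avoid dividing by zero
  const x_dist = e.clientX - lastX;
  const y_dist = e.clientY - lastY;
  // Speed = length of the movement vector (the hypotenuse) divided by the time taken
  const speed = Math.sqrt(x_dist * x_dist + y_dist * y_dist) / interval; // pixels per ms
  lastX = e.clientX;
  lastY = e.clientY;
  lastTime = now;
  console.log(speed);
});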
You are making a Pythagorean triangle whose two legs are x_dist and y_dist, which are the distances the mouse moved along the X and Y axes each frame. What that line of code does is get the magnitude of the mouse's delta-position vector and divide it by some scalar value.
Also, note that sqrt(a^2 + b^2) does NOT equal a + b.
EDIT: Not velocity, but delta position.
In a software project I am working on (sensor simulation), I needed to generate normally distributed noise for simulated sensor signals. I used the central limit theorem: I generated 20 random numbers and averaged them to approximate the Gaussian distribution.
So I took the "measured" signal, generated 20 numbers from -noiseMax to +noiseMax, and averaged them. I added the result to the signal to obtain the noisy signal.
Now, for my university, I have to describe this Gaussian distribution by its mean and variance. OK, the mean will be 0, but I have absolutely no idea how to convert noiseMax in my program into the variance. Googling hasn't helped much.
I was not sure if SO is the right SE platform for this question. Sorry if it isn't.
OK, so the central limit theorem says that the average of a sufficiently large number of uniformly distributed variables will be approximately normal. In the statistics classes I have taken, 30 is usually used as the cutoff, so you might want to increase your simulation's "sample size".
However, you can find the Standard Deviation of your average as follows regardless of "sample size".
The standard deviation of your uniform variable is (b-a)/sqrt(12) = 2*noiseMax/sqrt(12) = noiseMax/sqrt(3).
Variances add when you add variables, so the standard deviation of the sum of n of these variables is sqrt(n*(noiseMax/sqrt(3))*(noiseMax/sqrt(3))) = noiseMax*sqrt(n/3).
Dividing by n to get the average gives a final standard deviation of noiseMax/sqrt(3*n). In your case (n = 20), sigma = noiseMax * 0.12909944487.
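As a JavaScript sketch (noiseMax and the count of 20 samples come from the question; the rest is illustrative):

// Average of 20 uniform samples in [-noiseMax, +noiseMax]: approximately normal,
// mean 0, standard deviation noiseMax / sqrt(3 * 20), i.e. about 0.129 * noiseMax
function approxGaussianNoise(noiseMax, n = 20) {
  let sum = 0;
  for (let i = 0; i < n; i++) {
    sum += (Math.random() * 2 - 1) * noiseMax;
  }
  return sum / n;
}

const noiseMax = 0.5;                       // example value
const sigma = noiseMax / Math.sqrt(3 * 20); // the variance is sigma * sigma
const noisy = signalValue => signalValue + approxGaussianNoise(noiseMax);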
From a theoretical point of view, this is known as the Irwin-Hall distribution.
The simplest way to produce N(0,1) is to sum 12 uniform random numbers and subtract 6; no scaling is needed.
In general, to see how variance is computed, take a look at
http://en.wikipedia.org/wiki/Irwin%E2%80%93Hall_distribution
I would also recommend to look at the Table of Numerical Values in the following article: http://en.wikipedia.org/wiki/68%E2%80%9395%E2%80%9399.7_rule.
For example, if one uses the sum of 12 uniform numbers (minus 6), then the minimum value would be at -6 (exactly -6*sigma) and the maximum value at +6 (exactly +6*sigma). Looking at the table, what would be the expected frequency outside that range? Answer: 1/506797346. Thus, roughly one out of half a billion events should land outside +-6 sigma, but an Irwin-Hall(12) RNG will never produce it. From that you can judge whether this is OK or not for your particular simulation.
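A one-function JavaScript sketch of the sum-of-12 approach described above:

// Approximately N(0,1): Irwin-Hall(12) shifted to mean 0 (variance 12 * 1/12 = 1)
function standardNormalApprox() {
  let sum = 0;
  for (let i = 0; i < 12; i++) sum += Math.random();
  return sum - 6; // values never leave [-6, 6], i.e. +/- 6 sigma
}

// Scale by the desired standard deviation to get sensor noise
const gaussianNoise = sigma => sigma * standardNormalApprox();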
I was wondering whether there was any advantage to clamping the angle passed to trigonometric functions between 0 and Math.PI * 2. I had a function which made heavy use of trigonometric functions, and someone in the project added this to the beginning:
angle %= Math.PI * 2;
Is there any advantage to this? Are the trigonometric functions faster if the angle passed is between those values? If so, shouldn't they clamp it themselves? Is there any other case where equivalent angles should be clamped?
The language is JavaScript, most likely to be run on V8 and SpiderMonkey.
Since most (on-die) algorithms for computing trigonometric functions use some variant of CORDIC, my bet is that those values are getting clamped within [0, Pi/2) anyway at the entry point of the trig function call.
That being said, if you have a way to keep the angles close to zero throughout the algorithm, it is probably wise to do it. Indeed, the value of sin(10^42) is pretty much undefined, since the granularity of doubles in the 10^42 range is around 10^26.
This means, for instance, that if you are adding angles and they can thereby get large in magnitude, then you should consider periodically clamping them. But it is unnecessary to clamp them just before the trigonometric function call.
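For instance, a small sketch of keeping an accumulated angle bounded (the variable and function names are illustrative):

const TWO_PI = Math.PI * 2;
let angle = 0;

// Accumulate rotation while keeping the angle in [0, 2*PI) so it never grows large
function rotateBy(delta) {
  angle = (angle + delta) % TWO_PI;
  if (angle < 0) angle += TWO_PI; // % can return negative values in JavaScript
}

// No extra reduction is needed right before the trig calls themselves
const dx = Math.cos(angle);
const dy = Math.sin(angle);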
An advantage of clamping angles to the range -pi/4 to pi/4 (use sine or cosine as appropriate) is that you can ensure that if the angles are computed using some approximation of pi, range reduction is performed using that same approximation. Such an approach will have two benefits: it will improve the accuracy of things like the sine of 180 degrees or the cosine of 90 degrees, and it will avoid having math libraries waste computational cycles in an effort to perform super-accurate range reduction by a "more precise" approximation of pi which doesn't match the one used in computing the angles.
Consider, for example, the sine of 2^48 * pi. The best double approximation of pi, times 2^48, is 884279719003555, which happens to also be the best double approximation of 2^48*pi. The actual value of 2^48*pi is 884279719003555.03447074. Mod-reducing the former value by the best double approximation of pi would yield zero, the sine of which equals the correct sine of 2^48*pi. Mod-reducing that same value by the true value of pi, however, would yield -0.03447074, the sine of which is -0.03446278.