[R] random uniform sample of points on an ellipsoid (e.g. WG

Sun Feb 25 23:09:53 CET 2007

On 25-Feb-07 Ranjan Maitra wrote:
> "My" method is for the surface, not for the interior. The
> constraint d*X/|| \Gamma^{-1/2}X ||ensures the constraint, no?
> The uniformity is ensured by the density restricted to satisfy
> the constraint which makes it a constant.
> 
> Ranjan

OK -- I admit that I misread your notation in the first instance.
I now see that it generates points which lie on the surface of
the ellipsoid. Apologies!

However, your suggested procedure (reprodiced below) does not do
the required job of generating points which are uniformly
distributed over the surface/circumference of the ellipsoid/ellipse.

I have verified this empirically and mathematically for the ellipse
(see below).

> On Sat, 24 Feb 2007 22:49:25 -0000 (GMT) (Ted Harding)
> <ted.harding at nessie.mcc.ac.uk> wrote:
> 
>> On 24-Feb-07 Ranjan Maitra wrote:
>> > Hi,
>> > Sorry for being a late entrant to this thread, but let me see
>> > if I understand the problem.
>> > 
>> > The poster wants to sample from an ellipsoid. Let us call this
>> > ellipsoid X'\Gamma X - d^2= 0.
>> > 
>> > There is no loss in assuming that the center is zero, otherwise
>> > the same can be done.
>> > 
>> > Let us consider the case when Gamma = I first. 
>> > 
>> > Then, let X \sim N_p(0, I) (any radially symmetric distribution
>> > will do here), then d*X/||X|| is uniform on the sphere of
>> > radius d. 
>> > 
>> > How about imitating the same?
>> > 
>> > Let X \sim N_p(0, \Sigma), where \Sigma = \Gamma^{-1} then X
>> > restricted to X'\Gamma X = d^2 gives the required uniform
>> > density on the ellipsoid.
>> > 
>> > How do we get this easily? I don't think rejection works or
>> > is even necessary.
>> > 
>> > Isn't d*X / ||\Gamma^{1/2}X|| enough? Here \Gamma^{1/2} is
>> > the square root matrix of \Gamma.
>> > 
>> > Note that any distribution of the kind f(X'\Gamma X) would work,
>> > but the multivariate Gaussian is a convenient tool, for which
>> > two-lines of R code should be enough.

A. Verification that it does not generate uniformly distributed
   points along the circumference.

I am considering an ellipse, with vertical minor axis of length 2,
and horizontal minor axis of length 2*a = 10 (so a = 5).

Its equation is (x^2)/(a^2) + y^2 = 1 (so your d^2 = 1). Hence
(in your notation)

  Gamma = matrix(c(1/(a^2),0,0,1),nrow=2),

and

  Sigma = matrix(c(a^2,0,0,1), nrow=2).

The following R code generates the bivariate Z (I'm using Z instead
of your X in X ~ N(0,Sigma) because I want to discuss Z = (X,Y) later):

library(MASS)

a <- 5
Gamma    <- matrix(c(1(a^2),0,0,1),nrow=2)
Gamma0.5 <- matrix(c(1/a   ,0,0,1),nrow=2)
Sigma <- matrix(c(a^2,0,0,1),nrow=2) ## Same as Gamma^(-1)

plota <- function(u,v,a){
  print(plot(u,v,pch="+",xlim=c(-a,a),ylim=c(-a,a)))
}

N <- 20000
X <- mvrnorm(N,c(0,0),Sigma)

### Now I implement your "X restricted to X'*Gamma*X = d^2"
### leading up to your "d*X / ||\Gamma^{1/2}X||":

Y <- X %*% Gamma0.5
NormY <- sqrt(Y[,1]^2 + Y[,2]^2)
Z     <- cbind(X[,1]/NormY,X[,2]/NormY)

### And now it can be verified that these points indeed lie
### on the ellipsoid:

plota(Z[,1],Z[,2],a)
min((Z[,1]^2)/(a^2) + Z[,2]^2)
### [1] 1
max((Z[,1]^2)/(a^2) + Z[,2]^2)
### [1] 1

### Now, however, I dissect the circumference of the ellipsoid
### into segments of equal angular increment relative to the centre,
### count the numbers of the above points in each segment, and
### compute the resulting estimate of density in each segment
### by dividing the number by the length of the segment (computed
### assuming the segment is approximately a straight line):

M <- 500
t <- 2*pi*(1/M)*(0:M)
dN <- numeric(M)
ds <- numeric(M)

for(i in (1:M)){
  x0    <- a*cos(t[i]); x1 <- a*cos(t[i+1]);
  y0    <-   sin(t[i]); y1 <-   sin(t[i+1]);
  if(x1 < x0){tmp<-x1; x1 <- x0; x0 <- tmp}
  if(y1 < y0){tmp<-y1; y1 <- y0; y0 <- tmp}
  ds[i] <- sqrt((x1-x0)^2 + (y1-y0)^2);
  ix    <- ( (x0 < Z[,1])&(Z[,1] < x1) )&( (y0 < Z[,2])&(Z[,2] < y1) );
  dN[i] <- sum(ix)
}

X11()
plot(t[1:M],dN/ds,pch="+",ylim=c(0,max(dN/ds)))

### showing that there are very clear peaks at t [theta] = 0,
### pi/2, and again at pi == 0 -- i.e. the points are concentrated
### at the ends of the long axis (exactly as was the case with the
### method given on the URL that Roger Bivand suggested ("untested"!).
### The height of the peaks is 5 times the minimum, which is
### compatible with a uniform distribution on a circle which has
### been stretched sideways by a linear factor of 5 (see previously
### posted comments).

B. Theoretically wrong.

[NOTE: For the theoretical result used at the end, below, there is
 a useful note by Dave Rusin on the Web at

   http://www.math.niu.edu/~rusin/known-math/95/ellipse.rand

 I have made a PDF version of this, which is easier to read,
 if anyone would like a copy]

Suppose we generate Z = [X,Y] ~ N(0,Sigma) as above. Then

  "Gamma^{1/2}X" = [X/a, Y]

and

  "||\Gamma^{1/2}X||" = sqrt((X/a)^2 + Y^2) = K, say

and (since your d = 1)

  "d*X / ||\Gamma^{1/2}X||" = [X/K, Y/K]

Hence, if theta is the angular coordinate of [X,Y],

  tan(theta) = Y/X = (1/a)*Y/(X/a)

But now X/a, and Y, can be considered independent N(0,1) random
variables, and the angular coordinate, phi say, of [X/a,Y] will
be uniformaly distributed over [0, 1*pi].

**>Hence a*tan(theta) = tan(phi) where phi is uniformaly distributed
in your construction.<**

Now the NOTE given above derives a method of sampling an angle phi
from a non-uniform distribution such that X = a*cos(phi) and
Y = sin(phi) is uniformly distributed on the circumference of an
ellipse. For the present ellipse (X^2)/(a(^2) + Y^2 = 1, a direct
derivation is easy.

Let X = a*cos(phi), Y = sin(phi), so that (X^2(/(a^2) + Y^2 = 1.
Note that phi is not the angular coordinate of (X,Y), but it is
the angular coordiinate  of (X/a,Y), just like phi for your
method above.

For an element of length ds along the circumference,

  ds = sqrt(dx^2 + dy^2) = sqrt(a^2 * sin(phi)^2 + cos(phi)^2)*d.phi

     = sqrt(a^2 - (a^2 - 1)*cos(phi)^2)*d.phi

Since s is uniformly distributed, the probability density of phi
is given by

   f(phi) = sqrt( a^2 - (a^2 - 1)*cos(phi)^2 ) [****]

a non-uniform distribution, which is incompatible with the uniform
distribution of phi (see **>...<** above) in your method.

C. The flaw in your argument?

I think this arises when you say "The uniformity is ensured by the
density restricted to satisfy the constraint which makes it a constant".
Your construction does not restrict (condition) the point to lie
on the circumference: it projects the points from the interior
onto the circumference, which is a quite different matter.

D. How to proceed?

It is clear that this is not a straightforward thing to achieve.
Adopting the Dave Rusin method of sampling from the distribution
[***] of phi requires implementing a routine to geretae random
numbers from what one might call an "incomplete elliptic integral"
distribution. According to my search today of the R site, there
is as yet no explicit reference anywhere to "elliptic integral".
Something relevant may occur under some other name, of course;
but I have not located it.

And this is only for the ellipse! Next comes the ellpsoid ... !!

Best wishes to all,
Ted.

--------------------------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at manchester.ac.uk>
Fax-to-email: +44 (0)870 094 0861
Date: 25-Feb-07                                       Time: 22:06:13
------------------------------ XFMail ------------------------------