[R] bivariate normal distribution in likelihood

Prof Brian Ripley ripley at stats.ox.ac.uk
Tue May 1 09:09:34 CEST 2007


On Tue, 1 May 2007, Robert A LaBudde wrote:

> At 11:32 PM 4/30/2007, Deepankar wrote:
>> Hi all,
>>
>> I am trying to do a maximum likelihood estimation. In my likelihood
>> function, I have to evaluate a double integral of the bivariate
>> normal density over a subset of the points of the plane. For
>> instance, if I denote by y the y co-ordinate and by x, the x
>> co-ordinate then the area over which I have to integrate is the
>> following set of points, A:
>> A = {x>=0 & y < 3x+10}
>>
>> I have used the following code to calculate this double integral:
>>
>> x <- rmvnorm(100000, mean=me, sigma=sig)
>> c <- nrow(x)
>> x1 <- x[ ,1]
>> x2 <- x[ ,2]
>> e1 <- as.numeric(x2 < 3*x1 + 10 & x1>0)
>> p1 <- sum(e1)/c
>>
>> In this code, I provide the mean and covariance while drawing the
>> bivariate random normal variables and get "p1" as the required
>> answer. The problem is that I have to draw at least 100,000
>> bivariate random normals to get a reasonable answer; even then it is
>> not very accurate.
>>
>> Is there some other way to do the same thing more accurately and
>> more efficiently? For instance, can this be done using the bivariate
>> normal distribution function "pmvnorm"? Also feel free to point our
>> errors if you see one.
>
> Simple random sampling is a poor way to evaluate an integral
> (expectation). It converges on the order of 1/sqrt(N).

Which is no worse than other general schemes in high dimensions or 
without smoothness assumptions on the integrand.

> Stratified random sampling would be better, as it converges on the
> order of 1/N.

Your reference for this result, please.  (As stated it is untrue, so I 
presume the stratification scheme depends on N and there are smoothness 
assumptions.)  (BTW, the reference for the results I quote is 'Stochastic 
Simulation' (1987).)

We have not been told 'me' and 'sig', and depending on their values it is 
quite possible that importance sampling would do a great deal better than 
sampling from the specified bivariate normal.

> Even better is product Gauss-Hermite quadrature which will give a
> very accurate answer with a few dozen points.

This is a correlated bivariate normal, and product quadrature methods can 
be arbitrarily bad for such integrals (as people find out for mixed linear 
models).

What you can do is transform to a pair of uncorrelated normals, and for a 
set of the form A as given this transforms to a similar form.  And for 
that an iterated integral can be done easily as the inner integral over y 
will just be a call to pnorm.

Specifically, there is another normal z such that x and z are independent 
and y = z + a*x for some a.  Then A = {x > 0 & z < (3-x)*x + 10} can be 
integrated over z conditional on x and then over x.


> ================================================================
> Robert A. LaBudde, PhD, PAS, Dpl. ACAFS  e-mail: ral at lcfltd.com
> Least Cost Formulations, Ltd.            URL: http://lcfltd.com/
> 824 Timberlake Drive                     Tel: 757-467-0954
> Virginia Beach, VA 23464-3239            Fax: 757-467-2947
>
> "Vere scire est per causas scire"
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



More information about the R-help mailing list