[R] Help in R

Charles C. Berry cberry at tajo.ucsd.edu
Wed Dec 16 03:49:36 CET 2009


On Tue, 15 Dec 2009, li li wrote:

> Thanks for the hint. I wrote the following:
>
>> n <- 10
>> a <- 1
>> b<- 2
>> set.seed(1)
>> y <- rnorm(n)
>>
>> xmat <- as.matrix( expand.grid( rep( list(0:1), n ) ))
>>
>> s <- numeric(2^(n))
>> for (i in 1:2^(n)){
> + s[i]<- beta(a+rowSums(xmat)[i], b+n-rowSums(xmat)[i])*
> + prod((1-xmat[i,])*dnorm(y,0,1)+xmat[i,]*dnorm(y,0,1))}
>> sum(s)
> [1] 3.015227e-06
>

This is different than my result. You do not really want

(1-xmat[i,])*dnorm(y,0,1)+xmat[i,]*dnorm(y,0,1) == dnorm(y,0,1)

Still you can vectorize the whole thing. I think I had it wrong in my 
earlier try. I think this is what you are after:

> sum(beta(a+rowSums(xmat), b+n-rowSums(xmat))*exp(colSums( 
+     (1-t(xmat)) * dnorm(y,0,1,log=T) + t(xmat) * dnorm(y,2,1,log=T))))
[1] 8.414984e-07


> There is still a problem though. when n is large, "expand.grid" seems not to
> work.
>
>> expand.grid( rep( list(0:1), 100 ) )
> Error in rep.int(rep.int(seq_len(nx), rep.int(rep.fac, nx)), orep) :
>  invalid 'times' value
> In addition: Warning message:
> In rep.int(rep.int(seq_len(nx), rep.int(rep.fac, nx)), orep) :
>  NAs introduced by coercion
>
>
> Is there a way to correct it?

I doubt it. But this is not really the fault of expand.grid()

You want to sum 2^100 elements.

Even if you had a computer fast enough to evaluated 2^20 = 1Meg elements 
per second, it would take

> print(2^100/2^20/60/60/24/365.25,digits=20)
[1] 38308547532595288
>

years for the computation to complete.

Astronomers have determined that the universe will cease to exist 
long before your computation would complete!

You need a different approach than direct enumeration of the sum.

HTH,

Chuck

> Thank you.
> 2009/12/15 Charles C. Berry <cberry at tajo.ucsd.edu>
>
>> On Tue, 15 Dec 2009, li li wrote:
>>
>> Hi:
>>>  Thank you for your reply. I realize that I did not state the problem
>>> clearly before.
>>> Here is the problem again.
>>>
>>>  Let X = (X1, X2, ..., Xn) and Y=(Y1,Y2,...,Yn).  Xi's can be 0 or 1.
>>> When Xi=1, Yi is distributed as N(2,1). When Xi=0, Yi is distributed as
>>> N(0,1).
>>> There are 2^n possible X values. For example, x=(0,0,...,0) , i.e. all xi
>>> is
>>> 0.
>>>
>>>  The summand is:
>>>
>>> beta(a+sum(xi), b+n-sum(xi) ) * [ (1-x1) * dnorm(y1, 0, 1) + x1 *
>>> dnorm(y1,
>>> 2, 1) ] *
>>> [(1-x2) * dnorm(y2, 0, 1) + x2 * dnorm(y2, 2, 1) ] * ...*  [ (1-xn) *
>>> dnorm(yn, 0, 1) + xn * dnorm(yn, 2, 1) ]
>>>
>>> where sum(xi)=x1 + x2 + ... + xn.
>>>
>>> For example, when x=(0,0,...,0) , i.e. all xi is 0. The summand is
>>>
>>> beta(a, b+n) * dnorm(y1, 0, 1)  *  dnorm(y2, 0, 1) * ...*  dnorm(yn, 0, 1)
>>>
>>>
>>> I want to take the sum of the summand over all possible X values. There
>>> are
>>> 2^n of them.
>>>
>>> I am wondering whether there is a function in R that can take care of this
>>> kind of sum.
>>>
>>>
>> A function that does exactly that? I do not know of one, but have not
>> searched very hard.
>>
>> If I understood what you want (and "commented, minimal, self-contained,
>> reproducible code" was requested and would have helped me), a simple one
>> line command would give the result.
>>
>> For
>>
>> set.seed(1)
>>> n <- 10
>>> y <- rnorm(n)
>>> x <- rbinom(10,1,.5)
>>>
>>> a <- 1
>>> b <- 2
>>>
>>
>>
>> The answer is given by:
>>
>>> # [ one line command omitted ]
>>>
>> [1] 1.298587e-08
>>
>>>
>>>
>> You will need to understand "vectorization" and "recycling rules" to do
>> this neatly in one line.
>>
>> It will help you to study the sections of the manuals that pertain to those
>> topics (and perhaps the mail ist archive) as well as to study the help and
>> examples for
>>
>>        ?dnorm
>>        ?beta
>>        ?expand.grid
>>        ?matrix
>>        ?rowSums
>>        ?as.matrix
>>
>>
>> And here is a helpful hint
>>
>>        xmat <- as.matrix( expand.grid( rep( list(0:1), n ) )
>>
>> will be useful.
>>
>> If you need more help, be sure to review the Posting Guide and run the
>> function help.request() to help you better frame your question.
>>
>> HTH,
>>
>> Chuck
>>
>> p.s. Is this homework? If so, ask your instructor for help first.
>>
>>
>>   Thank you
>>>
>>>
>>>
>>> 2009/12/15 Dennis Murphy <djmuser at gmail.com>
>>>
>>>   Hi:
>>>>
>>>> Your expression makes no sense in that I don't see where the summation
>>>> can
>>>> obtain 'over all 2^n possible values...' the way you wrote it. How about
>>>> this?
>>>>
>>>> Let x = (x1, x2, ..., xn). The summand can be then be expressed in R as
>>>>
>>>>  beta(a + sum(x), b + n - sum(x)) * ((1 - x) * dnorm(x, 0, 1) + x *
>>>> dnorm(x, 2, 1))   .
>>>>
>>>> The beta function term is a scalar,  the second expression is a vector.
>>>> Do you want the sum of the products of the vector elements? If so, it
>>>> should be as easy as
>>>>
>>>> beta(a + sum(x), b + n - sum(x)) * sum((1 - x) * dnorm(x, 0, 1) + x *
>>>> dnorm(x, 2, 1))
>>>>
>>>> If, on the other hand, you meant to write the summand as
>>>>
>>>> beta(a + x, b + 1 - x) * ((1 - x) * dnorm(x, 0, 1) + x * dnorm(x, 2, 1))
>>>>  ,
>>>>
>>>> then the sum is
>>>>
>>>> sum(beta(a + x, b + 1 - x) * ((1 - x) * dnorm(x, 0, 1) + x * dnorm(x, 2,
>>>> 1))
>>>>
>>>>
>>>> More below...HTH
>>>> Dennis
>>>>
>>>>  On Mon, Dec 14, 2009 at 6:38 PM, li li <hannah.hlx at gmail.com> wrote:
>>>>
>>>> Hello,
>>>>>  Can anyone give me some suggestion in term of calculating the sum
>>>>> below.
>>>>> Is there a function in R that can help doing it faster?
>>>>>
>>>>> x1, x2, ...xn where xi can be 0 or 1. I want to calculate the following:
>>>>>
>>>>> sum{ beta[a+sum(xi), b+n-sum(xi) ]* [ (1-x1)dnorm(0,1)+x1dnorm(2,1) ]*
>>>>>  [
>>>>> (1-x2)dnorm(0,1)+x2dnorm(2,1) ]* ...* [ (1-xn)dnorm(0,1)+xndnorm(2,1) ]
>>>>> }
>>>>>
>>>>>
>>>> The problem I have is that you're taking the product of the n normal
>>>> mixtures and then
>>>> you somehow want to sum over them; if
>>>> this is meant to be a likelihood function, then it seems that you would
>>>> want
>>>>
>>>>  beta(a+sum(x), b+n-sum(x)) * prod((1 - x) * dnorm(x, 0, 1) + x *
>>>> dnorm(x,
>>>> 2, 1))
>>>>
>>>> instead. As I said up front, it's not clear what you really want, so I
>>>> leave you with
>>>> multiple possibilities in the hope that one of them is what you need...
>>>>
>>>>
>>>>   The sum in the beginning is over all 2^n possible values for the
>>>>> vector
>>>>> x1,
>>>>> x2, ...xn .
>>>>>
>>>>>  Thank you very much!
>>>>>
>>>>>                                               Hannah
>>>>>
>>>>>       [[alternative HTML version deleted]]
>>>>>
>>>>> ______________________________________________
>>>>> R-help at r-project.org mailing list
>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>> PLEASE do read the posting guide
>>>>> http://www.R-project.org/posting-guide.html<http://www.r-project.org/posting-guide.html>
>>>>> <http://www.r-project.org/posting-guide.html>
>>>>>
>>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>>
>>>>>
>>>>
>>>>
>>>        [[alternative HTML version deleted]]
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html<http://www.r-project.org/posting-guide.html>
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>>
>> Charles C. Berry                            (858) 534-2098
>>                                            Dept of Family/Preventive
>> Medicine
>> E mailto:cberry at tajo.ucsd.edu               UC San Diego
>> http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901
>>
>>
>>
>

Charles C. Berry                            (858) 534-2098
                                             Dept of Family/Preventive Medicine
E mailto:cberry at tajo.ucsd.edu	            UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901




More information about the R-help mailing list