[R] Help with apply

Doran, Harold HDoran at air.org
Mon Oct 4 20:22:40 CEST 2010


Ahhhh, I see that you wrapped apply around mapply. I was toying with both; but didn't think to use mapply inside apply. As always, thank you, Phil

-----Original Message-----
From: Phil Spector [mailto:spector at stat.berkeley.edu] 
Sent: Monday, October 04, 2010 2:20 PM
To: Doran, Harold
Cc: Gabriela Cendoya; R-help
Subject: Re: [R] Help with apply

Harold -
    The first way that comes to mind is

doit = function(i,j)exp(sum(log((exp(tmp[i,]*(qq$nodes[j]-b))) /
        (factorial(tmp[i,]) * exp(exp(qq$nodes[j]-b)))))) *
        ((1/(s*sqrt(2*pi)))  * exp(-((qq$nodes[j]-0)^2/(2*s^2))))/
        dnorm(qq$nodes[j]) * qq$weights[j]
t(outer(1:3,1:2,Vectorize(doit)))

but you said you wanted to use apply.  That leads to

doit1 = function(tmpi,nod,wt)
       exp(sum(log((exp(tmpi*(nod-b))) / (factorial(tmpi) *
       exp(exp(nod-b)))))) * ((1/(s*sqrt(2*pi)))  *
       exp(-((nod-0)^2/(2*s^2))))/dnorm(nod) * qq$weights[j]
apply(tmp,1,function(x)mapply(function(n,w)doit1(x,n,w),qq$nodes,qq$weights))

Both seem to agree with your rr1.

 					- Phil Spector
 					 Statistical Computing Facility
 					 Department of Statistics
 					 UC Berkeley
 					 spector at stat.berkeley.edu

On Mon, 4 Oct 2010, Doran, Harold wrote:

> Sorry about that; s <- 1
>
> -----Original Message-----
> From: Gabriela Cendoya [mailto:gabrielacendoya.rlist at gmail.com]
> Sent: Monday, October 04, 2010 1:44 PM
> To: Doran, Harold
> Cc: R-help
> Subject: Re: [R] Help with apply
>
> You are missing "s" in your definitions so I can't reproduce your code.
>
>> tmp <- data.frame(var1 = sample(c(0:10), 3, replace = TRUE), var2 = sample(c(0:10), 3, replace = TRUE), var3 = sample(c(0:10), 3, replace = TRUE))
>>
>> str(tmp)
> 'data.frame':   3 obs. of  3 variables:
> $ var1: int  9 3 9
> $ var2: int  4 6 2
> $ var3: int  2 9 3
>>
>> #I can run the following double loop and yield what I want in the end (rr1) as:
>>
>> library(statmod)
>> Q <- 2
>> b <- runif(3)
>> qq <- gauss.quad.prob(Q, dist = 'normal', mu = 0, sigma=1)
>> rr1 <- matrix(0, nrow = Q, ncol = nrow(tmp))
>> L <- nrow(tmp)
>> for(j in 1:Q){
> +     for(i in 1:L){
> +         rr1[j,i] <- exp(sum(log((exp(tmp[i,]*(qq$nodes[j]-b))) /
> (factorial(tmp[i,]) *
> +         exp(exp(qq$nodes[j]-b)))))) * ((1/(s*sqrt(2*pi)))  *
> exp(-((qq$nodes[j]-0)^2/(2*s^2))))/dnorm(qq$nodes[j]) * qq$weights[j]
> +                                                }
> +                                }
> Error: objeto 's' no encontrado
>> rr1
>     [,1] [,2] [,3]
> [1,]    0    0    0
> [2,]    0    0    0
>>
>
> Gabriela
>
>
> 2010/10/4, Doran, Harold <HDoran at air.org>:
>> Suppose I have the following data:
>>
>> tmp <- data.frame(var1 = sample(c(0:10), 3, replace = TRUE), var2 =
>> sample(c(0:10), 3, replace = TRUE), var3 = sample(c(0:10), 3, replace =
>> TRUE))
>>
>> I can run the following double loop and yield what I want in the end (rr1)
>> as:
>>
>> library(statmod)
>> Q <- 2
>> b <- runif(3)
>> qq <- gauss.quad.prob(Q, dist = 'normal', mu = 0, sigma=1)
>> rr1 <- matrix(0, nrow = Q, ncol = nrow(tmp))
>> L <- nrow(tmp)
>>                 for(j in 1:Q){
>>                                                 for(i in 1:L){
>>                                                                 rr1[j,i] <-
>> exp(sum(log((exp(tmp[i,]*(qq$nodes[j]-b))) / (factorial(tmp[i,]) *
>> exp(exp(qq$nodes[j]-b)))))) *
>>
>> ((1/(s*sqrt(2*pi)))  * exp(-((qq$nodes[j]-0)^2/(2*s^2))))/dnorm(qq$nodes[j])
>> * qq$weights[j]
>>                                                 }
>>                                 }
>> rr1
>>
>> But, I think this is so inefficient for large Q and nrow(tmp). The function
>> I am looping over is:
>>
>> fn <- function(x, nodes, weights, s, b) {
>>                 exp(sum(log((exp(x*(nodes-b))) / (factorial(x) *
>> exp(exp(nodes-b)))))) *
>>                 ((1/(s*sqrt(2*pi)))  *
>> exp(-((nodes-0)^2/(2*s^2))))/dnorm(nodes) * weights
>>                 }
>>
>> I've tried using apply in a few ways, but I can't replicate rr1 from the
>> double loop. I can go through each value of Q one step at a time and get
>> matching values in the rr1 table, but this would still require a loop and
>> that I think can be avoided.
>>
>> apply(tmp, 1, fn, nodes = qq$nodes[1], weights = qq$weights[1], s = 1, b =
>> b)
>>
>> Does anyone see an efficient way to compute rr1 without a loop?
>>
>> Harold
>>
>>> sessionInfo()
>> R version 2.10.1 (2009-12-14)
>> i386-pc-mingw32
>>
>> locale:
>> [1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United
>> States.1252    LC_MONETARY=English_United States.1252
>> [4] LC_NUMERIC=C                           LC_TIME=English_United
>> States.1252
>>
>> attached base packages:
>> [1] stats     graphics  grDevices utils     datasets  methods   base
>>
>> other attached packages:
>> [1] minqa_1.1.9     Rcpp_0.8.6      MiscPsycho_1.6  statmod_1.4.6
>> lattice_0.17-26 gdata_2.8.0
>>
>> loaded via a namespace (and not attached):
>> [1] grid_2.10.1  gtools_2.6.2 tools_2.10.1
>>
>> 	[[alternative HTML version deleted]]
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
>
> -- 
> _________________________
> Lic. María Gabriela Cendoya
> Magíster en Biometría
> Profesor Adjunto
> Facultad de Ciencias Agrarias
> UNMdP - Argentina
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



More information about the R-help mailing list