[R] problem in applying function in data subset (with a level) - using plyr or other alternative are also welcome

David Winsemius dwinsemius at comcast.net
Sat Sep 3 06:28:38 CEST 2011


On Sep 2, 2011, at 11:18 PM, Maya Joshi wrote:

> Dear R experts.
>
> I might be missing something obvious. I have been trying to fix this
> problem for some weeks. Please help.
>
> #data
> ped <- c(rep(1, 4), rep(2, 3), rep(3, 3))
> y <- rnorm(10, 8, 2)
>
> # variable set 1
> M1a <- sample (c(1, 2,3), 10, replace= T)
> M1b <- sample (c(1, 2,3), 10, replace= T)
> M1aP1 <- sample (c(1, 2,3), 10, replace= T)
> M1bP2 <- sample (c(1, 2,3), 10, replace= T)
>
> # variable set 2
> M2a <- sample (c(1, 2,3), 10, replace= T)
> M2b <- sample (c(1, 2,3), 10, replace= T)
> M2aP1 <- sample (c(1, 2,3), 10, replace= T)
> M2bP2 <- sample (c(1, 2,3), 10, replace= T)
>
> # variable set 3
> M3a <- sample (c(1, 2,3), 10, replace= T)
> M3b <- sample (c(1, 2,3), 10, replace= T)
> M3aP1 <- sample (c(1, 2,3), 10, replace= T)
> M3bP2 <- sample (c(1, 2,3), 10, replace= T)
>
> mydf <- data.frame (ped, M1a,M1b,M1aP1,M1bP2, M2a,M2b,M2aP1,M2bP2,
> M3a,M3b,M3aP1,M3bP2, y)
>
> # functions and further calculations
>
> mmat <- matrix(c("M1a","M2a","M3a","M1b","M2b","M3b","M1aP1","M2aP1","M3aP1",
> "M1bP2","M2bP2","M3bP2"), ncol = 4)
>
> # first function
> myfun <- function(x) {
> x<- as.vector(x)
> ot1 <- ifelse(mydf[x[1]] == mydf[x[3]], 1, -1)

You really ought to explain what you are trying to do. This code will
compare two lists. The list from mydf[x[2]] will be the column
mydf["M2b"], which will have as its first element the ten-element
vector assigned above. I am guessing that was not what you wanted.
Notice that this simple case, comparing two lists of the kind that the
"[" function returns, throws an error:

 > list(M2b = c(1,2,3)) == list(M2a = c(1,2,3))
Error in list(M2b = c(1, 2, 3)) == list(M2a = c(1, 2, 3)) :
   comparison of these types is not implemented

So ... now that your code has failed to explain what you wanted, why
don't you try some natural-language explanations?

Notice that the "[[" function is generally what people want when  
extracting from lists:

 > list(M2b = c(1,2,3))[["M2b"]] == list(M2a = c(1,2,3))[["M2a"]]
[1] TRUE TRUE TRUE
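
To make the "[" versus "[[" distinction concrete on a data frame, here is a minimal sketch. The two-column data frame df is a made-up stand-in for mydf (only the column names are borrowed from the post), and I am assuming that an element-wise comparison of two columns is what was intended:

```r
# Toy data frame standing in for mydf (values are made up for illustration)
df <- data.frame(M1a = c(1, 2, 3), M1aP1 = c(1, 3, 3))

single <- df["M1a"]     # "[" keeps the data.frame (list) wrapper
double <- df[["M1a"]]   # "[[" returns the underlying numeric vector

is.data.frame(single)   # TRUE
is.numeric(double)      # TRUE

# Element-wise comparison of two columns extracted with "[["
ot1 <- ifelse(df[["M1a"]] == df[["M1aP1"]], 1, -1)
ot1                     # 1 -1  1
```

Inside a function like myfun, the same idea would read mydf[[x[1]]] == mydf[[x[3]]], so that ifelse() receives a plain logical vector.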


> ot2 <- ifelse(mydf[x[2]] == mydf[x[4]], 1, -1)
> qt <- ot1 + ot2
> return(qt)
> }
> qt <- apply(mmat, 1, myfun)
> ydv <- c((y - mean(y))^2)
> qtd <- data.frame(ped, ydv, qt)
>
> # second function
> myfun2 <- function(dataframe) {
> vydv <- sum(ydv)*0.25
> sumD <- sum(ydv * qt)
> Rt <- vydv / sumD
> return(Rt)
> }
>
> # using plyr
> require(plyr)
> dfsumd1 <- ddply(mydf,.(mydf$ped),myfun2)
>
> Here are 2 issues:
> (1) The output just one, I need the output for all three set of
> variables (as listed above)

An incredibly vague description.

>
> (2)  all three values of dfsumd is returning to same for all level
> of ped: 1,2, 3

I sympathize with those forced to adopt the English language, but that
is the standard this decade. So given your apparent difficulties, you
need to exert more effort at making explicit what is _supposed_ to be
returned.
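
One thing the code itself does show: myfun2 never uses its argument (dataframe) and instead reads the global ydv and qt, so every level of ped gets the value computed from the whole data set. A hedged sketch of a per-level version follows. It assumes that a separate Rt for each ped is the goal, uses a single made-up qt column as a stand-in for the result of the first function, and the name rt_by_piece is hypothetical:

```r
set.seed(1)  # arbitrary seed so the sketch is reproducible
ped <- c(rep(1, 4), rep(2, 3), rep(3, 3))
y   <- rnorm(10, 8, 2)
qt  <- sample(c(-2, 0, 2), 10, replace = TRUE)  # stand-in for one qt column
qtd <- data.frame(ped, ydv = (y - mean(y))^2, qt)

# Unlike myfun2, this version reads ydv and qt from the piece it is handed
rt_by_piece <- function(d) {
  (sum(d$ydv) * 0.25) / sum(d$ydv * d$qt)
}

# Base-R split/apply: one value per level of ped
rt <- sapply(split(qtd, qtd$ped), rt_by_piece)
rt
```

With plyr loaded, something along the lines of ddply(qtd, .(ped), rt_by_piece) should give the same numbers. Note the grouping variable is written .(ped) rather than .(mydf$ped).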

> Means that the function is applied to whole dataset but only
> replicated in output !!!
>
> I tried with plyr not being lazy but due to my limited R knowledge,
> If you have a different suggestion, you are welcome too.
>
> Thank you in advance...
>
> Maya
>

David Winsemius, MD
West Hartford, CT


