[R] [plyr] Question regarding ddply: use of .(as.name(varname)) and varname in ddply function

jim holtman jholtman at gmail.com
Mon Dec 6 12:05:47 CET 2010


Here is another approach to try:

> require(data.table)
> var <- "g10"
> df <- data.table(df)
> str(df)
Classes ‘data.table’ and 'data.frame':  6 obs. of  5 variables:
 $ g10: int  1 1 1 10 10 10
 $ l1 : num  0.41 0.607 0.64 -1.478 -1.482 ...
 $ d1 : num  0.918 0.959 0.773 0.474 0.591 ...
 $ l13: num  0.08037 -0.29174 -0.00191 0.29589 0.61538 ...
 $ d13: num  -1.408 -1.275 -1.412 0.709 0.276 ...
> df[,list(min=min(d1), max = max(d1)), by = eval(var)]
     g10        min       max
[1,]   1 0.77292857 0.9592568
[2,]  10 0.04486293 0.5905809
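
As for why case 1 fails, a likely explanation (my reading of plyr, not something confirmed in this thread) is that .() stores its argument unevaluated, so .(as.name(var)) keeps the call as.name(var) rather than the symbol g10. When that call is later evaluated against the data frame, the result is a symbol instead of a column. The sketch below mimics this in base R with quote(), which is an assumed stand-in for .(); the d1 values are rounded from the sample data, and the plyr call at the end only runs if plyr is installed.

```r
# Toy data: d1 values rounded from the sample data in this thread.
df <- data.frame(g10 = c(1L, 1L, 1L, 10L, 10L, 10L),
                 d1  = c(0.918, 0.959, 0.773, 0.474, 0.591, 0.045))
var <- "g10"

# Assumption: .() keeps its argument unevaluated, much like quote().
captured <- quote(as.name(var))

# Evaluating the stored call against the data yields the symbol g10,
# not the contents of the g10 column.
res <- eval(captured, df)
is.name(res)

# Building the symbol first and then evaluating it does reach the column.
col <- eval(as.name(var), df)
identical(col, df$g10)

# With plyr installed, the character name can be passed directly (case 2)
# or converted with as.quoted().
if (requireNamespace("plyr", quietly = TRUE)) {
  print(plyr::ddply(df, plyr::as.quoted(var),
                    function(x) c(min = min(x$d1), max = max(x$d1))))
}
```

Passing the string itself, as in case 2, is probably the simplest choice when the column name arrives as a function argument.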


On Mon, Dec 6, 2010 at 4:58 AM, Sunny Srivastava
<research.baba at gmail.com> wrote:
> Dear R-Helpers:
>
> I am trying to use *ddply* to extract min and max of a particular
> column in a data.frame. I am using two different forms of the function:
>
>
> ## var_name_to_split is a string -- something like "var1" -- which is the
> ## name of a column in the data.frame
>
> ## fails with an error - case 1
> ddply(df, .(as.name(var_name_to_split)), function(x) c(min(x[ , 3]), max(x[ , 3])))
> ## works fine - case 2
> ddply(df, var_name_to_split, function(x) c(min(x[ , 3]), max(x[ , 3])))
>
> I can't understand why I get the error in case 1. Can someone help me
> please?
>
> Thank you in advance.
>
> S.
>
> ----------
>
> Here is the reproducible code:
>
> https://gist.github.com/730069
>
> Here is sample data:
>
> structure(list(g10 = c(1L, 1L, 1L, 10L, 10L, 10L), l1 =
> c(0.410077661080032,
> 0.607497980054711, 0.640488621149069, -1.47837849145189, -1.48199933642397,
> -1.42815840788069), d1 = c(0.917769870675383, 0.959256755797054,
> 0.772928570498006, 0.473545787883884, 0.590580940273922, 0.0448629265021484
> ), l13 = c(0.0803696045647364, -0.291741079837731, -0.00191015929550312,
> 0.295889063381279, 0.615383505686296, 0.71991154637985), d13 =
> c(-1.40821713632015,
> -1.27501365601403, -1.41150703235157, 0.708943640186729, 0.276034890463749,
> 0.663383934998686)), .Names = c("g10", "l1", "d1", "l13", "d13"
> ), row.names = c(1L, 2L, 3L, 1758L, 1759L, 1760L), class = "data.frame")
>
>
> -----------
> If someone doesn't want to open github, here is the code:
>
> ## Doesn't work
>
> # grp -- name of a column of the data.frame df
> # function call is -- getMinMax1( df1 , grp = "var1")
>
> getMinMax1 <-function(df, grp){
>      ## I am using as.name(grp), source of error
>      dfret <- ddply( df , .(as.name(grp)),
>            function(x){
>                minmax <- c(min(x[ , 3]), max(x[ ,3]))
>                return(minmax)
>            }
>            )
>      return(dfret)
>  }
>
> ## Works fine
> # grp -- name of a column of the data.frame df
> # function call is -- getMinMax2( df1 , grp = "var1")
>
> getMinMax2 <-function(df, grp){
>      ## using the quoted variable name passed to grp when the fun is called
>      dfret <- ddply( df , grp,
>            function(x){
>                minmax <- c(min(x[ , 3]), max(x[ ,3]))
>                return(minmax)
>            }
>            )
>      return(dfret)
>  }
>



-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?


