[R] R 3.1.2 using a custom function in aggregate() function on Windows 7 OS 64bit

Jeff Newmiller jdnewmil at dcn.davis.CA.us
Thu Mar 5 16:54:16 CET 2015


The aggregate function applies FUN to vectors, not data frames. For example, the usual "mean" function accepts a vector such as a column in a data frame and returns a scalar (well, a vector of length 1). aggregate() then calls this function once for each group's piece of each column you give it. Your function wants two vectors, but aggregate() has no way to hand two columns to a single call of FUN.
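
A small sketch of what FUN actually receives, using the same test data as the code further down. The anonymous checking function is only there for illustration and is not part of the original question:

```r
dat <- data.frame(a = c(2,2,1,4,2,5,2,3,4,4), x = 1:10,
                  g = c(1,1,2,2,3,3,4,4,5,5))

# Each call to FUN gets one group's piece of ONE column: an atomic vector,
# never a data frame, so FUN cannot look at a and x together.
seen <- aggregate(dat[c("a", "x")], by = list(g = dat$g),
                  FUN = function(v) is.atomic(v) && !is.data.frame(v))

# The usual single-vector case: mean of x within each group of g.
means <- aggregate(x ~ g, data = dat, FUN = mean)
```

Every entry of seen$a and seen$x should come back TRUE, which is why a function needing both columns at once does not fit aggregate().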

(In the future, please follow R-help mailing list guidelines and post using plain text so your code does not get messed up.)

You could use split to break your data frame into a list of data frames, and then sapply to extract the results you are looking for. I prefer to use the plyr or dplyr or data.table packages to do all this for me.

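d_rule <- function( DF ) {
  # rows where a reaches its maximum within this group
  i <- which( DF$a==max( DF$a ) )
  if ( length( i ) == 1 ){
    # unique maximum: return the matching x
    DF[ i, "x" ]
  } else {
    # tied maximum: fall back to the smallest x in the group
    min( DF[ , "x" ] ) # did you mean min( DF$x[i] ) ?
  }
}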

dat <- data.frame( a=c(2,2,1,4,2,5,2,3,4,4)
    , x = c(1:10)
    , g = c(1,1,2,2,3,3,4,4,5,5)
    )
# note that cbind on vectors creates a matrix
# in a matrix all columns must be of the same type
# but data frames generally have a variety of types
# so don't use cbind when making a data frame
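
To illustrate the cbind point, here is a minimal sketch with made-up vectors (x and g below are illustrative values, not the data from the question):

```r
x <- 1:3
g <- c("a", "b", "c")

m <- cbind(x, g)          # a matrix: every element is coerced to one type
is.matrix(m)
typeof(m)                 # "character", so x is no longer numeric

df <- data.frame(x, g)    # a data frame keeps each column's own type
is.integer(df$x)
```

With all-numeric inputs the coercion is harmless, but the result of cbind is still a matrix until it is converted, and any column left out of the cbind call is simply absent from the data frame.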

library( dplyr )

result <- dat %>% group_by( g ) %>% do( answer = d_rule( . ) ) %>% as.data.frame
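
For comparison, the base R route mentioned earlier (split, then sapply) might look like the sketch below. The helper d_rule_vec and the mapply call are assumptions added for illustration: they are one possible way to keep the two-vector form of the function and are not from the thread.

```r
dat <- data.frame(a = c(2,2,1,4,2,5,2,3,4,4), x = 1:10,
                  g = c(1,1,2,2,3,3,4,4,5,5))

# Same rule as d_rule above, repeated so this snippet runs on its own.
d_rule <- function(DF) {
  i <- which(DF$a == max(DF$a))
  if (length(i) == 1) DF[i, "x"] else min(DF[, "x"])
}

# split() gives one data frame per value of g; sapply() calls d_rule on each.
by_df <- sapply(split(dat, dat$g), d_rule)

# Hypothetical two-vector variant (d_rule_vec is not from the thread).
d_rule_vec <- function(a, x) {
  i <- which(a == max(a))
  if (length(i) == 1) x[i] else min(x)
}

# mapply() walks the per-group pieces of a and x in parallel.
by_vec <- mapply(d_rule_vec, split(dat$a, dat$g), split(dat$x, dat$g))
```

Both by_df and by_vec are vectors named by group, so they can be matched back to g by name.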

---------------------------------------------------------------------------
Jeff Newmiller                        The     .....       .....  Go Live...
DCN:<jdnewmil at dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
                                      Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
--------------------------------------------------------------------------- 
Sent from my phone. Please excuse my brevity.

On March 4, 2015 2:02:06 PM PST, Typhenn Brichieri-Colombi via R-help <r-help at r-project.org> wrote:
>Hello,
>
>I am trying to use the following custom function in an
>aggregate function, but cannot get R to recognize my data. I’ve read the
>help on function() and on aggregate() but am unable to solve my problem.
>How can I get R to recognize the data inputs for the custom function
>nested within aggregate()?
>
>My custom function is found below, as well as the error message I get
>when I run it on a test data set (I will be using this function on a
>much larger dataset (over 600,000 rows)).
>
>Thank you for your time and your help!
>
>
>d_rule<-function(a,x){
>i<-which(a==max(a))
>out<-ifelse(length(i)==1, x[i], min(x))
>return(out)
>}
>
>a<-c(2,2,1,4,2,5,2,3,4,4)
>x<-c(1:10)
>g<-c(1,1,2,2,3,3,4,4,5,5)
>dat<-as.data.frame(cbind(x,g))
>
>test<-aggregate(dat, by=list(g), FUN=d_rule,dat$a, dat$x)
>
>Error in dat$x : $ operator is invalid for atomic vectors
>
>	[[alternative HTML version deleted]]


