[R] R 3.1.2 using a custom function in aggregate() function on Windows 7 OS 64bit

Jeff Newmiller jdnewmil at dcn.davis.CA.us
Thu Mar 5 17:55:07 CET 2015


I don't see your point. No matter which version of aggregate you use, FUN is applied to vectors. Those vectors may be columns in a data frame or not, but FUN is always given one vector at a time by aggregate.
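
A quick way to check this, offered as a sketch rather than a proof: hand the data.frame method a FUN that records what it receives. The helper name spy and the vector seen are made up for this illustration; dat is the test data from further down the thread.

```r
# Test data from the thread (built with data.frame(), not cbind()).
dat <- data.frame(a = c(2, 2, 1, 4, 2, 5, 2, 3, 4, 4),
                  x = 1:10,
                  g = c(1, 1, 2, 2, 3, 3, 4, 4, 5, 5))

seen <- logical(0)  # one entry per call: was the argument a data frame?

# Hypothetical helper: record the kind of object received, return its length.
spy <- function(v) {
  seen <<- c(seen, is.data.frame(v))
  length(v)
}

# Dispatches to aggregate.data.frame because the first argument is a data frame.
res <- aggregate(dat[, c("a", "x")], by = list(g = dat$g), FUN = spy)

any(seen)  # FALSE: FUN was only ever handed plain vectors
```

Each call to spy sees one group's worth of a single column, which is why a two-argument FUN has nothing to bind its second argument to.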
---------------------------------------------------------------------------
Jeff Newmiller                        The     .....       .....  Go Live...
DCN:<jdnewmil at dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
                                      Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
--------------------------------------------------------------------------- 
Sent from my phone. Please excuse my brevity.
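
For completeness, here is a minimal base-R sketch of the split-plus-sapply route mentioned in my earlier message quoted below. It reuses d_rule and dat from that message; the names pieces, answer and result are only illustrative, and the tie-breaking rule is left exactly as the original poster wrote it.

```r
# Rule from the thread: x at the unique maximum of a, else the group's minimum x.
d_rule <- function(DF) {
  i <- which(DF$a == max(DF$a))
  if (length(i) == 1) {
    DF[i, "x"]
  } else {
    min(DF[, "x"])
  }
}

dat <- data.frame(a = c(2, 2, 1, 4, 2, 5, 2, 3, 4, 4),
                  x = 1:10,
                  g = c(1, 1, 2, 2, 3, 3, 4, 4, 5, 5))

# split() yields one data frame per level of g, so d_rule sees a and x together.
pieces <- split(dat, dat$g)

# sapply() collapses the per-group scalars into a named vector.
answer <- sapply(pieces, d_rule)

result <- data.frame(g = as.numeric(names(answer)), answer = unname(answer))
result$answer  # 1 4 6 8 9
```

The point of splitting the data frame rather than a single column is that each piece keeps both a and x, which aggregate() cannot arrange.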

On March 5, 2015 8:12:39 AM PST, Bert Gunter <gunter.berton at gene.com> wrote:
>Sorry, Jeff. aggregate() is generic.
>
>From ?aggregate:
>
>"## S3 method for class 'data.frame'
>aggregate(x, by, FUN, ..., simplify = TRUE)"
>
>Cheers,
>Bert
>
>Bert Gunter
>Genentech Nonclinical Biostatistics
>(650) 467-7374
>
>"Data is not information. Information is not knowledge. And knowledge
>is certainly not wisdom."
>Clifford Stoll
>
>
>
>
>On Thu, Mar 5, 2015 at 7:54 AM, Jeff Newmiller
><jdnewmil at dcn.davis.ca.us> wrote:
>> The aggregate function applies FUN to vectors, not data frames. For
>> example, the default "mean" function accepts a vector such as a column
>> in a data frame and returns a scalar (well, a vector of length 1).
>> Aggregate then calls this function once for each piece of the column(s)
>> you give it. Your function wants two vectors, but aggregate does not
>> understand how to give two inputs.
>>
>> (In the future, please follow R-help mailing list guidelines and post
>> using plain text so your code does not get messed up.)
>>
>> You could use split to break your data frame into a list of data
>> frames, and then sapply to extract the results you are looking for. I
>> prefer to use the plyr or dplyr or data.table packages to do all this
>> for me.
>>
>> d_rule <- function( DF ) {
>>   i <- which( DF$a==max( DF$a ) )
>>   if ( length( i ) == 1 ){
>>     DF[ i, "x" ]
>>   } else {
>>     min( DF[ , "x" ] ) # did you mean min( DF$x[i] ) ?
>>   }
>> }
>>
>> dat <- data.frame( a=c(2,2,1,4,2,5,2,3,4,4)
>>     , x = c(1:10)
>>     , g = c(1,1,2,2,3,3,4,4,5,5)
>>     )
>> # note that cbind on vectors creates a matrix
>> # in a matrix all columns must be of the same type
>> # but data frames generally have a variety of types
>> # so don't use cbind when making a data frame
>>
>> library( dplyr )
>>
>> result <- dat %>% group_by( g ) %>% do( answer = d_rule( . ) ) %>%
>>   as.data.frame
>>
>>
>>
>> On March 4, 2015 2:02:06 PM PST, Typhenn Brichieri-Colombi via R-help
><r-help at r-project.org> wrote:
>>>Hello,
>>>
>>>I am trying to use the following custom function in an
>>>aggregate function, but cannot get R to recognize my data. I’ve read
>>>the help on function() and on aggregate() but am unable to solve my
>>>problem. How can I get R to recognize the data inputs for the custom
>>>function nested within aggregate()?
>>>
>>>My custom function is found below, as well as the error message I get
>>>when I run it on a test data set (I will be using this function on a
>>>much larger dataset (over 600,000 rows)).
>>>
>>>Thank you for your time and your help!
>>>
>>>
>>>
>>>d_rule<-function(a,x){
>>>  i<-which(a==max(a))
>>>  out<-ifelse(length(i)==1, x[i], min(x))
>>>  return(out)
>>>}
>>>
>>>a<-c(2,2,1,4,2,5,2,3,4,4)
>>>x<-c(1:10)
>>>g<-c(1,1,2,2,3,3,4,4,5,5)
>>>dat<-as.data.frame(cbind(x,g))
>>>
>>>test<-aggregate(dat, by=list(g), FUN=d_rule,dat$a, dat$x)
>>>Error in dat$x : $ operator is invalid for atomic vectors
>>>
>>>
>>>
>>>       [[alternative HTML version deleted]]
>>>
>>>______________________________________________
>>>R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>>https://stat.ethz.ch/mailman/listinfo/r-help
>>>PLEASE do read the posting guide
>>>http://www.R-project.org/posting-guide.html
>>>and provide commented, minimal, self-contained, reproducible code.
>>


