[R] Data manipulation with aggregate

jim holtman jholtman at gmail.com
Tue Jul 3 23:05:23 CEST 2012


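Try this: split the data frame by Name, summarise each piece into a one-row
data frame, and rbind the pieces back together.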

> myData = data.frame(Name = c('a', 'a', 'b', 'b'), length = c(1,2,3,4), type
+ = c('x','x','y','z'))
>
> result <- do.call(rbind, lapply(split(myData, myData$Name), function(.name){
+ data.frame(Name = .name$Name[1L]
+ , length = mean(.name$length)
+ , type = if (all(.name$type[1L] == .name$type)) .name$type[1L] else NA
+ )
+ })
+ )
> result
  Name length type
a    a    1.5    x
b    b    3.5 <NA>
>
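
Since the subject line asks about aggregate(), here is a hedged sketch of the same result using aggregate() itself. It summarises each column with its own function and then joins the two summaries on Name. The helper unique_or_na() and the names means, types and result2 are made up for this illustration and are not part of the reply above. The sketch assumes 'type' may be either a factor or a character vector, so it converts to character first.

```r
# Hypothetical helper (not from the original reply): return the single value
# shared by every element of x, or NA when the group holds several values.
unique_or_na <- function(x) {
  x <- as.character(x)  # treat factor and character columns alike
  if (length(unique(x)) == 1L) x[1L] else NA_character_
}

myData <- data.frame(Name = c('a', 'a', 'b', 'b'),
                     length = c(1, 2, 3, 4),
                     type = c('x', 'x', 'y', 'z'))

# One aggregate() call per column, each with its own summary function.
means <- aggregate(myData["length"], by = myData["Name"], FUN = mean)
types <- aggregate(myData["type"], by = myData["Name"], FUN = unique_or_na)

# Join the two per-group summaries on the grouping column.
result2 <- merge(means, types, by = "Name")
```

The trade-off against the split/lapply version is one pass over the data per column rather than one per group, which tends to read more simply when only a few columns need different summaries.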



On Tue, Jul 3, 2012 at 12:04 PM, Filoche <pmassicotte at hotmail.com> wrote:
> Hi everyone.
>
> I have these data:
>
> myData = data.frame(Name = c('a', 'a', 'b', 'b'), length = c(1,2,3,4), type
> = c('x','x','y','z'))
>
> which gives me:
>
>   Name length type
> 1    a      1    x
> 2    a      2    x
> 3    b      3    y
> 4    b      4    z
>
> I would like to aggregate this data frame by taking group means, using
> 'Name' as the grouping factor. However, I have a field ('type') which is a
> string. I would like to keep the unique value of this field when possible
> (i.e. when all the 'type' values within a group are the same) and replace
> it with NA when 'type' has multiple values.
>
> In fact, I would like to obtain this:
>
>   Name length type
> 1    a    1.5    x
> 2    b    3.5   NA
>
> For instance, I was using this command:
>
> aggregate(list(myData$length, myData$type), list(myData$Name), FUN = mean)
>
> But it can't deal with string data.
>
> I hope I have been clear enough.
>
> With regards,
> Phil



-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.


