[R] more complex by with data.table???

Arunkumar Srinivasan aragorn168b at gmail.com
Sun Jun 21 20:55:10 CEST 2015


Ramiro,

`dt[, lapply(.SD, mean), by=name]` is the idiomatic way.
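
As a minimal sketch on the data from the original post (assuming the data.table package is installed): `.SD` holds the non-grouping columns of each group, so `lapply()` applies the function to every column within each group. Restricting the columns with `.SDcols` and storing the output in `result` are my additions here, to mirror the `columnNames` loop in the question.

```r
library(data.table)

# Data and function as given in the original question
dt <- data.table(name = c(rep("a", 5), rep("b", 6)),
                 var1 = 0:10, var2 = 20:30, var3 = 40:50)
myFunction <- function(x) { mean(x) }

# Columns to summarise (same vector as in the question)
columnNames <- c("var1", "var2", "var3")

# Apply myFunction to each listed column, once per group of `name`
result <- dt[, lapply(.SD, myFunction), by = name, .SDcols = columnNames]
print(result)
# Group "a": var1 = 2,   var2 = 22,   var3 = 42
# Group "b": var1 = 7.5, var2 = 27.5, var3 = 47.5
```

Without `.SDcols`, `.SD` contains every column except the `by` column, which gives the same result for this table.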

I suggest reading through the new HTML vignettes at
https://github.com/Rdatatable/data.table/wiki/Getting-started

Ista, thanks for linking to the new vignette.


On Wed, Jun 10, 2015 at 2:17 AM, Ista Zahn <istazahn at gmail.com> wrote:
> Hi Ramiro,
>
> There is a demonstration of this on the data.table wiki at
> https://rawgit.com/wiki/Rdatatable/data.table/vignettes/datatable-intro-vignette.html.
> You can do
>
> dt[, lapply(.SD, mean), by=name]
>
> or
>
> dt[, as.list(colMeans(.SD)), by=name]
>
> BTW, there are pretty straightforward ways to do this in base R as well, e.g.,
>
> data.frame(t(sapply(split(df[-1], df$name), colMeans)))
>
> Best,
> Ista
>
> On Tue, Jun 9, 2015 at 4:22 PM, Ramiro Barrantes
> <ramiro at precisionbioassay.com> wrote:
>> Hello,
>>
>> I am trying to do something that I am able to do with the "by" function on a data.frame but can't figure out how to achieve with data.table.
>>
>> Consider
>>
>> dt<-data.table(name=c(rep("a",5),rep("b",6)),var1=0:10,var2=20:30,var3=40:50)
>> myFunction <- function(x) { mean(x) }
>>
>> I am aware that I can do something like:
>>
>> dt[, .(meanVar1=myFunction(var1)) ,by=.(name)]
>>
>> but how could I do the equivalent of:
>>
>> df<-data.frame(name=c(rep("a",5),rep("b",6)),var1=0:10,var2=20:30,var3=40:50)
>> myFunction <- function(x) { mean(x) }
>>
>> columnNames <- c("var1","var2","var3")
>> result <- by(df, df$name, function(x) {
>>   output <- c()
>>   for (col in columnNames) {
>>     output[col] <- myFunction(x[, col])
>>   }
>>   output
>> })
>> do.call(rbind,result)
>>
>> Thanks in advance,
>> Ramiro
>>
>> ______________________________________________
>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.


