[R] split / lapply over multiple columns

Bert Gunter gunter.berton at gene.com
Wed Aug 4 18:35:45 CEST 2010


In general, the lapply(split(...)) construction should never be used.
Use tapply() or by() instead, along the lines of

by(yourdata, with(yourdata, list(your columns)), function(x) ..., ...)
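
As a minimal sketch of that pattern, here it is applied to the mydata example from the quoted question below. The summary function (a sum over the stuff column) and the names res and tab are illustrative choices, not part of the original question.

```r
mydata <- data.frame(userid = c(5, 6, 5, 6, 5, 6),
                     taskid = c(1, 1, 2, 2, 3, 3),
                     stuff  = 11:16)

# by() splits the data frame on every combination of the listed columns
# and calls the function once per subset (x is a data frame).
res <- by(mydata, with(mydata, list(userid = userid, taskid = taskid)),
          function(x) sum(x$stuff))
print(res)

# tapply() does the same kind of grouping for a single vector
# rather than a whole data frame.
tab <- with(mydata, tapply(stuff, list(userid, taskid), sum))
print(tab)
```

Both calls visit the six userid/taskid combinations without any nested lapply(split(...)) calls.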

If you find this complexity annoying, then look into Hadley Wickham's
plyr package for simpler constructions, albeit with different syntax.
Alternatively, his reshape package (?melt,?cast therein) may also
suffice, depending on what you are trying to do.
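
For comparison, a plyr version might look like the sketch below. It assumes the plyr package is installed, and the per-group summary (row count and sum of stuff) is only an illustrative choice.

```r
# Assumes plyr is installed: install.packages("plyr")
library(plyr)

mydata <- data.frame(userid = c(5, 6, 5, 6, 5, 6),
                     taskid = c(1, 1, 2, 2, 3, 3),
                     stuff  = 11:16)

# ddply() splits the data frame by the variables named in .(),
# applies the function to each piece (d is a data frame),
# and row-binds the results back into one data frame.
res <- ddply(mydata, .(userid, taskid), function(d) {
  data.frame(n = nrow(d), total = sum(d$stuff))
})
print(res)
```

If a list with one element per group is preferred over a combined data frame, dlply() takes the same arguments.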

Cheers,

Bert Gunter
Genentech Nonclinical Biostatistics



On Tue, Aug 3, 2010 at 9:06 PM, David Winsemius <dwinsemius at comcast.net> wrote:
>
> On Aug 3, 2010, at 10:48 PM, Ralf B wrote:
>
>> Hi all,
>>
>> I have a data frame with columns over which I would like to run
>> repeated functions for data analysis. Currently I am only iterating
>> in a nested fashion over two columns, where column 1 has two states
>> over which I split and column 2 has 3 states. The function therefore
>> runs 2 x 3 = 6 times, as shown when running the following code:
>>
>> mydata <- data.frame(userid = c(5, 6, 5, 6, 5, 6),
>>     taskid = c(1, 1, 2, 2, 3, 3),
>>     stuff = 11:16)
>> mydata
>> mydata <- mydata[with(mydata, order(userid, taskid)), ]
>> mydata
>>
>> lapply(split(mydata, mydata[,1]), function(x){
>>        lapply(split(x, x[,2]), function(y){
>>                print(paste("result:",y))
>>        })
>> })
>>
>> This traverses the tree like this:
>>
>> 5,1
>> 5,2
>> 5,3
>> 6,1
>> 6,2
>> 6,3
>>
>> Is there an easier way of doing that? I would like to provide the two
>> columns (index 1 and index 2) directly and have lapply apply its
>> anonymous function to each member of the tree automatically.
>> How can I do that?
>
> split(mydata, with(mydata, paste(userid, taskid, sep=".")))
>
> Perhaps something like:
>
> lapply( split(mydata, with(mydata, paste(userid, taskid, sep="."))),
> function(x) paste("result:", x))
>
>>
>
> David Winsemius, MD
> West Hartford, CT
>