[R] aggregate text column by a few rows

David Winsemius dwinsemius at comcast.net
Thu Oct 7 18:26:44 CEST 2010


Or:

 > data.frame( hobs= tapply(a$hobby, list( a$name), c))
                  hobs
Tom  fishing, reading
Mary reading, running
John          boating

Note that Jim's version gives you the names as a column, while this one
has them as rownames. A further difference: my version stores the
hobbies as a list column, whereas Jim's returns them as concatenated
strings. Each result may have its advantages, depending on your
application.
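
For comparison, here is a minimal base-R sketch of the two result shapes. It is not the code either of us used above: it assumes the columns are character vectors rather than factors, and the object names hobby_list and hobby_str are made up for illustration.

```r
# Sample data from the original question, built with character columns
# (stringsAsFactors = FALSE is an assumption made for this sketch).
a <- data.frame(id    = c(1, 1, 2, 3, 2),
                name  = c("Tom", "Tom", "Mary", "John", "Mary"),
                hobby = c("fishing", "reading", "reading", "boating", "running"),
                stringsAsFactors = FALSE)

# List-valued result: one character vector of hobbies per name.
hobby_list <- split(a$hobby, a$name)

# Concatenated-string result: aggregate() accepts paste() as FUN,
# and extra arguments such as collapse are passed through to it.
hobby_str <- aggregate(hobby ~ id + name, data = a,
                       FUN = paste, collapse = ",")

# The list keeps the items separate; the data frame holds one string per group.
hobby_list$Tom
hobby_str$hobby[hobby_str$name == "Tom"]
```

The list form keeps each hobby as its own element, which suits membership tests; the string form is easier to print or export.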

With the list version (stored here as dfrm), the individual items can be
accessed directly:

 > dfrm[1,1][[1]][1]
[1] "fishing"
 > "fishing" %in% dfrm[1,1][[1]]
[1] TRUE

With Jim's result (stored here as dfrm2) you cannot access the hobby
items individually:

 > dfrm2[1,2]
[1] "fishing,reading"
# But you could "grep" them
 > grepl("fishing" , dfrm2[1,2])
[1] TRUE
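
A small sketch of the practical difference between the two lookups, using made-up objects (hobbies_list and hobbies_str are hypothetical stand-ins for the two results, not dfrm and dfrm2 themselves). One caveat worth keeping in mind: grepl matches substrings, so a partial word also matches, whereas %in% compares whole elements. If needed, strsplit can recover the individual items from the concatenated string.

```r
# Hypothetical stand-ins for the two result shapes.
hobbies_list <- list(Tom = c("fishing", "reading"))
hobbies_str  <- c(Tom = "fishing,reading")

# Exact element match against the list entry.
in_list <- "fishing" %in% hobbies_list$Tom

# Substring match against the concatenated string;
# a partial word such as "fish" matches as well.
in_str      <- grepl("fishing", hobbies_str["Tom"], fixed = TRUE)
partial_str <- grepl("fish", hobbies_str["Tom"], fixed = TRUE)

# Splitting on the separator recovers the items, after which
# %in% behaves as it does for the list version.
items        <- strsplit(hobbies_str["Tom"], ",", fixed = TRUE)[[1]]
partial_item <- "fish" %in% items
```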
-- 
David

On Oct 7, 2010, at 12:08 PM, jim holtman wrote:

> try this using sqldf:
>
>> a
>  id name   hobby
> 1  1  Tom fishing
> 2  1  Tom reading
> 3  2 Mary reading
> 4  3 John boating
> 5  2 Mary running
>> require(sqldf)
>> sqldf('select name, group_concat(hobby) hobby from a group by id',  
>> method='raw')
>  name           hobby
> 1  Tom fishing,reading
> 2 Mary reading,running
> 3 John         boating
>
>
> On Thu, Oct 7, 2010 at 11:52 AM, Tan, Richard <RTan at panagora.com>  
> wrote:
>> Hi, the R function aggregate seems to take only summary statistics
>> functions. Can I aggregate text columns? For example, for the data
>> frame below,
>>
>>> a <- rbind(data.frame(id=1, name='Tom',
>> hobby='fishing'),data.frame(id=1, name='Tom',
>> hobby='reading'),data.frame(id=2, name='Mary',
>> hobby='reading'),data.frame(id=3, name='John',
>> hobby='boating'),data.frame(id=2, name='Mary', hobby='running'))
>>
>>> a
>>  id name   hobby
>> 1  1  Tom fishing
>> 2  1  Tom reading
>> 3  2 Mary reading
>> 4  3 John boating
>> 5  2 Mary running
>>
>> I want output as
>>
>>> b
>> id name hobbies
>> 1  Tom  fishing reading
>> 2  Mary reading running
>> 3  John boating
>>
>> Thanks,
>>
>> Richard
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
>
>
> -- 
> Jim Holtman
> Cincinnati, OH
> +1 513 646 9390
>
> What is the problem that you are trying to solve?

David Winsemius, MD
West Hartford, CT


