[R] word frequency count

Uwe Ligges ligges at statistik.tu-dortmund.de
Sun Mar 18 15:30:48 CET 2012



On 18.03.2012 14:31, mail me wrote:
> Hi:
>
> Suppose I create the dataframe df using the following code:
>
> df <- data.frame(item1 = c('milk', 'bread', 'beer', 'beer', 'milk', 'beer'),
>                  item2 = c('bread', 'butter', 'diaper', 'diaper', 'bread', 'diaper'),
>                  stringsAsFactors = FALSE)
>
>
> df
>
>   item1  item2
> 1  milk  bread
> 2 bread butter
> 3  beer diaper
> 4  beer diaper
> 5  milk  bread
> 6  beer diaper
>
> And now I want the following output:
>
> milk,bread   2
> bread,butter 1
> beer,diaper  3
> milk,bread   2

Why do you want "milk,bread" twice?


> and "milk,bread" is a single datum. I hope this clarifies the problem!


If you don't want milk,bread twice, I'd go with:

table(apply(df, 1, paste, collapse=","))
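
For illustration, here is a minimal sketch of what that one-liner gives on the example data from the question. The vectorised paste() variant, the first-appearance ordering and the conversion to a data frame are assumptions about what might be wanted, not part of the suggestion above; the names pairs, counts2 and out are made up for the example.

```r
# Example data as posted in the question
df <- data.frame(
  item1 = c("milk", "bread", "beer", "beer", "milk", "beer"),
  item2 = c("bread", "butter", "diaper", "diaper", "bread", "diaper"),
  stringsAsFactors = FALSE
)

# The one-liner: paste each row into one string, then count the strings.
# table() sorts its labels, so the counts come out as
# beer,diaper 3 / bread,butter 1 / milk,bread 2
counts <- table(apply(df, 1, paste, collapse = ","))

# Assumed alternative: paste the two columns directly (no row-wise apply)
pairs <- paste(df$item1, df$item2, sep = ",")
counts2 <- table(pairs)

# Assumed variation: counts in order of first appearance in the data
in_order <- counts2[unique(pairs)]

# Assumed variation: the result as a data frame with hypothetical column names
out <- as.data.frame(counts2, stringsAsFactors = FALSE)
names(out) <- c("items", "freq")
```

Both approaches count each distinct pair once, so "milk,bread" appears a single time with a frequency of 2.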

Uwe Ligges


> Thanks!
>
>
>
> On 3/18/12, John Kane <jrkrideau at inbox.com> wrote:
>> ? table
>>
>> First, however, confirm that "milk,bread" is a single datum. str() should do
>> this.
>>
>> Can you post a sample of the data here using dput()?
>>
>> John Kane
>> Kingston ON Canada
>>
>>
>>> -----Original Message-----
>>> From: mailme842 at googlemail.com
>>> Sent: Sun, 18 Mar 2012 13:12:48 +0200
>>> To: r-help at r-project.org
>>> Subject: [R] word frequency count
>>>
>>> Hi:
>>>
>>> I have a dataframe containing comma-separated groups of words such as
>>>
>>> milk,bread
>>> bread,butter
>>> beer,diaper
>>> beer,diaper
>>> milk,bread
>>> beer,diaper
>>>
>>> I want to output the frequency of occurrence of comma separated words
>>> for each row and collapse duplicate rows, to make the output as shown
>>> in the following dataframe:
>>>
>>> milk,bread   2
>>> bread,butter 1
>>> beer,diaper  3
>>> milk,bread   2
>>>
>>> Thanks for help!
>>>
>>> deb
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>
>


