[R] Word Frequency for each row

Rui Barradas ruipbarradas at sapo.pt
Fri Mar 8 23:50:30 CET 2013


Hello,

I had thought of something like that, but I'm not sure whether the match
must be exact. If not, grep seems better: more complicated and slower,
but more flexible.
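
To make the difference concrete, here is a minimal sketch. The matrix y and its values are made up for illustration and are not from the thread. With == a cell counts only if it equals the word exactly, while grepl() also counts cells that merely contain it.

```r
# Hypothetical data: some cells contain the word inside a longer string
y <- matrix(c("A",  "AB", "B", "A",
              "BA", "C",  "A", "D"),
            nrow = 2, byrow = TRUE)
word <- "A"

# Exact match: a cell counts only if it is identical to the word
exact <- rowSums(y == word)

# Pattern match: a cell counts if it contains the word anywhere
# (fixed = TRUE treats the word as a literal string, not a regex)
partial <- rowSums(matrix(grepl(word, y, fixed = TRUE), nrow = nrow(y)))

exact    # 2 1
partial  # 3 2
```

Which of the two is right depends on whether "AB" should count as an occurrence of "A" in the data at hand.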

Rui Barradas

On 08-03-2013 21:32, arun wrote:
>
>
> Hi,
> You can also try:
>     res2 <- rowSums(x == word)
>
>     res1 <- sapply(where, length)
>     res1[] <- sapply(res1, as.numeric)  # integer -> double, to match rowSums()
>     identical(res1, res2)
>     #[1] TRUE
> A.K.
>
>
>
> ----- Original Message -----
> From: Rui Barradas <ruipbarradas at sapo.pt>
> To: Sudip Chatterjee <sudipanalyst at gmail.com>
> Cc: r-help at r-project.org
> Sent: Friday, March 8, 2013 4:26 PM
> Subject: Re: [R] Word Frequency for each row
>
> Hello,
>
> I'm not sure I understand, but see if the following is an example of
> counting occurrences of a word in each row.
>
>
> set.seed(1855)
> x <- matrix(sample(LETTERS[1:5], 400, replace = TRUE), ncol = 4)
> word <- "A"
> where <- apply(x, 1, function(.x) grep(word, .x))
> sapply(where, length)  # count them
>
>
> Hope this helps,
>
> Rui Barradas
>
> On 08-03-2013 16:04, Sudip Chatterjee wrote:
>> Hi All,
>>
>>     I am wondering if there are any examples of counting a "word" of
>> interest in each row. For example, if you have data with 'ID' and
>> 'write-up' for 100 rows, how would I calculate the word frequency for
>> each row?
>>
>>      Thank you for all your time.
>>
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
>

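For the case described in the original question (one free-text 'write-up' per row), a minimal sketch follows. The data frame dat, the column name writeup, and the sample sentences are assumptions made up for illustration; the thread itself only shows the single-letter matrix example.

```r
# Hypothetical data frame standing in for the 'ID' and 'write-up' data
dat <- data.frame(
  ID = 1:3,
  writeup = c("the cat sat on the mat",
              "The dog barked",
              "no match here"),
  stringsAsFactors = FALSE
)
word <- "the"

# Split each write-up into lower-case words on runs of non-letters,
# then count the words that equal the target exactly
words <- strsplit(tolower(dat$writeup), "[^a-z]+")
dat$count <- sapply(words, function(w) sum(w == word))

dat$count  # 2 1 0
```

If a full frequency table per row is wanted rather than the count of one word, lapply(words, table) is one possible starting point.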

