[R] find similar words in text

Boris Steipe boris.steipe at utoronto.ca
Fri Aug 4 00:40:09 CEST 2017


Please keep messages on the list so others can pitch in.

_Which_ words do you want to consider identical for the purpose of frequency count?
_What_ do you want to plot?
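
In the meantime, note that grep() returns the positions of the matches by default, not the words themselves, which may be why nothing useful displays. A minimal base-R sketch of one way to list and count the matching forms follows. It assumes moby.word.v is a character vector of lowercased word tokens; the short vector and the names whale.words.v and whale.freqs.t below are made-up stand-ins for illustration, not your actual data.

```r
# Hypothetical stand-in for moby.word.v: a character vector of lowercased tokens.
moby.word.v <- c("the", "whale", "whales", "whaling", "whale", "ship",
                 "whaler", "whale's", "whales", "whale")

# value = TRUE makes grep() return the matching words rather than their indices.
whale.words.v <- grep("^whal", moby.word.v, value = TRUE)

# table() counts each distinct form; sort() puts the most frequent form first.
whale.freqs.t <- sort(table(whale.words.v), decreasing = TRUE)
print(whale.freqs.t)

# Total number of matching tokens across all forms.
print(sum(whale.freqs.t))

# One bar per word form; las = 2 turns the labels so they stay readable.
barplot(whale.freqs.t, las = 2, ylab = "Occurrences")
```

This only groups words by a shared prefix. Whether forms such as "whaler" and "whaling" should be counted as the same word is a separate decision, and that is where a stemmer would come in.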



B.



> On Aug 3, 2017, at 4:36 PM, Riaan Van Der Walt <Riaan.VanDerWalt at nwu.ac.za> wrote:
> 
> Hello Boris,
> I've loaded the Rstem and Snowball packages.
> But I am clueless how to get a list of the words matching e.g. whal* (whale, whales, whaling, whaler, whalers, whaleman, whalemen, whale-ship, whale-boat, whale's)
> in the book Moby Dick and the frequency of each of the different words.
> I am using this script:
>  
> whales1.v <- grep("^whal.*", moby.word.v) 
> whales1.v
>  
> The total number of occurrences of whal* is 1699.
> But I can't display the individual words or plot their frequencies.
>  
> I am new to R and the learning curve is steep!!
>  
> Thx!
> Riaan
> 
> 
> Riaan van der Walt
> Tel / Phone / Mogala : 27+72+2172429
> Email / Epos / Emeile: riaan.vanderwalt at nwu.ac.za 
> Url: http://www.nwu.ac.za/
>  
> >>> Boris Steipe <boris.steipe at utoronto.ca> 31 Jul 2017 23:37 >>>
> You need a stemming algorithm. See here:
>   https://cran.r-project.org/web/views/NaturalLanguageProcessing.html
> 
> Myself, I've had good experience with Rstem.
> 
> B.
> 
> 
> 
> 
> 
> > On Jul 31, 2017, at 4:47 PM, Riaan Van Der Walt <Riaan.VanDerWalt at nwu.ac.za> wrote:
> > 
> > I am new to R.
> > Busy with Text Analysis.
> > 
> > I need a script to find, e.g.,
> > 
> > whale, whales, whale's, whaler, whalers, whaling,... in Moby Dick
> > 
> > Riaan
> > ______________________________________________
> > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> 


