[R] Unicode Text Segmentation Algorithms already implemented in R?
istazahn at gmail.com
Thu Mar 3 14:44:39 CET 2016
You searched, but did not tell us what you found, nor why it was unsuitable
for your undescribed use case. So all we can do is guess: my guess is
On Mar 3, 2016 8:14 AM, "Sascha Wolfer" <wolfer at ids-mannheim.de> wrote:
> Hello list members,
> I am looking for an implementation of Unicode text segmentation (word
> boundary detection) algorithms in R. You can find information about the
> algorithms here: http://www.unicode.org/reports/tr29/#Word_Boundaries
> The help page for the function 'casefuns' from the excellent 'Unicode'
> package says: "Other methods will be added eventually (once the Unicode
> text segmentation algorithm is implemented for detecting word boundaries)."
> My simple question is: Are these algorithms already implemented in an R
> package? I didn’t find anything on the web, but I am counting on the power
> of this list. My Stata-using colleague is already picking on me… (in Stata,
> the function 'ustrword' does exactly what I want to do in R).
> Thanks for your help, have a good day, you all!
> Sascha W.