[R] S or R used in natural language processing (NLP)?

John Day jday at csihq.com
Tue Jun 11 15:08:09 CEST 2002

Well, I'm answering my own question to report that there was no response
on this list. I also posed the same question on the S-News list and received
one response: David Smith pointed me to the EMU link on the S-Plus links page:
http://www.insightful.com/support/links.asp. EMU is a speech database for
storing and analyzing speech waveforms, which is not really what I was looking for.

Maybe I should have explained in a bit more detail that I am interested in
the statistical approach to natural language processing for problems like
text categorization, word sense disambiguation (WSD), and text understanding.
The practitioners in these areas currently use languages like C, Perl, and Java
and tools like MATLAB. It seems like S/R would be a natural fit here, maybe
combined with a language like Perl to do the front-end parsing and pattern
matching. (Or is S/R capable of matching Perl's abilities?)
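
To make the kind of task concrete, here is a minimal sketch of the pipeline
I have in mind: regular-expression tokenization feeding a small maximum
entropy (multinomial logistic) text classifier. It is written in Python
purely for illustration, not in S/R or Perl. The function names, the toy
sentences, and the training settings are all my own assumptions and are not
taken from any existing package; the fitting is plain per-example gradient
ascent, not a tuned or definitive method.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def predict_proba(weights, counts):
    """Return P(class | word counts) under the log-linear model."""
    scores = {c: sum(w[word] * n for word, n in counts.items())
              for c, w in weights.items()}
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {c: math.exp(s - top) for c, s in scores.items()}
    total = sum(exps.values())
    return {c: e / total for c, e in exps.items()}


def train_maxent(docs, labels, epochs=200, rate=0.1):
    """Fit one weight per (class, word) by per-example gradient ascent.

    epochs and rate are illustrative defaults, not recommended values.
    """
    classes = sorted(set(labels))
    weights = {c: Counter() for c in classes}
    feats = [Counter(tokenize(d)) for d in docs]
    for _ in range(epochs):
        for counts, label in zip(feats, labels):
            probs = predict_proba(weights, counts)
            for c in classes:
                # Log-likelihood gradient: observed minus expected count.
                err = (1.0 if c == label else 0.0) - probs[c]
                for word, n in counts.items():
                    weights[c][word] += rate * err * n
    return weights


def classify(weights, text):
    """Return the most probable class for a piece of text."""
    probs = predict_proba(weights, Counter(tokenize(text)))
    return max(probs, key=probs.get)


# Made-up toy data, only to show the calling convention.
toy_docs = [
    "the bank approved the loan",
    "interest rates at the bank rose",
    "we fished from the river bank",
    "the river bank was muddy",
]
toy_labels = ["finance", "finance", "river", "river"]
toy_weights = train_maxent(toy_docs, toy_labels)
print(classify(toy_weights, "a muddy river"))
```

The same two steps (pattern matching to get word counts, then fitting a
log-linear model to those counts) are what I would hope to do from within
S/R.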

I am just learning S/R, so this is perhaps an incentive to contribute to this

John Day

At 09:52 PM 6/8/02 -0400, I wrote:
>Dear All,
>Does anyone use S or R for statistical natural language processing (NLP)?
>All I have found so far is a package called EMU
>(http://www.shlrc.mq.edu.au/emu/emu-splus.shtml) which is a speech 
>wave-form processing package.
>What I'm looking for are routines to support text processing, text 
>categorization, word sense disambiguation, text understanding etc.
>In particular, I would like to find a routine in R to perform "maximum 
>entropy" classification.
>(Ref: Nigam, Lafferty, McCallum, "Using Maximum Entropy for Text
>Classification", http://citeseer.nj.nec.com/nigam99using.html)
>John Day
>PhD Candidate
>Florida Tech
>Melbourne, FL
>r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
>Send "info", "help", or "[un]subscribe"
>(in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
