[R] bayesian text classification...
Peter Dalgaard BSA
p.dalgaard at biostat.ku.dk
Tue Jan 21 09:35:02 CET 2003
rossini at blindglobe.net (A.J. Rossini) writes:
> for Spam.
> In the process of setting up a more effective spam filtering system, I
> just noticed that bogofilter, which implements extensions of the (a?)
> "Naive Bayes" text classification approach, will dump out R data
> frames; the man page suggests how to "integrate" it with R for
> verification (sort of, that is).
> Anyway, for those of you looking for silly and perhaps interesting
> problems/datasets for your engineering or comp-sci statistics classes,
> this one looks quite amusing...
> Looks like Eric Raymond knows (about) R -- a script is apparently
> included in the source according to the man page, though I couldn't
> find it in the Debian package.
The text in http://www.bgl.nu/bogofilter/BcrFisher.html certainly has
one. It could be interesting to try and figure out what is actually
going on there - some of it certainly looks weird, and last time I
looked at "Naive Bayes" I got the impression that these people would
label anything returning a probability as "Bayesian"...
   O__  ---- Peter Dalgaard             Blegdamsvej 3
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)             FAX: (+45) 35327907