[R] Text mining

Giovanni Azua bravegag at gmail.com
Sat Jan 26 22:19:52 CET 2013

Hi Steve,

IMO this problem does not need a classifier but rather a database and a
simple query. I would just build a database with all city names including
the geo information, and then look up exactly whether a given city is in the
north or the south.
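
To make the lookup idea concrete, here is a minimal sketch in Python using
the standard sqlite3 module. The table layout, the function names
(build_db, locate) and the two-row gazetteer are assumptions made for
illustration only; the latitudes are approximate, and the area labels are
the ones given in the original question. A real table would list every
municipality.

```python
import sqlite3

# Hypothetical two-row gazetteer. Latitudes are approximate; the area
# labels are taken from the original question.
TOWNS = [
    ("Caronno Pertusella", 45.60, "north"),
    ("Frascati", 41.82, "center"),
]


def build_db():
    """Create an in-memory database holding the town table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE towns ("
        "name TEXT COLLATE NOCASE PRIMARY KEY, lat REAL, area TEXT)"
    )
    conn.executemany("INSERT INTO towns VALUES (?, ?, ?)", TOWNS)
    return conn


def locate(conn, name):
    """Return the stored area for a town name, or None if it is unknown."""
    row = conn.execute(
        "SELECT area FROM towns WHERE name = ?", (name.strip(),)
    ).fetchone()
    return row[0] if row else None
```

With this sketch, locate(build_db(), "frascati") returns the stored label
for Frascati, and an unknown name returns None instead of a guess.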

If there were such a "rule" (which I doubt), I would expect it to have many
exceptions and therefore a bunch of false positives on both sides. Why
overcomplicate a simple problem?


-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On
Behalf Of Steve Stephenson
Sent: Saturday, January 26, 2013 10:08 PM
To: r-help at r-project.org
Subject: [R] Text mining

Hello everybody,
I would like to perform an analysis, but I don't know how to proceed or
whether R packages are available for my purpose. I am therefore here
to ask for your support.
*The idea is the following:* I noticed that the names of towns and
villages in northern Italy most of the time sound different from the names of
towns in southern Italy. Just to give you an idea, "Caronno
Pertusella" is a village in northern Italy while Frascati is a town in central
Italy. Most of the time I am able to recognize where a town is located just by
hearing its name, but I cannot say why; that is to say, I have not found a
rule.
What I would like to do is find a classification rule/engine that is able
to "locate" a city starting from its name. *I think the classification
method should be based on the sequence of letters in the town's
name*. But this is just an intuition, not yet formalized!
I know that mine is a strange request and idea; any advice is very much
appreciated and welcome!
Many thanks in advance to all.
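
One possible way to formalize the "sequence of letters" intuition is to
split each name into overlapping character n-grams and feed them to a
simple classifier. The sketch below is in Python and uses a small Naive
Bayes model with add-one smoothing; that model is a choice made here for
illustration, not something proposed in this thread, and the names
char_ngrams and NgramNaiveBayes are hypothetical. A useful model would
need a large labeled list of town names, which is not included.

```python
import math
from collections import Counter, defaultdict


def char_ngrams(name, n=3):
    """Return overlapping character n-grams of a padded, lower-cased name."""
    padded = "^" + name.lower() + "$"  # mark the start and end of the name
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


class NgramNaiveBayes:
    """Multinomial Naive Bayes over character n-grams, add-one smoothing."""

    def __init__(self, n=3):
        self.n = n
        self.counts = defaultdict(Counter)  # label -> n-gram counts
        self.docs = Counter()               # label -> number of names seen
        self.vocab = set()                  # all n-grams seen in training

    def fit(self, names, labels):
        for name, label in zip(names, labels):
            grams = char_ngrams(name, self.n)
            self.counts[label].update(grams)
            self.docs[label] += 1
            self.vocab.update(grams)
        return self

    def predict(self, name):
        total_docs = sum(self.docs.values())
        vocab_size = len(self.vocab)
        best_label, best_score = None, -math.inf
        for label in self.docs:
            total = sum(self.counts[label].values())
            # log prior plus smoothed log likelihood of each n-gram
            score = math.log(self.docs[label] / total_docs)
            for gram in char_ngrams(name, self.n):
                score += math.log(
                    (self.counts[label][gram] + 1) / (total + vocab_size)
                )
            if score > best_score:
                best_label, best_score = label, score
        return best_label
```

Trained only on the two names mentioned above, the model can do no more
than reproduce their labels; whether the approach generalizes to unseen
names is exactly the open question.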

