[R] Looking for categorization method/module in R

David Winsemius dwinsemius at comcast.net
Tue Dec 15 15:56:50 CET 2009


On Dec 15, 2009, at 7:19 AM, James Mcininch wrote:

> All,
>
> I'm relatively new to using R, having used it thus far for some simple
> statistics and plotting. However, I'm not new to programming by any
> measure.
>
> I've been looking at the various modules available for clustering,
> factor analysis, etc. and find that I need advice on which modules I
> should be focusing on and their application.

This list is not really meant for general statistical advice; it is
more responsive to focused questions about using R. One option is to
review the CRAN Task Views (the Cluster and Multivariate views look
closest to what you describe):
http://cran.r-project.org/web/views/


>
> I have a data set comprised of columns of both quantitative and
> qualitative / non-numeric attributes. I would like to perform two
> operations on this data: identify correlations between attributes,
> and cluster the records by attribute.
>
> All of the clustering algorithms that I've looked at so far are based
> on numerical distance functions, and it's not clear to me how I'd
> apply them to qualitative attributes. It's not appropriate to simply
> convert discrete qualitative attributes (e.g., native language) to
> numerical values or independent columns with binary values. Is there a
> module that provides such an algorithm or that can be adapted to this
> purpose?
>
> I can wrap my head around the problem of looking for cross-correlation
> between the attributes, but would appreciate any insight in how to
> do it most efficiently and present the results.
>
> Thank you.
>
>
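
On the mixed-attribute clustering question, one commonly used route is
Gower's dissimilarity, available as daisy() in the cluster package (a
recommended package that ships with R). What follows is only a minimal
sketch under stated assumptions, not advice for your particular data:
the data frame `d` and all of its column names and values are made up
for illustration.

```r
# Toy data: every name and value here is hypothetical.
library(cluster)  # provides daisy()

d <- data.frame(
  age      = c(23, 35, 41, 29, 52, 38),
  income   = c(31, 48, 55, 39, 72, 50),
  language = factor(c("en", "fr", "en", "de", "fr", "de")),
  smoker   = factor(c("no", "yes", "no", "no", "yes", "no"))
)

# Gower dissimilarity handles numeric and factor columns together:
# numeric columns contribute range-scaled absolute differences,
# factor columns contribute 0 (same level) or 1 (different level).
g <- daisy(d, metric = "gower")

# Any method that accepts a dissimilarity object can be used from here;
# average-linkage hierarchical clustering is just one choice.
hc <- hclust(as.dist(g), method = "average")
groups <- cutree(hc, k = 2)   # k = 2 is arbitrary for this toy example
table(groups)

# Simple first looks at pairwise association, by attribute type:
cor(d$age, d$income)              # numeric vs numeric
table(d$language, d$smoker)       # factor vs factor (contingency table)
tapply(d$age, d$language, mean)   # numeric summarised within factor levels
```

Because the factor columns enter the dissimilarity as match/mismatch,
there is no need to recode something like native language as numbers.
Whether hierarchical clustering and the cut at k = 2 suit real data is
a separate judgment.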

David Winsemius, MD
Heritage Laboratories
West Hartford, CT



