[R] which function to use to do classification

Baoqiang Cao caobg at email.uc.edu
Wed Mar 29 17:13:38 CEST 2006


>On Wed, 29 Mar 2006, Sean Davis wrote:
>
>> We have to be careful here.  Classification (which is the terminology that
>> the original poster used) is NOT the same as clustering, although the two
>> are often confused.
>
>Well, in one of its two English senses it is the same.  From a recent talk 
>of mine (GfKL30), quoting the Concise Oxford Dictionary:
>
>\emph{Classification} has two senses:
>
>\begin{itemize}
>\item `to arrange in classes or categories'
>\item `assign (a thing) to a class or category'
>\end{itemize}
>
>There is a community (q.v. the International Federation of Classification 
>Societies and Journal of Classification as well as the entry in the 
>original Encyclopedia of Statistical Sciences) that means (almost) 
>entirely the first sense.
>
>To add to this, the similar words to classification in e.g. French or 
>German have (I am told) different shades of meaning.
>
>
>> If the original poster wants to do clustering and
>> examine the results for the presence of three clusters, that is fine and
>> there are many methods for clustering that could be used.  However,
>> classification will require a different set of tools.  If the clustering
>> tools already pointed out are not doing what is needed (that is, that Cao
>> actually is interested in clustering and not classification), then perhaps a
>> further explanation of what the problem is would help clarify.
>
>Yes, further explanation would help.
My intention is to arrange all the samples in classes. As a non-native English speaker, I should have checked the word before using it to express myself. The quotation makes perfect sense to me. Much appreciated!
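
To make the two dictionary senses concrete, here is a minimal R sketch. It is only an illustration: the built-in iris data stands in for the N*M matrix, and the names mat, labels, groups, fit and pred are made up for this example. kmeans() covers the first sense (arranging rows into groups without labels), while lda() from MASS covers the second (assigning rows to classes that are already known).

```r
# Sketch only: iris is placeholder data, not the data discussed in this thread.
library(MASS)  # lda(); MASS is a recommended package shipped with R

mat <- as.matrix(iris[, 1:4])   # numeric N*M matrix (150 x 4)
labels <- iris$Species          # known classes, needed only for sense 2

# Sense 1, "arrange in classes": clustering, no labels are used.
set.seed(1)                     # kmeans starts from random centers
km <- kmeans(mat, centers = 3, nstart = 25)
groups <- km$cluster            # integer group id per row

# Sense 2, "assign to a class": a classifier trained on known labels.
fit <- lda(mat, grouping = labels)
pred <- predict(fit, mat)$class # predicted label per row

# Cross-tabulate the discovered groups against the known labels.
print(table(cluster = groups, species = labels))
```

The first sense needs only the matrix; the second cannot be run without known class labels for at least some of the samples.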

Thank you, Jacques and Martin; your comments and suggestions are well received!

Best,
 Baoqiang Cao

>
>> Sean
>>
>>
>> On 3/29/06 1:46 AM, "Jacques VESLOT" <jacques.veslot at cirad.fr> wrote:
>>
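>>> try this (suppose mat is your matrix):
>>>
>>> library(cluster)  # provides pam() and daisy()
>>>
>>> # hierarchical clustering; "ward" is named "ward.D" in R >= 3.1.0
>>> hc <- hclust(dist(mat, "manhattan"), "ward.D")
>>> plot(hc, hang = -1)
>>> (x <- identify(hc)) # right-click to stop
>>> cutree(hc, 3)       # cut the tree into 3 groups
>>>
>>> # k-means with 3 centers
>>> km <- kmeans(mat, 3)
>>> km$cluster
>>> km$centers
>>>
>>> # partitioning around medoids on Manhattan dissimilarities
>>> pam(daisy(mat, metric = "manhattan"), k = 3, diss = TRUE)$clustering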
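
One hedged way to examine whether three clusters are actually present is to compare average silhouette widths over several values of k. The sketch below assumes the cluster package and uses the built-in iris data as a placeholder for mat; d, ks, avg_sil and best_k are illustrative names, and the silhouette criterion is one choice among several, not something proposed in this thread.

```r
# Sketch only: iris is placeholder data standing in for the poster's matrix.
library(cluster)  # pam(), daisy(); a recommended package shipped with R

mat <- as.matrix(iris[, 1:4])
d <- daisy(mat, metric = "manhattan")   # Manhattan dissimilarities

# Average silhouette width for k = 2..6; larger values suggest
# better-separated clusters.
ks <- 2:6
avg_sil <- sapply(ks, function(k) pam(d, k = k, diss = TRUE)$silinfo$avg.width)
names(avg_sil) <- ks
print(round(avg_sil, 3))

# The k with the largest average width is a candidate, not a verdict.
best_k <- ks[which.max(avg_sil)]
print(best_k)
```

If k = 3 does not stand out, that is a hint that the data may not split naturally into three groups, whatever method is used.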
>>>
>>>
>>>
>>> Baoqiang Cao wrote:
>>>
>>>> Thanks!
>>>> I tried kmeans; the results are not very positive. Anyway, thanks Jacques!
>>>> Please let me know if you have any other thoughts!
>>>>
>>>> Best regards,
>>>>    Baoqiang Cao
>>>>
>>>> ======= At 2006-03-29, 00:08:44 you wrote: =======
>>>>
>>>>
>>>>
>>>>> if you want to classify rows or columns, read:
>>>>> ?hclust
>>>>> ?kmeans
>>>>> library(cluster)
>>>>> ?pam
>>>>>
>>>>>
>>>>> Baoqiang Cao wrote:
>>>>>
>>>>>
>>>>>
>>>>>> Dear All,
>>>>>>
>>>>>> I have some data; suppose it is an N*M matrix. All I want is to classify
>>>>>> it into, let's say, 3 classes. Which method(s) do you think is (are)
>>>>>> appropriate for this purpose? Any reference will be welcome! Thanks!
>>>>>>
>>>>>> Best,
>>>>>> Baoqiang Cao
>>>>>>
>>>>>>
>>>>>>
>>>>>> ------------------------------------------------------------------------
>>>>>>
>>>>>> ______________________________________________
>>>>>> R-help at stat.math.ethz.ch mailing list
>>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>>> PLEASE do read the posting guide!
>>>>>> http://www.R-project.org/posting-guide.html
>>>>>>
>>>>>>
>>>>>>
>>>>> .
>>>>>
>>>>>
>>>>
>>>> = = = = = = = = = = = = = = = = = = = =
>>>>
>>>> Baoqiang Cao
>>>> caobg at email.uc.edu
>>>> 2006-03-29
>>>>
>>>>
>>>>
>>>>
>>>
>>
>>
>
>-- 
>Brian D. Ripley,                  ripley at stats.ox.ac.uk
>Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
>University of Oxford,             Tel:  +44 1865 272861 (self)
>1 South Parks Road,                     +44 1865 272866 (PA)
>Oxford OX1 3TG, UK                Fax:  +44 1865 272595
