[R] Clustering question \ dist(datmat)

Sean Davis sdavis2 at mail.nih.gov
Mon Mar 27 13:30:49 CEST 2006




On 3/27/06 12:19 AM, "kumar zaman" <statbataineh at yahoo.com> wrote:

> Dear Gabor and all,
>    
>   I know this will work, but I already have a distance matrix calculated using
> my distance measure Dij = 0.5 * (1 - cos(theta_i - theta_j)). If I do
> hclust(as.dist(df)), then I am taking the distance a second time on a matrix
> "df" that is already supposed to be a distance matrix. I hope I am clear.
>    
>   PS: I just found out I can use "kmeans(df, 3, iter.max=100)"; it will take
> df as calculated by Dij. I still need to use the methods in hclust, such as
> "single", "average", "ward", "median", "mcquitty", etc.
>    
>   Thank you anyway.

Kumar,

If I understand your point correctly, you are misunderstanding what as.dist()
does.  It does not compute a distance matrix.  Instead, it simply converts a
matrix into a "dist" object, which is NOT just a matrix.  The distances in a
matrix converted to a "dist" object are not altered.  Therefore, you are not
"taking distance another time"; you are simply converting your distance
matrix into a form that hclust can understand.
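
A minimal sketch of this point follows. The angles in theta are made up for
illustration (they are not the rvm() data from the original post), and the
variable names are only placeholders. It checks that as.dist() keeps the
supplied values, while dist() would compute new Euclidean distances between
the rows of the matrix.

```r
# Illustrative angles in radians (assumed values, not the poster's data)
theta <- c(0.1, 0.5, 3.0, 3.2, 6.0)

# Circular distance Dij = 0.5 * (1 - cos(theta_i - theta_j)) as a square matrix
D <- 0.5 * (1 - cos(outer(theta, theta, "-")))

# as.dist() only changes the class/storage; the values are left untouched
d <- as.dist(D)
stopifnot(isTRUE(all.equal(as.matrix(d), D, check.attributes = FALSE)))

# dist(), by contrast, computes Euclidean distances between the rows of D,
# which really would be "taking distance another time"
d2 <- dist(D)
stopifnot(!isTRUE(all.equal(as.vector(d), as.vector(d2))))

# The "dist" object can be passed to hclust with any linkage method
hc_single  <- hclust(d, method = "single")
hc_average <- hclust(d, method = "average")
```

The same "dist" object can be reused for the other linkage methods by
changing the method argument.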

Hope that helps clarify.

Sean


 
> Gabor Grothendieck <ggrothendieck at gmail.com> wrote:
>   A distance matrix must be of class "dist". Try
> 
> hclust(as.dist(df))
> 
> 
> On 3/26/06, kumar zaman wrote:
>> Hello everybody. I am trying to cluster circular data (data points that are
>> angles), so I cannot use the "dist" function in "mclust" to generate my
>> distance matrix. I am using the function Dij = 0.5*(1 - cos(theta_i -
>> theta_j)). The thing is, "hclust" will not accept this distance matrix. I
>> tried to put it in a data frame, but again I get an error message saying
>> "Error in if (n < 2) stop("must have n >= 2 objects to cluster") : argument is
>> of length zero". The distance matrix that "dist" produces is a lower
>> triangular one; mine is a square matrix, which I think does not matter. My
>> question is how to make "hclust" process my distance matrix, and what I am
>> doing wrong. I am sure the problem is with the distance matrix format. Any
>> suggestions are highly appreciated. The code below shows what I have done.
>> 
>> clust1<- as.vector(rvm(5,5,15))
>> clust2<- as.vector(rvm(5,10,15))
>> clust3<- as.vector(rvm(5,15,15))
>> clust4<- as.vector(rvm(5,20,15))
>> clust5<- as.vector(rvm(5,25,15))
>> data1<- rbind(clust1,clust2,clust3,clust4,clust5)
>> datmat<- matrix(data1,nrow=25,ncol=1,byrow=TRUE)
>> circ.plot(datmat)
>> df<- array(dim=c(25,25))
>> for (i in 1:25){
>> for (j in 1:25){
>> df[i,j]<- 0.5*(1 - cos(datmat[i] - datmat[j]))
>> }
>> }
>> hcA<-hclust(df,method="average")
>> ****************************************************
>> Ahmed
>> Florida
>> 
>> 
>> ---------------------------------
>> 
>> [[alternative HTML version deleted]]
>> 
>> ______________________________________________
>> R-help at stat.math.ethz.ch mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
>> 
> 
> 
> 
> Ahmed Albatineh,PhD
> Assistant Professor of Statistics
> Nova Southeastern University
> Fort Lauderdale, FL 33314
> U.S.A
> 



