[R] silhouette fuzzy

pete pieroleone at hotmail.it
Mon Jan 31 22:46:04 CET 2011


After sorting the table of membership degrees, I need the difference
between the first and second columns, that is, between the largest and
second-largest membership degree of each object i. This has to be done for
K=2, K=3, ... up to K.max=6.
Each difference is multiplied by the corresponding element of the crisp
silhouette index vector (si), which also depends on K=2,...,K.max=6; the sum
of these products is then divided by the sum of the differences.
I need a final vector containing the index for each clustering
(K=2,...,K.max=6).
I think there is a way to do this with classe.memb, but I cannot solve the
problem: once the membership degrees matrix (ris$membership) and the list
object (ris$silinfo) have been transformed, I can no longer use them with
classe.memb.

\sum_i (u_{i1} - u_{i2}) s_i / \sum_i (u_{i1} - u_{i2})
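
The formula above can be sketched in R as follows. This is only a minimal sketch: it assumes U is the n x K membership matrix (one row per object) and s is the vector of crisp silhouette widths in the same object order. The name fuzzy.sil and the toy data are made up for illustration and do not come from any package.

```r
# Hypothetical helper: weighted average of the crisp silhouette widths,
# with weights u_i1 - u_i2 (largest minus second-largest membership of object i).
fuzzy.sil <- function(U, s) {
  U <- as.matrix(U)
  stopifnot(ncol(U) >= 2, nrow(U) == length(s))
  # sort each row in decreasing order; apply() returns the result transposed
  U.sort <- t(apply(U, 1, sort, decreasing = TRUE))
  d <- U.sort[, 1] - U.sort[, 2]
  sum(d * s) / sum(d)
}

# toy data, made up for illustration only
U <- rbind(c(0.66, 0.30, 0.04),
           c(0.09, 0.89, 0.02),
           c(0.06, 0.02, 0.92))
s <- c(0.2, 0.5, 0.8)
fuzzy.sil(U, s)
```

Objects whose two largest memberships are close get a small weight, so they contribute little to the index.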


> head(t(A.sort))     # membership degrees, each row sorted from largest to smallest
  [,1] [,2] [,3] [,4]
1 0.66 0.30 0.04 0.01
2 0.89 0.09 0.02 0.00
3 0.92 0.06 0.01 0.01
4 0.71 0.21 0.07 0.01
5 0.85 0.10 0.04 0.01
6 0.91 0.04 0.02 0.02
> H.Asort=head(t(A.sort))
> H.Asort[,1]-H.Asort[,2]
   1    2    3    4    5    6 
0.36 0.80 0.86 0.50 0.75 0.87 

> H.Asort=t(H.Asort[,1]-H.Asort[,2])
This is the vector of differences, which has to be multiplied by the
transformed ris$silinfo table.
> ris$silinfo
$widths
   cluster neighbor   sil_width
72       1        3  0.43820207
54       1        3  0.43427773
29       1        6  0.41729079
62       1        6  0.40550562
64       1        6  0.32686757
32       1        3  0.30544722
45       1        3  0.30428723
79       1        3  0.30192624
12       1        3  0.30034472
60       1        6  0.29642495
41       1        3  0.29282778
1        1        3  0.28000788
85       1        3  0.24709237
74       1        3  0.239




> P=ris$silinfo
> P=P[1]
>  P=as.data.frame(P)
>  V4=rownames(P)
>  mode(V4)="numeric"
>  P[,4]=V4
>  P[order(P$V4),]

   widths.cluster widths.neighbor widths.sil_width V4
1               1               3       0.28000788  1
2               2               4       0.07614849  2
3               2               3      -0.11676440  3
4               2               4       0.15436648  4
5               2               3       0.14693927  5
6               3               1       0.57083836  6
7               4               5       0.36391826  7
8               5               4       0.63491118  8
9               4               2       0.54458733  9
10              5               4       0.51059626 10
11              2               5       0.03908952 11
12              1               3       0.30034472 12
13              1               3      -0.04928562 13
14              4               3       0.20337180 14
15              3               4       0.46164324 15
18              5               4       0.52066782 18
20              4               3       0.45517287 20
21              3               4       0.39405507 21
22              4               5       0.05574547 22
23              6               1      -0.06750403 23
> P= P[order(P$V4),]

> P=P[,3]
P is now the ris$silinfo table reduced to the vector of silhouette widths, in
object order.
I cannot use this vector in classe.memb.
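
The same reordering can probably be done without converting the list to a data frame, since ris$silinfo$widths is a matrix whose row names are the object labels. A sketch, assuming those labels are plain numbers as in the output above; the small widths matrix below is made up to stand in for ris$silinfo$widths.

```r
# made-up stand-in for ris$silinfo$widths (rows sorted by cluster, not by object)
widths <- cbind(cluster   = c(1, 1, 2, 2),
                neighbor  = c(2, 2, 1, 1),
                sil_width = c(0.44, 0.28, 0.15, -0.12))
rownames(widths) <- c("72", "1", "4", "3")

# put the rows back in object order and keep only the silhouette widths
ord <- order(as.numeric(rownames(widths)))
s <- widths[ord, "sil_width"]
s
```

The result is a named numeric vector, so it lines up row by row with ris$membership as long as the data rows are in the same order.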
K=2
K.max=6
while (K<=K.max)
 {
  ris=fanny(frj,K,memb.exp=m,metric="SqEuclidean",stand=TRUE,maxit=1000,tol=1e-6)
  ris$centroid=matrix(0,nrow=K,ncol=J)
  for (k in 1:K)
   {
    ris$centroid[k,]=(t(ris$membership[,k]^m)%*%as.matrix(frj))/sum(ris$membership[,k]^m)
   }
  rownames(ris$centroid)=1:K
  colnames(ris$centroid)=colnames(frj)
  print(K)
  print(round(ris$centroid,2))
  print(classe.memb(ris$membership)$table.U)
  print(ris$silinfo$avg.width)
  K=K+1
 }
This should be the general scheme: for each K the centroids are computed and
the memberships are summarised with classe.memb.
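
One possible way to collect the index for every K in a single vector is sketched below. It is not tested on the original data: the ruspini data set from the cluster package stands in for frj, memb.exp = 1.5 is an arbitrary choice, and fuzzy.sil and FS are hypothetical names. It also assumes the row names of ris$silinfo$widths are the object numbers.

```r
library(cluster)   # fanny() and the ruspini data set

# hypothetical helper: sum_i (u_i1 - u_i2) s_i / sum_i (u_i1 - u_i2)
fuzzy.sil <- function(U, s) {
  U.sort <- t(apply(U, 1, sort, decreasing = TRUE))
  d <- U.sort[, 1] - U.sort[, 2]
  sum(d * s) / sum(d)
}

data(ruspini)      # example data, standing in for frj
K.all <- 2:6
FS <- setNames(rep(NA_real_, length(K.all)), paste0("K=", K.all))

for (K in K.all) {
  ris <- fanny(ruspini, K, memb.exp = 1.5, maxit = 1000)
  sil <- ris$silinfo$widths
  if (is.null(sil)) next            # no silhouette information for this K
  # silhouette widths back in object order (row names assumed to be 1..n)
  s <- sil[order(as.numeric(rownames(sil))), "sil_width"]
  FS[paste0("K=", K)] <- fuzzy.sil(ris$membership, s)
}
FS
```

FS then holds one value per clustering, with NA left in place for any K where fanny returns no silhouette information.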

classe.memb=function(U)
{
 # column 1: cluster with the largest membership; column 2: that membership degree
 info.U=cbind(max.col(U),apply(U,1,max))
 # objects whose largest membership is below 0.5 are left unassigned (cluster 0)
 info.U[info.U[,2]<0.5,1]=0
 K=ncol(U)
 table.U=matrix(0,nrow=K,ncol=4)
 for (cl in 1:K)
  {
   memb=info.U[info.U[,1]==cl,2]               # largest memberships of the objects in cluster cl
   table.U[cl,1] = sum(memb>=.90)              # high:   >= 0.90
   table.U[cl,2] = sum(memb>=.70 & memb<.90)   # medium: [0.70, 0.90)
   table.U[cl,3] = sum(memb>=.50 & memb<.70)   # low:    [0.50, 0.70)
   table.U[cl,4] = sum(table.U[cl,1:3])        # total assigned to cluster cl
  }
 rownames(table.U) = c(1:K)
 # column labels: high, medium, low, total
 colnames(table.U) = c("Alto", "Medio", "Basso", "Totale")
 out=list()
 out$info.U=round(info.U,2)
 out$table.U=table.U
 return(out)
}


