[R] silhouette: clustering labels have to be consecutive intergers starting from 1?

Benilton Carvalho bcarvalh at jhsph.edu
Wed Oct 10 05:31:15 CEST 2007


that happened to me with R-2.4.0 (alpha) and was fixed on R-2.4.0  
(final)...

http://tolstoy.newcastle.edu.au/R/e2/help/06/11/5061.html

then i stopped using... now, the problem seems to be back. The same  
examples still apply.

This fails:

require(cluster)
set.seed(1)
x <- rnorm(100)
g <- sample(2:4, 100, rep=T)
for (i in 1:100){
   print(i)
   tmp <- silhouette(g, dist(x))
}

and this works:

require(cluster)
set.seed(1)
x <- rnorm(100)
g <- sample(2:4, 100, rep=T)
for (i in 1:100){
   print(i)
   tmp <- silhouette(as.integer(factor(g)), dist(x))
}

and here's the sessionInfo():

 > sessionInfo()
R version 2.6.0 (2007-10-03)
x86_64-unknown-linux-gnu

locale:
LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.U 
TF-8;LC_MONETARY=en_US.UTF-8;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF- 
8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_ID 
ENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] cluster_1.11.9


(Red Hat EL 2.6.9-42 smp - AMD opteron 848)

b

On Oct 9, 2007, at 8:35 PM, Tao Shi wrote:

> Hi list,
>
> When I was using 'silhouette' from the 'cluster' package to  
> calculate clustering performances, R crashed.  I traced the problem  
> to the fact that my clustering labels only have 2's and 3's.  when  
> I replaced them with 1's and 2's, the problem was solved.  Is the  
> function purposely written in this way so when I have clustering  
> labels, "2" and "3", for example, the function somehow takes the  
> 'missing' cluster "2" into account when it calculates silhouette  
> widths?
>
> Thanks,
>
> ....Tao
>
> ##============================================
> ## sorry about the long attachment
>
>> R.Version()
> $platform
> [1] "i386-pc-mingw32"
>
> $arch
> [1] "i386"
>
> $os
> [1] "mingw32"
>
> $system
> [1] "i386, mingw32"
>
> $status
> [1] ""
>
> $major
> [1] "2"
>
> $minor
> [1] "5.1"
>
> $year
> [1] "2007"
>
> $month
> [1] "06"
>
> $day
> [1] "27"
>
> $`svn rev`
> [1] "42083"
>
> $language
> [1] "R"
>
> $version.string
> [1] "R version 2.5.1 (2007-06-27)"
>
>> library(cluster)
>> cl1   ## clustering labels
>  [1] 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 2 2 2 2 2 2 2 2 2
> [30] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
> [59] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
> [88] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
> [117] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
> [146] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
> [175] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
> [204] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
>> x1  ## 1-d input vector
>  [1] 1.5707963 1.5707963 1.5707963 1.5707963 1.5707963
>  [6] 1.5707963 1.5707963 1.5707963 1.5707963 1.5707963
> [11] 1.5707963 1.5707963 1.5707963 1.5707963 1.5707963
> [16] 1.5707963 1.5707963 1.5707963 1.5707963 1.5707963
> [21] 1.0163758 0.7657763 0.7370084 0.6999689 0.7366476
> [26] 0.7883921 0.6925395 0.7729240 0.7202391 0.7910149
> [31] 0.7397698 0.7958092 0.6978596 0.7350255 0.7294362
> [36] 0.6125713 0.7174000 0.7413046 0.7044205 0.7568104
> [41] 0.7048469 0.7334515 0.7143170 0.7002311 0.7540981
> [46] 0.7627527 0.7712762 0.8193611 0.7801148 0.9061762
> [51] 0.8248195 0.7932630 0.7248037 0.7423547 0.6419314
> [56] 0.6001092 0.7572272 0.7631742 0.7085384 0.8710853
> [61] 0.6589563 0.7464943 0.7487340 0.7751280 0.7946542
> [66] 0.7666081 0.8508109 0.8314308 0.7442471 0.8006093
> [71] 0.7949156 0.7852447 0.7630048 0.7104764 0.6768218
> [76] 0.6806351 0.7255355 0.7431389 0.7523627 0.7670515
> [81] 0.8118214 0.7215615 0.8186164 0.6941610 0.8285453
> [86] 0.8395170 0.8088044 0.8182706 0.7550723 0.7948639
> [91] 0.7204830 0.7109068 0.7756949 0.6837856 0.7055604
> [96] 0.6126666 0.7201964 0.6849890 0.7779753 0.7845284
> [101] 0.9370788 0.8242935 0.6908860 0.6446151 0.7660386
> [106] 0.8141526 0.8111984 0.8624186 0.7865335 0.8213035
> [111] 0.8059171 0.6735751 0.7815353 0.6972508 0.6699396
> [116] 0.6293971 0.7475913 0.7700821 0.8258339 0.8096144
> [121] 0.7058171 0.7516635 0.7323909 0.7229136 0.8344846
> [126] 0.7205433 0.8287774 0.8322097 0.7767547 0.7402277
> [131] 0.7939879 0.7797308 0.7112453 0.7091554 0.6417382
> [136] 0.6369171 0.7059020 0.7496380 0.7298359 0.8202566
> [141] 0.7331830 0.7344492 0.8316894 0.7323979 0.7977615
> [146] 0.7841205 0.7587060 0.8056685 0.7895643 0.8140731
> [151] 0.7890221 0.8016008 0.7381577 0.6936453 0.7133525
> [156] 0.7121459 0.6851448 0.7946275 0.8077618 0.7899059
> [161] 0.7128826 0.7546289 0.7042451 0.6606403 0.7525233
> [166] 0.7527548 0.8098887 0.8254190 0.7873064 0.8139340
> [171] 0.7903462 0.8377651 0.6709983 0.7423632 0.6632082
> [176] 0.5676717 0.6925125 0.7077083 0.7488877 0.7630604
> [181] 0.7843001 0.7524471 0.6871823 0.7144443 0.7692206
> [186] 0.8690710 0.9282786 0.7844991 0.7094671 0.7578409
> [191] 0.8026643 0.7759241 0.6997376 0.6167209 0.6682289
> [196] 0.6572018 0.7615807 0.7415752 0.7659161 0.7040360
> [201] 0.6874460 0.7052109 0.8290970 0.6915149 0.7173107
> [206] 0.7848961 0.7943846 0.8437946 0.7817344 0.8867006
> [211] 0.7575857 0.8390473 0.7382348 0.6789859 0.7129010
> [216] 0.6938173 0.7384170 0.6747648 0.7203337 0.7278963
>>  silhouette(cl1, dist(x1)^2)  #####  CRASHED! ######
>> silhouette(ifelse(cl1==3,2,1), dist(x1)^2)
>       cluster neighbor sil_width
>  [1,]       2        1 1.0000000
>  [2,]       2        1 1.0000000
>  [3,]       2        1 1.0000000
>  [4,]       2        1 1.0000000
>  [5,]       2        1 1.0000000
>  [6,]       2        1 1.0000000
>  [7,]       2        1 1.0000000
>  [8,]       2        1 1.0000000
>  [9,]       2        1 1.0000000
> [10,]       2        1 1.0000000
> [11,]       2        1 1.0000000
> [12,]       2        1 1.0000000
> [13,]       2        1 1.0000000
> [14,]       2        1 1.0000000
> [15,]       2        1 1.0000000
> [16,]       2        1 1.0000000
> [17,]       2        1 1.0000000
> [18,]       2        1 1.0000000
> [19,]       2        1 1.0000000
> [20,]       2        1 1.0000000
> [21,]       1        2 0.7592857
> [22,]       1        2 0.9934455
> [23,]       1        2 0.9937880
> [24,]       1        2 0.9909544
> [25,]       1        2 0.9937769
> [26,]       1        2 0.9912442
> [27,]       1        2 0.9900156
> [28,]       1        2 0.9929499
> [29,]       1        2 0.9929125
> [30,]       1        2 0.9908637
> [31,]       1        2 0.9938610
> [32,]       1        2 0.9900958
> [33,]       1        2 0.9906993
> [34,]       1        2 0.9937227
> [35,]       1        2 0.9934823
> [36,]       1        2 0.9740954
> [37,]       1        2 0.9926948
> [38,]       1        2 0.9938924
> [39,]       1        2 0.9914623
> [40,]       1        2 0.9938250
> [41,]       1        2 0.9915088
> [42,]       1        2 0.9936633
> [43,]       1        2 0.9924367
> [44,]       1        2 0.9909855
> [45,]       1        2 0.9938891
> [46,]       1        2 0.9936028
> [47,]       1        2 0.9930799
> [48,]       1        2 0.9848568
> [49,]       1        2 0.9922685
> [50,]       1        2 0.9371272
> [51,]       1        2 0.9832647
> [52,]       1        2 0.9905154
> [53,]       1        2 0.9932217
> [54,]       1        2 0.9939101
> [55,]       1        2 0.9810071
> [56,]       1        2 0.9708675
> [57,]       1        2 0.9938131
> [58,]       1        2 0.9935827
> [59,]       1        2 0.9918943
> [60,]       1        2 0.9628701
> [61,]       1        2 0.9844965
> [62,]       1        2 0.9939491
> [63,]       1        2 0.9939495
> [64,]       1        2 0.9927610
> [65,]       1        2 0.9902895
> [66,]       1        2 0.9933968
> [67,]       1        2 0.9734481
> [68,]       1        2 0.9811285
> [69,]       1        2 0.9939341
> [70,]       1        2 0.9892304
> [71,]       1        2 0.9902461
> [72,]       1        2 0.9916649
> [73,]       1        2 0.9935909
> [74,]       1        2 0.9920846
> [75,]       1        2 0.9876779
> [76,]       1        2 0.9882868
> [77,]       1        2 0.9932665
> [78,]       1        2 0.9939213
> [79,]       1        2 0.9939182
> [80,]       1        2 0.9933699
> [81,]       1        2 0.9868129
> [82,]       1        2 0.9930074
> [83,]       1        2 0.9850624
> [84,]       1        2 0.9902300
> [85,]       1        2 0.9820895
> [86,]       1        2 0.9781906
> [87,]       1        2 0.9875197
> [88,]       1        2 0.9851569
> [89,]       1        2 0.9938688
> [90,]       1        2 0.9902547
> [91,]       1        2 0.9929304
> [92,]       1        2 0.9921257
> [93,]       1        2 0.9927096
> [94,]       1        2 0.9887702
> [95,]       1        2 0.9915856
> [96,]       1        2 0.9741195
> [97,]       1        2 0.9929094
> [98,]       1        2 0.9889500
> [99,]       1        2 0.9924910
> [100,]       1        2 0.9917552
> [101,]       1        2 0.9047049
> [102,]       1        2 0.9834247
> [103,]       1        2 0.9897916
> [104,]       1        2 0.9815845
> [105,]       1        2 0.9934304
> [106,]       1        2 0.9862375
> [107,]       1        2 0.9869624
> [108,]       1        2 0.9677353
> [109,]       1        2 0.9914973
> [110,]       1        2 0.9843076
> [111,]       1        2 0.9881568
> [112,]       1        2 0.9871393
> [113,]       1        2 0.9921114
> [114,]       1        2 0.9906240
> [115,]       1        2 0.9865148
> [116,]       1        2 0.9781846
> [117,]       1        2 0.9939511
> [118,]       1        2 0.9931681
> [119,]       1        2 0.9829519
> [120,]       1        2 0.9873341
> [121,]       1        2 0.9916130
> [122,]       1        2 0.9939273
> [123,]       1        2 0.9936196
> [124,]       1        2 0.9930999
> [125,]       1        2 0.9800620
> [126,]       1        2 0.9929347
> [127,]       1        2 0.9820138
> [128,]       1        2 0.9808614
> [129,]       1        2 0.9926103
> [130,]       1        2 0.9938711
> [131,]       1        2 0.9903987
> [132,]       1        2 0.9923097
> [133,]       1        2 0.9921578
> [134,]       1        2 0.9919558
> [135,]       1        2 0.9809652
> [136,]       1        2 0.9799023
> [137,]       1        2 0.9916220
> [138,]       1        2 0.9939454
> [139,]       1        2 0.9935022
> [140,]       1        2 0.9846059
> [141,]       1        2 0.9936526
> [142,]       1        2 0.9937017
> [143,]       1        2 0.9810402
> [144,]       1        2 0.9936199
> [145,]       1        2 0.9897557
> [146,]       1        2 0.9918058
> [147,]       1        2 0.9937665
> [148,]       1        2 0.9882099
> [149,]       1        2 0.9910776
> [150,]       1        2 0.9862575
> [151,]       1        2 0.9911553
> [152,]       1        2 0.9890393
> [153,]       1        2 0.9938209
> [154,]       1        2 0.9901624
> [155,]       1        2 0.9923515
> [156,]       1        2 0.9922418
> [157,]       1        2 0.9889731
> [158,]       1        2 0.9902939
> [159,]       1        2 0.9877542
> [160,]       1        2 0.9910280
> [161,]       1        2 0.9923092
> [162,]       1        2 0.9938784
> [163,]       1        2 0.9914431
> [164,]       1        2 0.9848184
> [165,]       1        2 0.9939159
> [166,]       1        2 0.9939125
> [167,]       1        2 0.9872706
> [168,]       1        2 0.9830805
> [169,]       1        2 0.9913937
> [170,]       1        2 0.9862925
> [171,]       1        2 0.9909633
> [172,]       1        2 0.9788584
> [173,]       1        2 0.9866989
> [174,]       1        2 0.9939102
> [175,]       1        2 0.9853007
> [176,]       1        2 0.9617883
> [177,]       1        2 0.9900120
> [178,]       1        2 0.9918102
> [179,]       1        2 0.9939489
> [180,]       1        2 0.9935882
> [181,]       1        2 0.9917836
> [182,]       1        2 0.9939170
> [183,]       1        2 0.9892708
> [184,]       1        2 0.9924478
> [185,]       1        2 0.9932287
> [186,]       1        2 0.9640487
> [187,]       1        2 0.9150126
> [188,]       1        2 0.9917589
> [189,]       1        2 0.9919865
> [190,]       1        2 0.9937946
> [191,]       1        2 0.9888295
> [192,]       1        2 0.9926884
> [193,]       1        2 0.9909269
> [194,]       1        2 0.9751339
> [195,]       1        2 0.9862132
> [196,]       1        2 0.9841566
> [197,]       1        2 0.9936557
> [198,]       1        2 0.9938973
> [199,]       1        2 0.9934375
> [200,]       1        2 0.9914201
> [201,]       1        2 0.9893087
> [202,]       1        2 0.9915481
> [203,]       1        2 0.9819092
> [204,]       1        2 0.9898774
> [205,]       1        2 0.9926876
> [206,]       1        2 0.9917091
> [207,]       1        2 0.9903339
> [208,]       1        2 0.9764847
> [209,]       1        2 0.9920887
> [210,]       1        2 0.9526866
> [211,]       1        2 0.9938025
> [212,]       1        2 0.9783714
> [213,]       1        2 0.9938230
> [214,]       1        2 0.9880267
> [215,]       1        2 0.9923108
> [216,]       1        2 0.9901850
> [217,]       1        2 0.9938279
> [218,]       1        2 0.9873388
> [219,]       1        2 0.9929195
> [220,]       1        2 0.9934017
> attr(,"Ordered")
> [1] FALSE
> attr(,"call")
> silhouette.default(x = ifelse(cl1 == 3, 2, 1), dist = dist(x1)^2)
> attr(,"class")
> [1] "silhouette"
>
> ## other examples
>> set.seed(1234)
>> cl.tmp <- rep(2:3, each=5)
>> x.tmp <- c(rep(-1,5), abs(rnorm(5)+3))
>> silhouette(cl.tmp, dist(x.tmp))
>      cluster neighbor  sil_width
> [1,]       2        1        NaN
> [2,]       2        1        NaN
> [3,]       2        1        NaN
> [4,]       2        1        NaN
> [5,]       2        1        NaN
> [6,]       3        2 -0.5736515
> [7,]       3        2 -0.1557143
> [8,]       3        2 -0.2922523
> [9,]       3        2 -0.8340174
> [10,]       3        2 -0.1511875
> attr(,"Ordered")
> [1] FALSE
> attr(,"call")
> silhouette.default(x = cl.tmp, dist = dist(x.tmp))
> attr(,"class")
> [1] "silhouette"
>> silhouette(ifelse(cl.tmp==2,1,2), dist(x.tmp))
>      cluster neighbor  sil_width
> [1,]       1        2  1.0000000
> [2,]       1        2  1.0000000
> [3,]       1        2  1.0000000
> [4,]       1        2  1.0000000
> [5,]       1        2  1.0000000
> [6,]       2        1  0.4136253
> [7,]       2        1  0.7038917
> [8,]       2        1  0.6467668
> [9,]       2        1 -0.3360695
> [10,]       2        1  0.7054709
> attr(,"Ordered")
> [1] FALSE
> attr(,"call")
> silhouette.default(x = ifelse(cl.tmp == 2, 1, 2), dist = dist(x.tmp))
> attr(,"class")
> [1] "silhouette"
>> silhouette(ifelse(cl.tmp==2,1,3), dist(x.tmp))
>      cluster neighbor  sil_width
> [1,]       1        2        NaN
> [2,]       1        2        NaN
> [3,]       1        2        NaN
> [4,]       1        2        NaN
> [5,]       1        2        NaN
> [6,]       3        1 -0.7694686
> [7,]       3        1 -0.8167313
> [8,]       3        1 -0.6054665
> [9,]       3        1 -0.9037412
> [10,]       3        1  0.1875360
> attr(,"Ordered")
> [1] FALSE
> attr(,"call")
> silhouette.default(x = ifelse(cl.tmp == 2, 1, 3), dist = dist(x.tmp))
> attr(,"class")
> [1] "silhouette"
>
> _________________________________________________________________
>
> It’s free. http://im.live.com/messenger/im/home/?source=TAGHM
>
> <mime-attachment.txt>



More information about the R-help mailing list