[Rd] k means

friedrich.leisch at stat.uni-muenchen.de
Mon May 19 16:35:19 CEST 2008


>>>>> On Sat, 17 May 2008 00:54:55 +0200,
>>>>> cgenolin  (c) wrote:

  > Hi list,
  > I am trying flexclust, but I cannot see what is wrong in my
  > (very simple) code...
  > Would you have a few minutes to check it?

  > Thanks for your help.

  > Christophe
  > --- 8< --------------------------------
  > data  <- rbind(c(1,2 ,NA,4 ),
  >                c(1,1 ,NA,1 ),
  >                c(2,3 ,4 ,5 ),
  >                c(2,2 ,2 ,2 ),
  >                c(3,NA,NA,6 ),
  >                c(3,NA,NA,3 ),
  >                c(2,4 ,4 ,NA),
  >                c(2,3 ,2 ,NA))

  > distTest <- rbind(c(0,0,0,0),
  >                   c(1,1,1,1))

  > distNA <- function(x,centers){
  >     z <- matrix(0,nrow=nrow(x),ncol=nrow(centers))
  >     for(k in 1:nrow(centers)){
  >         z[,k]<- apply(x,1,function(x){dist(rbind(x,centers[k,]))})
  >     }
  >     z
  > }

  > distNA(data,distTest)

  > km <- kccaFamily(dist=distNA,cent=colMeans)
  > kcca(x=data,k=2,family=km)
  > kcca(x=data,k=3,family=km)

I don't think this is really appropriate for r-devel; you should
ask either the package author (me) or r-help.

Anyway, colMeans does not remove missing values by default, so any
column containing an NA gets an NA centroid. You also need a special
function for the centroid computation:

R> centNA <- function(x) colMeans(x, na.rm=TRUE)
R> km <- kccaFamily(dist=distNA,cent=centNA)
R> kcca(x=data,k=2,family=km)
kcca object of family ‘distNA’ 

call:
kcca(x = data, k = 2, family = km)

cluster sizes:

1 2 
5 3 
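
To see why the centroid function matters, here is a minimal sketch in
base R only (no flexclust needed). The variable names cluster_rows,
sparse_rows and the three *_centroid objects are made up for this
illustration; the rows are taken from the example data above, and
their grouping into "clusters" is only an assumption for the demo.

```r
# Rows 1 and 3 of the example data, treated here as one hypothetical cluster
cluster_rows <- rbind(c(1, 2, NA, 4),
                      c(2, 3, 4,  5))

# Default colMeans: a single NA in a column makes that column's mean NA
default_centroid <- colMeans(cluster_rows)

# Centroid function from the reply: means over the observed values only
centNA <- function(x) colMeans(x, na.rm = TRUE)
na_centroid <- centNA(cluster_rows)

# Rows 5 and 6 of the example data: columns 2 and 3 are entirely missing,
# so even with na.rm = TRUE those centroid coordinates come out as NaN
sparse_rows <- rbind(c(3, NA, NA, 6),
                     c(3, NA, NA, 3))
sparse_centroid <- centNA(sparse_rows)

print(default_centroid)
print(na_centroid)
print(sparse_centroid)
```

The last case is worth keeping in mind: na.rm=TRUE only helps as long
as each cluster has at least one observed value in every column.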


Hope this helps,
Fritz

-- 
-----------------------------------------------------------------------
Prof. Dr. Friedrich Leisch 

Institut für Statistik                          Tel: (+49 89) 2180 3165
Ludwig-Maximilians-Universität                  Fax: (+49 89) 2180 5308
Ludwigstraße 33
D-80539 München                     http://www.statistik.lmu.de/~leisch
-----------------------------------------------------------------------
   Journal Computational Statistics --- http://www.springer.com/180 
          Münchner R Kurse --- http://www.statistik.lmu.de/R


