[R] agnes clustering and NAs

Uwe Ligges ligges at statistik.tu-dortmund.de
Thu Jan 27 10:45:50 CET 2011



On 27.01.2011 05:00, Dario Strbenac wrote:
> Hello,
>
> The documentation for agnes in the package 'cluster' says that NAs are allowed, and sure enough it works for a small example like this:
>
>> m <- matrix(c(
> 1, 1, 1, 2,
> 1, NA, 1, 1,
> 1, 2, 2, 2), nrow = 3, byrow = TRUE)
>> agnes(m)
> Call:    agnes(x = m)
> Agglomerative coefficient:  0.1614168
> Order of objects:
> [1] 1 2 3
> Height (summary):
>     Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
>    1.155   1.247   1.339   1.339   1.431   1.524
>
> Available components:
> [1] "order"  "height" "ac"     "merge"  "diss"   "call"   "method" "data"
>
> But I have a large matrix (23371 rows, 50 columns) with some NAs in it. agnes() runs for about a minute, then gives an error:
>
>> agnes(iMatrix)
> Error in agnes(iMatrix) :
>    No clustering performed, NA-values in the dissimilarity matrix.
>
> I've also tried removing the rows that consist entirely of NAs, and it still gave me the same error. Is this a bug in agnes()? It doesn't seem to fulfil the claim made by its documentation.


I haven't looked at the file, but you need to get rid of all NAs, in
other words, remove all rows that contain *any* NA values.
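
A minimal sketch of that clean-up, assuming iMatrix is a numeric matrix as
in the question. The small matrix below is made-up stand-in data, not the
poster's file, and the names keep, mComplete and result are introduced here
only for illustration.

```r
library(cluster)

# Made-up stand-in for iMatrix: 5 rows, two of which contain an NA
iMatrix <- matrix(c(
  1,  1,  1, 2,
  1, NA,  1, 1,
  1,  2,  2, 2,
  3,  3, NA, 4,
  4,  4,  5, 5), nrow = 5, byrow = TRUE)

# complete.cases() is TRUE for rows that contain no NA at all
keep <- complete.cases(iMatrix)
mComplete <- iMatrix[keep, , drop = FALSE]

sum(!keep)                  # number of rows that were dropped
result <- agnes(mComplete)  # cluster only the complete rows
result$order
```

The dropped rows are not assigned to any cluster; which(!keep) lists their
row numbers in the original matrix.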

Uwe Ligges



> The matrix I'm using can be obtained here:
> http://129.94.136.7/file_dump/dario/iMatrix.obj
>
> --------------------------------------
> Dario Strbenac
> Research Assistant
> Cancer Epigenetics
> Garvan Institute of Medical Research
> Darlinghurst NSW 2010
> Australia


