[BioC] error in hclust function

Thomas Girke thomas.girke at ucr.edu
Sat Jun 2 17:14:40 CEST 2012


You probably forgot to remove the zero-variance columns from your matrix. In the
step where you see the error you are clustering the columns of bb, not its rows,
because the cor() function operates on the columns of a matrix, not its rows.
Your var/sd filter uses apply(bb, 1, ...), which checks the rows, so it cannot
catch a constant column. Running things stepwise might help to pinpoint the
problem:

## Sample data
> bb <- matrix(1:5, 5, 5, dimnames=list(paste("g", 1:5, sep=""), paste("t", 1:5, sep="")), byrow=TRUE)
> bb
   t1 t2 t3 t4 t5
g1  1  2  3  4  5
g2  1  2  3  4  5
g3  1  2  3  4  5
g4  1  2  3  4  5
g5  1  2  3  4  5

## cor() without t()
> cor(bb)
   t1 t2 t3 t4 t5
t1  1 NA NA NA NA
t2 NA  1 NA NA NA
t3 NA NA  1 NA NA
t4 NA NA NA  1 NA
t5 NA NA NA NA  1
Warning message:
In cor(bb) : the standard deviation is zero

## cor() with t()
> cor(t(bb))
   g1 g2 g3 g4 g5
g1  1  1  1  1  1
g2  1  1  1  1  1
g3  1  1  1  1  1
g4  1  1  1  1  1
g5  1  1  1  1  1

## hclust without t()
> hc <- hclust(as.dist(1-cor(bb)))
Error in hclust(as.dist(1 - cor(bb))) : 
  NA/NaN/Inf in foreign function call (arg 11)
In addition: Warning message:
In cor(bb) : the standard deviation is zero

## hclust with t()
> hc <- hclust(as.dist(1-cor(t(bb))))
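
To check your own data for this, something along these lines might help. This
is only a minimal sketch: the matrix m below is hypothetical example data (not
your bb), with one column deliberately made constant. The point is that the
variance test uses MARGIN = 2 (columns), because those are what cor() compares.

```r
## Hypothetical example data: 6 rows (e.g. genes) by 4 columns (e.g. samples).
## Column "t3" is set to a constant, so its standard deviation is zero.
set.seed(1)
m <- matrix(rnorm(24), 6, 4,
            dimnames=list(paste("g", 1:6, sep=""), paste("t", 1:4, sep="")))
m[, "t3"] <- 2

## cor() works on columns, so test the columns (MARGIN = 2) for zero variance
zero_var <- apply(m, 2, var) == 0
names(which(zero_var))

## Drop the constant columns, then cluster the remaining columns
m_ok <- m[, !zero_var, drop=FALSE]
d <- as.dist(1 - cor(m_ok, method="spearman"))
stopifnot(!anyNA(d))   # hclust() fails if the distances contain NA
hc <- hclust(d, method="complete")
```

If you want to cluster the rows instead, the same check applies with
MARGIN = 1, followed by cor(t(m_ok), ...).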

Thomas

On Sat, Jun 02, 2012 at 12:57:46PM +0000, Alyaa Mahmoud wrote:
> Hi All
> 
> I am trying to cluster 57 COGs in 24 datasets. I use the following code and
> run into this error:
> 
> hc = NULL
> hc <- hclust(as.dist(1-cor(as.matrix(bb), method="spearman")),
> method="complete", members=NULL)
> 
> Error in hclust(as.dist(1 - cor(as.matrix(bb), method = "spearman")),  :
>   NA/NaN/Inf in foreign function call (arg 11)
> In addition: Warning message:
> In cor(as.matrix(bb), method = "spearman") : the standard deviation is zero
> 
> hr = NULL
> hr <- hclust(as.dist(1-cor(t(as.matrix(bb)), method="spearman")),
> method="complete", members=NULL)
> 
> I tried to remove any rows that have an sd of zero, but there were none:
> ind <- apply(bb, 1, var) == 0
> subset <- bb[!ind,]
> 
> or
> 
> ind <- apply(bb, 1, sd) == 0
> subset <- bb[!ind,]
> 
> 
> Any clue what the problem could be?
> 
> Thanks a lot for your help
> yours,
> Alyaa
> -- 
> Alyaa Mahmoud
> 
> "Love all, trust a few, do wrong to none"- Shakespeare
> 


