[R] NAs introduced by coercion in dist()

Petr PIKAL petr.pikal at precheza.cz
Thu May 3 08:47:54 CEST 2007


r-help-bounces at stat.math.ethz.ch wrote on 02.05.2007 16:47:55:

> 
> It was suggested that the 'NAs introduced by coercion' message might be
> warning me that my data are not what they should be.  I checked this using
> str(PeaksMatrix), as suggested, and the data seem to be what I thought they
> were: 
> 
> 'data.frame':   335 obs. of  127 variables:
>  $ Code   : Factor w/ 335 levels "A1MR","A1MU",..: 1 2 3 4 5 6 7 8 9 10 ...
>  $ P3.70  : num  0 0 0 0 0 0 0 0 0 0 ...
>  $ P3.97  : num  0 0 0 0 0 0 0 0 0 0 ...
>  $ P4.29  : num  0 0 0 0 0 0 0 0 0 0 ...
>  $ P4.90  : num  0 0 0 0 0 0 0 0 0 0 ...
>  $ P6.30  : num  0 0 0 0 0 0 0 0 0 0 ...
>  $ P6.45  : num  7.73 0 0 0 0 0 4.03 0 0 0 ...
>  $ P6.55  : num  0 0 0 0 0 0 0 0 0 0 ...
> 
> ...
> 
> I do have 335 observations and 127 variables, named P3.70, P3.97, P4.29,
> etc.  This was a relief, but I still don't know whether the distance matrix
> is what it should be.  I tried 'str(dist.PxMx)', which is the name of my
> distance matrix, but I get something that does not mean much to me, an
> inexperienced R user:
> 
> Class 'dist'  atomic [1:55945] 329.6 194.9 130.1  70.7 116.9 ...
>   ..- attr(*, "Size")= int 335
>   ..- attr(*, "Labels")= chr [1:335] "1" "2" "3" "4" ...
>   ..- attr(*, "Diag")= logi FALSE
>   ..- attr(*, "Upper")= logi FALSE
>   ..- attr(*, "method")= chr "euclidean"
>   ..- attr(*, "call")= language dist(x = PeaksMatrix, method = "euclidean",
>     diag = FALSE, upper = FALSE, p = 2)
> 
> Any more suggestions, please?

Well, it seems that you have the data you want, but it is not clear to me 
why you cannot see them.
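
One likely source of the warning, judging from the str() output you posted, 
is the factor column Code: dist() needs a numeric matrix, and labels such 
as "A1MR" cannot be turned into numbers, so that column becomes NA. Below is 
a minimal sketch of that idea. The data frame toy and its columns Code, P1 
and P2 are made up for illustration; they are not your data.

```r
# Hypothetical toy data: one factor column plus two numeric columns.
toy <- data.frame(Code = factor(c("A1", "A2", "A3")),
                  P1   = c(0, 3, 0),
                  P2   = c(0, 4, 8))

# as.matrix() on mixed column types gives a character matrix ...
m <- as.matrix(toy)
is.character(m)

# ... and converting the labels to numbers yields NA (the warning is
# suppressed here only to keep the output short).
code_as_number <- suppressWarnings(as.numeric(m[, "Code"]))
code_as_number

# Keeping only the numeric columns avoids the coercion altogether.
num_cols <- vapply(toy, is.numeric, logical(1))
d_clean  <- dist(toy[, num_cols], method = "euclidean")
d_clean
```

For your data the same selection would presumably be PeaksMatrix[, -1], 
assuming Code is the first and only non-numeric column.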

I tried:

x <- sample(0:2, 100, replace = TRUE)
dim(x) <- c(10, 10)
x
      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
 [1,]    0    1    0    0    1    1    1    1    0     1
 [2,]    0    1    0    2    1    0    2    0    0     2
 [3,]    0    2    0    0    0    1    1    0    1     2
...
[10,]    1    2    0    0    1    2    0    2    1     0
xx <- data.frame(var = c("a", "b"), x)
xx
   var X1 X2 X3 X4 X5 X6 X7 X8 X9 X10
1    a  0  1  0  0  1  1  1  1  0   1
2    b  0  1  0  2  1  0  2  0  0   2
....
9    a  1  1  0  1  1  0  0  2  2   0
10   b  1  2  0  0  1  2  0  2  1   0

dist(xx, method = 'euclidean', diag = FALSE, upper = FALSE)
          1        2        3        4        5        6        7        8        9
2  2.966479
3  2.345208 3.146427
4  3.633180 3.633180 4.571652
5  4.195235 5.549775 4.571652 4.195235
6  4.195235 4.195235 4.062019 3.924283 3.924283
7  1.816590 3.781534 3.316625 3.781534 3.781534 4.806246
8  2.774887 4.571652 3.633180 4.062019 4.062019 4.806246 3.316625
9  3.316625 4.449719 4.062019 4.449719 3.316625 4.449719 2.774887 3.146427
10 2.774887 5.029911 3.633180 4.324350 3.146427 4.324350 2.569047 2.966479 2.774887

xxx <- dist(xx, method = 'euclidean', diag = FALSE, upper = FALSE)
Warning message:
NAs introduced by coercion 
str(xxx)
Class 'dist'  atomic [1:45] 2.97 2.35 3.63 4.20 4.20 ...
  ..- attr(*, "Size")= int 10
  ..- attr(*, "Diag")= logi FALSE
  ..- attr(*, "Upper")= logi FALSE
  ..- attr(*, "method")= chr "euclidean"
  ..- attr(*, "call")= language dist(x = xx, method = "euclidean", diag = FALSE, upper = FALSE)

This seems to be similar to what you get, so I wonder why you do not see 
your matrix. Try dist.PxMx[1:50] or head(dist.PxMx) to see whether you can 
get something from it.
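
A rough sketch of such checks on made-up data follows; the names d_all, 
d_num and ratio and the seed are my own choices. It also compares the 
result with and without the non-numeric column. As far as I read ?dist, 
when a column is excluded because of NA the Euclidean sum is scaled up in 
proportion to the number of columns used, so the two results should differ 
by a constant factor rather than agree exactly.

```r
set.seed(1)                                  # arbitrary seed, for repeatability
x  <- matrix(sample(0:2, 100, replace = TRUE), nrow = 10)
xx <- data.frame(var = factor(rep(c("a", "b"), 5)), x)

d_all <- suppressWarnings(dist(xx))          # factor column left in (11 columns)
d_num <- dist(xx[, -1])                      # numeric columns only (10 columns)

head(d_all)                                  # first few distances
round(as.matrix(d_all)[1:4, 1:4], 2)         # corner of the full square matrix
anyNA(d_all)                                 # are there NAs among the distances?

# Ratio between the two versions; if the scaling works as I understand it,
# every defined ratio should be close to sqrt(11 / 10).
ratio <- as.vector(d_all) / as.vector(d_num)
range(ratio, na.rm = TRUE)
sqrt(11 / 10)
```

If the same holds for your data, the distances computed with Code left in 
would be inflated by a factor of about sqrt(127 / 126), which is small but 
not zero, so dropping that column before calling dist() seems the cleaner 
option.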

Regards
Petr

> 
> 
> 
> Silvia Lomascolo wrote:
> > 
> > I work with Windows and use R version 2.4.1. I am JUST starting to learn
> > this program...
> > 
> > I get this warning message 'NAs introduced by coercion' while trying to
> > build a distance matrix (to be analyzed with NMDS later) from a 336 x 100
> > data matrix.  The original matrix has lots of zeros and no missing values,
> > but I don't think this should matter.
> > 
> > I searched this forum and people have suggested that the warning should be
> > ignored, but when I try to print the distance matrix I only get the row
> > numbers (the matrix seems to be 'empty') and I am not able to judge
> > whether the matrix worked or not.
> > 
> > To get the distance matrix I wrote:
> > dist.PxMx <- dist (PeaksMatrix, method='euclidean', diag=FALSE,
> > upper=FALSE)
> > 
> > I tried including the p argument (included in the help for dist()) and
> > leaving it out, but that didn't seem to change anything.  I think that's
> > required for one distance measure though, not for euclidean dist. 
> > 
> > Should I really ignore this warning? If so, why am I not able to see
> > the distance matrix?
> > 
> 


