[R] locating element in distance matrix

David L Carlson dcarlson at tamu.edu
Fri Jan 11 23:41:13 CET 2013


If you really have a matrix to begin with, yes. But if you generated it from
dist() or a related function, you would have to convert it to a matrix first,
roughly doubling the memory needed (n^2 values instead of n(n-1)/2). The
various hierarchical cluster functions usually want a dist object.

> dm <- dist(x, diag=TRUE, upper=TRUE)
> str(dm)
Class 'dist'  atomic [1:45] 3.84 4.09 3.64 4.94 4.33 ...
  ..- attr(*, "Size")= int 10
  ..- attr(*, "Diag")= logi TRUE
  ..- attr(*, "Upper")= logi TRUE
  ..- attr(*, "method")= chr "euclidean"
  ..- attr(*, "call")= language dist(x = x, diag = TRUE, upper = TRUE)
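
To put a rough number on the memory remark above, here is a minimal sketch
that counts the stored values in a dist object and in its as.matrix()
conversion. The names d, m, n_big and ratio are only illustrative, and the
count ignores the small overhead of attributes and dimnames:

```r
set.seed(42)
x <- matrix(rnorm(100), 10, 10)   # toy data: 10 observations, 10 variables
d <- dist(x)                      # dist object: lower triangle only
m <- as.matrix(d)                 # full symmetric matrix

n <- attr(d, "Size")
length(d)   # n * (n - 1) / 2 stored distances (45 for n = 10)
length(m)   # n * n stored distances (100 for n = 10)

# For a 1000 x 1000 problem the ratio of stored values is close to 2
n_big <- 1000
ratio <- n_big^2 / (n_big * (n_big - 1) / 2)
```

The ratio works out to 2n/(n-1), so it is slightly above 2 and approaches 2
as n grows.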

In dist(), diag=TRUE and upper=TRUE refer only to how the matrix is
displayed. It is still stored as a single vector:

> round(dm, 3)
       1     2     3     4     5     6     7     8     9    10
1  0.000 3.843 4.094 3.643 4.935 4.328 4.288 6.205 6.197 2.181
2  3.843 0.000 5.085 5.171 5.067 3.788 4.384 5.770 7.113 2.830
3  4.094 5.085 0.000 3.571 4.548 4.103 3.532 3.917 6.470 3.734
4  3.643 5.171 3.571 0.000 3.821 3.843 3.667 5.513 5.176 3.294
5  4.935 5.067 4.548 3.821 0.000 4.815 3.465 5.918 6.138 4.764
6  4.328 3.788 4.103 3.843 4.815 0.000 2.794 3.937 5.475 3.023
7  4.288 4.384 3.532 3.667 3.465 2.794 0.000 4.075 5.251 4.010
8  6.205 5.770 3.917 5.513 5.918 3.937 4.075 0.000 5.511 5.152
9  6.197 7.113 6.470 5.176 6.138 5.475 5.251 5.511 0.000 6.168
10 2.181 2.830 3.734 3.294 4.764 3.023 4.010 5.152 6.168 0.000
> dm[1]
[1] 3.843183
> dm[2, 1]
Error in dm[2, 1] : incorrect number of dimensions
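
If you only need a few individual entries, you can compute the position in
the vector directly instead of converting. The sketch below uses the index
formula given in ?dist; the helper name dist_pos is my own (not part of any
package), and it assumes i != j, since the diagonal is not stored:

```r
# Position of the (i, j) distance inside a dist object of size n.
# Assumes i != j; works for either ordering of i and j.
dist_pos <- function(i, j, n) {
  lo <- pmin(i, j)   # column index within the lower triangle
  hi <- pmax(i, j)   # row index within the lower triangle
  n * (lo - 1) - lo * (lo - 1) / 2 + hi - lo
}

set.seed(42)
x <- matrix(rnorm(100), 10, 10)
d <- dist(x)
n <- attr(d, "Size")

d[dist_pos(2, 1, n)]    # what dm[2, 1] was meant to return
as.matrix(d)[2, 1]      # same value, at the cost of building the full matrix
```

Because pmin() and pmax() are vectorized, dist_pos() also accepts vectors of
row and column numbers.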

----------------------------------------------
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77843-4352




> -----Original Message-----
> From: David Winsemius [mailto:dwinsemius at comcast.net]
> Sent: Friday, January 11, 2013 4:21 PM
> To: dcarlson at tamu.edu
> Cc: 'eliza botto'; r-help at r-project.org
> Subject: Re: [R] locating element in distance matrix
> 
> 
> On Jan 11, 2013, at 1:51 PM, David L Carlson wrote:
> 
> > If you have a dist object (created by dist()) or if you used
> > lower.tri(x) to extract the lower triangle of the matrix, which() will
> > not work since the matrix is now stored as a numeric vector with
> > n(n-1)/2 elements where n is the number of rows/columns. In that case
> > you must compute the original row/column values from the position
> > along the vector:
> >
> >> dwhich <- function(d, indx) {
> > +     i <- round((1+sqrt(1+8*length(d)))/2, 0)
> > +     rowd <- unlist(sapply(2:i, function(x) x:i))
> > +     cold <- rep(1:(i-1), (i-1):1)
> > +     return(data.frame(indx=indx, row=rowd[indx], col=cold[indx]))
> > + }
> 
> Wouldn't it be easier to leave the distance matrix structure intact and
> just make the diagonal and upper.tri positions Inf?
> 
> > dwhich <- function(d) {
> +      d[row(d) <= col(d)] <- Inf
> +       which(d == min(d,na.rm=FALSE), arr.ind=TRUE)
> +  }
> > dwhich(dm)
>    row col
> 10  10   1
> 
> --
> 
> >> set.seed(42)
> >> x <- matrix(rnorm(100), 10, 10)
> >> d <- dist(x)
> >> dm <- as.matrix(dist(x, diag=TRUE, upper=TRUE))
> >> dm <- dm[lower.tri(dm)]
> >> dwhich(d, which(d==min(d)))
> >  indx row col
> > 1    9  10   1
> >> dwhich(dm, which(dm==min(dm)))
> >  indx row col
> > 1    9  10   1
> >
> >
> >> -----Original Message-----
> >> From: r-help-bounces at r-project.org
> >> [mailto:r-help-bounces at r-project.org] On Behalf Of David Winsemius
> >> Sent: Friday, January 11, 2013 12:37 PM
> >> To: eliza botto
> >> Cc: r-help at r-project.org
> >> Subject: Re: [R] locating element in distance matrix
> >>
> >>
> >> On Jan 11, 2013, at 9:55 AM, eliza botto wrote:
> >>
> >>>
> >>> Dear useRs,
> >>> I have a very basic question. I have a distance matrix and I skipped
> >>> the upper part of it deliberately.
> >>
> >> I have no idea what that means. Code is always helpful in resolving
> >> ambiguities.
> >>
> >>> The distance matrix is 1000*1000. Then I used the "min" command to
> >>> extract the lowest value from that matrix. Now I want to know the
> >>> location of that lowest element. More precisely, the row and
> >>> column number of that lowest element.
> >>> Thanks in advance
> >>
> >> ?which
> >> which( distmat == min(distmat), arr.ind=TRUE)
> >>
> >> (It's possible to have more than one match and it would be up to you
> >> to decide how to break ties.)
> >>
> >> --
> >>
> >> David Winsemius, MD
> >> Alameda, CA, USA
> >>
> >
> 
> David Winsemius
> Alameda, CA, USA



