[R] manipulating (extracting) data from distance matrices

Michael Rennie mdrennie at gmail.com
Tue Jul 15 15:07:09 CEST 2008


Hi all,

Does anyone have any tips for extracting chunks of data from a distance matrix?

For instance, if one were interested in only a subset of distance
comparisons (e.g., those among rows 4 through 6, and no others), is
there a simple way to pull that data out?

From some playing around with an example (below), I've been able to
figure out that a distance matrix in R is stored as a single vector,
running top to bottom and left to right, so if you know the size of
your distance matrix, you can figure out which elements to query and
stick them together using c().
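
As a rough sketch of that indexing rule: the help page for dist() gives the position of the distance between observations i and j (for i < j <= n) in the stored vector as n*(i-1) - i*(i-1)/2 + j-i. The helper name dist_index and the random test data below are made up for illustration only.

```r
# Hypothetical helper (not part of base R): position of the (i, j) distance
# inside a "dist" vector for n observations, valid for i < j <= n.
dist_index <- function(i, j, n) {
  stopifnot(all(i < j), all(j <= n))
  n * (i - 1) - i * (i - 1) / 2 + j - i
}

# Illustrative data only: 10 observations, 3 variables
set.seed(1)
x <- matrix(rnorm(30), nrow = 10)
d <- dist(x)
n <- attr(d, "Size")

# Distance between observations 2 and 4, looked up two ways;
# both lines should print the same value
d[dist_index(2, 4, n)]
as.matrix(d)[4, 2]
```

With n = 10 this reproduces the positions used by hand in the example: dist_index(1, 2, 10) is 1, dist_index(2, 3, 10) is 10, and dist_index(3, 4, 10) is 18.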

However, all this stuff is still indexed by the "Labels" attribute.
Does anyone know of a way to use that to pull out subsets from the
distance matrix in a simpler manner than my example code below?

##############
# ex_dist.R
# example for
# manipulating
# distance matrices
####################

set.seed(12345)

a<-sample(20:40, 10)
b<-sample(80:100, 10)
c<-sample(0:40, 10)

dat<-data.frame(a,b,c)
dat

dmat<-dist(dat, method="euclidean")
dmat

dmat[1:6] #vector that stores the distance matrix runs down
#columns, left to right

#in a 10-element distance matrix, column lengths are 9,8,7,6....1

#get comparisons of rows 1:4 (from dat) ONLY
#top-left matrix will consist of top 3 of first column, top 2 of
#second col, top 1 of third col.

topleft<-c(dmat[1:3],dmat[10:11],dmat[18])
topleft

#get comparisons of rows 9:10 against rows 1:2 (from dat) ONLY
#bottom-left 4 of the printed triangle: last 2 of first column,
#last 2 of second col.

bottomleft<-c(dmat[8:9],dmat[16:17])
bottomleft

#######end#####

I'm sure there's a simpler way to do this using the labels of the
distance matrix, but I can't see it. I've thought of converting it
using as.matrix(), which would allow me to pull out particular rows,
but I'm only interested in the triangular matrix. Now, if there was a
way to as.matrix(dmat) such that I got the bottom triangular matrix
and zeros elsewhere, then I'd be in business. Any suggestions on how
to pull that off would be helpful.
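
One possible way to get there, offered as a sketch rather than a definitive answer: convert with as.matrix(), zero the upper triangle with upper.tri(), and use as.dist() to turn any square subset back into a "dist" object. The data are rebuilt from the example above; the names m, m_lower, rows, sub and block are made up for illustration.

```r
# Rebuild the example data
set.seed(12345)
dat <- data.frame(a = sample(20:40, 10),
                  b = sample(80:100, 10),
                  c = sample(0:40, 10))
dmat <- dist(dat, method = "euclidean")

# Full symmetric matrix; dimnames are taken from the "Labels" attribute
# (or 1..n if there is none)
m <- as.matrix(dmat)

# Lower triangle kept, zeros on and above the diagonal
m_lower <- m
m_lower[upper.tri(m_lower)] <- 0

# Comparisons among rows 4 through 6 only, selected by label and
# converted back to a "dist" object
rows <- as.character(4:6)
sub <- as.dist(m[rows, rows])
sub

# Comparisons of rows 9:10 against rows 1:2, as a 2 x 2 block
block <- m[c("9", "10"), c("1", "2")]
block
```

Subsetting the full matrix by label avoids having to work out positions in the underlying vector by hand, at the cost of temporarily storing both triangles.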

I'm certainly interested in any tips or tricks anyone might have for
working with distance matrices, or any material that people can point
me towards.

Cheers,

Mike

--
Michael D. Rennie
Ph.D. Candidate
University of Toronto at Mississauga
3359 Mississauga Rd. N.
Mississauga, ON L5L 1C6
Ph: 905-828-5452 Fax: 905-828-3792
www.utm.utoronto.ca/~w3rennie


