[R] manipulating (extracting) data from distance matrices

Jon Olav Skoien j.skoien at geo.uu.nl
Tue Jul 15 17:43:19 CEST 2008


Maybe

dmat<-dist(dat, method="euclidean",upper = TRUE,diag = TRUE)

can fix your problem with the triangular matrix?

Cheers
Jon
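
A note on the suggestion above: according to the help page for dist(), the upper and diag arguments only control how the object is printed, so as.matrix(dmat) returns the full symmetric matrix either way. One possible way to get the lower triangle with zeros elsewhere, and the block sums discussed in the quoted messages below, is sketched here. It assumes the example data from the original post, with the seed set through set.seed(12345); the names low, within_sum and between_sum are only illustrative.

```r
# example data as in the original post (assumption: seed set via set.seed())
set.seed(12345)
dat <- data.frame(a = sample(20:40, 10),
                  b = sample(80:100, 10),
                  c = sample(0:40, 10))
dmat <- dist(dat, method = "euclidean")

f <- as.matrix(dmat)          # full symmetric matrix, zeros on the diagonal

# keep the lower triangle and set everything else to zero
low <- f
low[!lower.tri(low)] <- 0

# sum of distances among observations 4 to 6, each pair counted once
within_sum <- sum(low[4:6, 4:6])

# sum of distances between observations 7:10 and 4:6;
# this block does not cross the diagonal, so f and low agree here
between_sum <- sum(f[7:10, 4:6])
```

Summing the same within-group block of f and dividing by two should give the same number as within_sum, which matches the shortcut Mike mentions.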

Michael Rennie wrote:
> Not really,
>
> I'd actually want
>
> f[4:6,4:6]
>
> to get comparisons of observations 4 to 6 only. And I'm still left
> with the upper triangular matrix. This is a problem since I want to
> sum the distances over the blocks that I am extracting.
>
> Then again, I could just divide the sum by two and get the answer....
>
> And, if I want to sum blocks comparing distances among two groups, say
>
> f[7:10,4:6]
>
> then I'm in the triangular matrix and not crossing the diagonal
> anymore, so I should be okay.
>
> I think I may have my answer, but any other tips are more than welcome.
>
> Cheers,
>
> Mike
>
> On Tue, Jul 15, 2008 at 9:35 AM, stephen sefick <ssefick at gmail.com> wrote:
>   
>> how about this
>> f <- as.matrix(dmat)
>> f[,4:6]
>> #you get repeats but I think this is what you want
>>
>> On Tue, Jul 15, 2008 at 9:07 AM, Michael Rennie <mdrennie at gmail.com> wrote:
>>     
>>> Hi all,
>>>
>>> Does anyone have any tips for extracting chunks of data from a distance
>>> matrix?
>>>
>>> For instance, if one was interested in only a subset of distance
>>> comparisons (i.e., that of rows 4 thru 6, and no others), is there a
>>> simple way to pull that data out?
>>>
>>> From some playing around with an example (below), I've been able to
>>> figure out that a distance matrix in R is stored as a single vector,
>>> running top to bottom and left to right, so if you know the size of
>>> your distance matrix, you can figure out which elements to query and
>>> stick them together using c().
>>>
>>> However, all this stuff is still indexed by the "labels" attribute.
>>> Does anyone know of a way to use that to pull out subsets from the
>>> distance matrix in a simpler manner than my example code below?
>>>
>>> ##############
>>> # ex_dist.R
>>> # example for
>>> # manipulating
>>> # distance matrices
>>> ####################
>>>
>>> set.seed(12345)
>>>
>>> a<-sample(20:40, 10)
>>> b<-sample(80:100, 10)
>>> c<-sample(0:40, 10)
>>>
>>> dat<-data.frame(a,b,c)
>>> dat
>>>
>>> dmat<-dist(dat, method="euclidean")
>>> dmat
>>>
>>> dmat[1:6] #vector that stores the distance matrix runs descending down
>>> #columns, left to right
>>>
>>> #in a 10-element distance matrix, column lengths are 9,8,7,6....1
>>>
>>> #get comparisons of rows 1:4 (from dat) ONLY
>>> #top-left matrix will consist of top 3 of first column, top 2 of
>>> #second col, top 1 of third col.
>>>
>>> topleft<-c(dmat[1:3],dmat[10:11],dmat[18])
>>> topleft
>>>
>>> #get comparisons of rows 9:10 with rows 1:2 (from dat)
>>> #bottom left 4 of the printed matrix; the 9 vs 10 distance alone is dmat[45]
>>>
>>> bottomleft<-c(dmat[8:9],dmat[16:17])
>>> bottomleft
>>>
>>> #######end#####
>>>
>>> I'm sure there's a simpler way to do this using the labels of the
>>> distance matrix, but I can't see it. I've thought of converting it
>>> using as.matrix(), which would allow me to pull out particular rows,
>>> but I'm only interested in the triangular matrix. Now, if there was a
>>> way to as.matrix(dmat) such that I got the bottom triangular matrix
>>> and zeros elsewhere, then I'd be in business. Any suggestions on how
>>> to pull that off would be helpful.
>>>
>>> I'm certainly interested in any tips or tricks anyone might have for
>>> working with distance matrices, or any material that people can point
>>> me towards.
>>>
>>> Cheers,
>>>
>>> Mike
>>>
>>> --
>>> Michael D. Rennie
>>> Ph.D. Candidate
>>> University of Toronto at Mississauga
>>> 3359 Missisagua Rd. N.
>>> Mississauga, ON L5L 1C6
>>> Ph: 905-828-5452 Fax: 905-828-3792
>>> www.utm.utoronto.ca/~w3rennie
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>       
>>
>> --
>> Let's not spend our time and resources thinking about things that are so
>> little or so large that all they really do for us is puff us up and make us
>> feel like gods. We are mammals, and have not exhausted the annoying little
>> problems of being mammals.
>>
>> -K. Mullis
>>     
>
>
>
>
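
On the original question about querying the stored vector directly: the help page for dist() gives the position of the distance between observations i and j (i < j) in a dist object of size n as n*(i-1) - i*(i-1)/2 + j - i. A rough sketch built on that formula follows. dist_index and dist_subset are hypothetical helper names, not part of base R, and the example data repeat those from the original post.

```r
# position of the distance between observations i and j (i != j)
# in the vector behind a "dist" object of size n
dist_index <- function(i, j, n) {
  lo <- pmin(i, j)
  hi <- pmax(i, j)
  n * (lo - 1) - lo * (lo - 1) / 2 + hi - lo
}

# all pairwise distances among a subset of observations (at least two)
dist_subset <- function(d, idx) {
  n <- attr(d, "Size")
  pairs <- combn(idx, 2)      # one column per pair of observations
  d[dist_index(pairs[1, ], pairs[2, ], n)]
}

set.seed(12345)
dat <- data.frame(a = sample(20:40, 10),
                  b = sample(80:100, 10),
                  c = sample(0:40, 10))
dmat <- dist(dat, method = "euclidean")

dist_subset(dmat, 1:4)    # same values as c(dmat[1:3], dmat[10:11], dmat[18])
dist_subset(dmat, 9:10)   # the single distance between observations 9 and 10

# an alternative that keeps the labels: subset the full matrix, convert back
sub46 <- as.dist(as.matrix(dmat)[4:6, 4:6])
```

The as.dist() route is probably the simpler of the two when the labels matter, since the result is again a dist object for just the chosen observations.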


