[R] Sampling the Distance Matrix
dwinsemius at comcast.net
Fri Sep 25 01:29:51 CEST 2015
On Sep 24, 2015, at 1:54 PM, Lorenzo Isella wrote:
> On Thu, Sep 24, 2015 at 01:30:02PM -0700, David Winsemius wrote:
>> On Sep 24, 2015, at 12:36 PM, Lorenzo Isella wrote:
>>> And thanks for your reply.
>>> Essentially, your script gets the job done.
>>> For instance, if I run
>>> mm <- cbind(5/(1:5), -2*sqrt(1:5))
>>> dst <- dist(mm)
>>> dst2 <- as.matrix(dst)
>>> diag(dst2) <- NA
>>> idx <- which(apply(dst2, 1, function(x) all(na.omit(x)>.9)))
>>> then it correctly detects the first two rows, where all the values are
>>> larger than 0.9.
>>> In other words, it detects the points that are at least 0.9 units away
>>> from *all* the other points.
>>> My other question (I did not realize this until I got your answer) is
>>> the following: I have the distance matrix of a set of N points.
>>> You gave me an algorithm to find all the points that are at least 0.9
>>> units away from every other point.
>>> However, in some cases a weaker condition is also OK for me: find
>>> a subset of k points (with k tunable) whose distance *from each other*
>>> is greater than 0.9 units (even if their distance from some other
>>> points may be smaller than 0.9).
>> If I understand ..... Make a matrix of unique combinations (one combination per column), then apply over the columns to find the ones whose points all satisfy the distance criterion:
>> mtxcomb <- combn(1:20, 5)
>> goodcls <- apply(mtxcomb , 2, function(idx) all( dist( cbind( x[idx], y[idx]) ) > 0.9))
>> mtxcomb [ , goodcls]
>> In my sample it was around 9% of the total 5 item combinations.
>> snipped a lot of output:
>>      [,1440] [,1441]
>> [1,]      12      13
>> [2,]      13      16
>> [3,]      16      17
>> [4,]      19      19
>> [5,]      20      20
>> > dim( mtxcomb)
>> [1]     5 15504
> Thanks for your reply.
> I think I am getting there, but when I run your commands, I get this
> error message
> Error in cbind(x[idx], y[idx]) : object 'x' not found
> Any idea why? Should I combine those 3 lines with something else?
No idea. I was running the setup that you asked for in your original message which you have now omitted from the mail chain.
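For what it is worth, the three lines above only run if coordinate vectors named x and y already exist in the workspace. Here is a minimal self-contained sketch, assuming x and y are simply the two columns of the mm matrix from your example (that is an assumption, the original setup is not in this chain). The second half does the same check directly on a distance matrix, using a hypothetical helper far_apart_subsets() that is not part of base R or of the earlier code:

```r
# Points from the example earlier in the thread
mm <- cbind(5/(1:5), -2*sqrt(1:5))

# Assumption: x and y are just the coordinate columns of mm
x <- mm[, 1]
y <- mm[, 2]

k <- 3                                  # tunable subset size
mtxcomb <- combn(seq_along(x), k)       # one k-subset of point indices per column
goodcls <- apply(mtxcomb, 2, function(idx)
  all(dist(cbind(x[idx], y[idx])) > 0.9))
good <- mtxcomb[, goodcls, drop = FALSE]

# Same idea starting from a distance matrix instead of coordinates.
# far_apart_subsets() is a hypothetical helper written for illustration.
far_apart_subsets <- function(d, k, thresh) {
  d <- as.matrix(d)
  combs <- combn(nrow(d), k)
  ok <- apply(combs, 2, function(idx) {
    sub <- d[idx, idx, drop = FALSE]
    all(sub[upper.tri(sub)] > thresh)   # only the pairwise distances within the subset
  })
  combs[, ok, drop = FALSE]
}

res <- far_apart_subsets(dist(mm), k = 3, thresh = 0.9)
```

Each column of good (or res) is one set of k point indices whose mutual distances all exceed 0.9. Note that combn() enumerates every k-subset, so this is only practical for small N and k.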
Alameda, CA, USA