[R] Distance calculation

arun smartpink111 at yahoo.com
Sat May 25 20:47:03 CEST 2013


Hi,

Try this:
mat1: 1st matrix
mat2: 2nd matrix
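
For a reproducible run, the two matrices can be rebuilt from the structure() calls in the quoted message below. The names mat1 and mat2 are the ones used in this reply; the assignments themselves are an assumption about how the objects were created.

```r
# First matrix from the original message (12 rows, 2 columns)
mat1 <- structure(c(
  0.492096635764151, 0.433332688044914, 0.521585941816778, 1.66472272302545,
  2.61878329527404, 2.19154489521664, 0.493876245329722, 0.4915787202584,
  0.889477365620806, 0.609135860199222, 0.739201878930367, 0.854663750519518,
  2.06195904001605, 1.41493262330451, 1.35748791897328, 1.19490680241894,
  0.702488756183322, 0.338258418490199, 0.123398398622741, 0.138548982660226,
  0.16170889185798, 0.414543218677095, 1.84629295875002, 2.24547399004563),
  .Dim = c(12L, 2L))

# Second matrix from the original message (12 rows, 2 columns)
mat2 <- structure(c(
  0.948228727226247, 1.38569091844218, 0.910510759802679, 1.25991218521949,
  0.993123416952421, 0.553640392997634, 0.357487763503204, 0.368328033777003,
  0.344255688489322, 0.423679560916755, 1.32093576037521, 3.13420679229785,
  0.766278117577654, 0.751997501086888, 0.836280758630117, 1.188156460303,
  1.56771616670373, 1.15928168139479, 0.522523036011874, 0.561678840701488,
  1.11155735914479, 1.26467106348848, 1.09378883406298, 1.17607018089421),
  .Dim = c(12L, 2L))

# Row of the maximum in each column
apply(mat1, 2, which.max)
apply(mat2, 2, which.max)
```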

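fun1 <- function(x) {
    # Flag values above 80% of the column maximum
    big <- x > 0.8 * max(x)
    n <- length(big)
    # First and last position of each run of consecutive flagged values
    startRunOfBigs <- which(c(big[1], !big[-n] & big[-1]))
    endRunOfBigs <- which(c(big[-n] & !big[-1], big[n]))
    # Position of the largest value within each run
    index <- vapply(seq_along(startRunOfBigs), function(i)
        which.max(x[startRunOfBigs[i]:endRunOfBigs[i]]) + startRunOfBigs[i] - 1L, 0L)
    # Rows 1 and n are neighbours on the cycle: if every peak found is at
    # row 1 or row n, keep only the largest one
    index <- ifelse(sum(is.na(match(index, c(1L, n)))) == 0 & x[index] != max(x[index]), NA, index)
    data.frame(Index = index[!is.na(index)], Value = x[index[!is.na(index)]])
}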

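fun2 <- function(mat) {
    # mat must have exactly two columns.
    # For each column, take the peak that lies closest to the last row
    vec1 <- sapply(seq_len(ncol(mat)), function(i) {
        x1 <- fun1(mat[, i])
        x1$Index[which.min(abs(x1$Index - nrow(mat)))]
    })
    # Number of shifts: the shorter way round the cycle between the two peaks
    gap <- abs(diff(vec1))
    indx <- if (gap > nrow(mat) / 2) nrow(mat) - gap else gap
    # Rotate column 2 down by i rows (wrapping around) and sum the absolute differences
    res1 <- vapply(seq_len(indx), function(i) {
        x3 <- mat[, 2]
        indx1 <- seq(length(x3) - i)
        indx2 <- c(setdiff(seq_along(x3), indx1), indx1)
        sum(abs(mat[, 1] - x3[indx2]))
    }, numeric(1))
    # Total distance over all shifts (0 if the peaks already coincide)
    sum(res1)
}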
     

fun2(mat1)
#[1] 40.93609
 fun2(mat2)
#[1] 12.79153


# Using the method that I sent yesterday (just to confirm the calculation for mat1; it won't work with mat2):

sum(sapply(which.max(rev(mat1[,2])):which.max(mat1[,1]), function(i) {
    x <- seq(nrow(mat1)) - i
    x[x <= 0] <- seq(max(x) + 1, nrow(mat1), 1)
    sum(abs(mat1[,1] - mat1[x,2]))
}))
#[1] 40.93609


A.K.





----- Original Message -----
From: eliza botto <eliza_botto at hotmail.com>
To: "r-help at r-project.org" <r-help at r-project.org>
Cc: 
Sent: Thursday, May 23, 2013 4:29 PM
Subject: [R] Distance calculation

Dear useRs,
I have the following data arranged in two columns:

structure(c(0.492096635764151, 0.433332688044914, 0.521585941816778, 1.66472272302545, 2.61878329527404, 2.19154489521664, 0.493876245329722, 0.4915787202584, 0.889477365620806, 0.609135860199222, 0.739201878930367, 0.854663750519518, 2.06195904001605, 1.41493262330451, 1.35748791897328, 1.19490680241894, 0.702488756183322, 0.338258418490199, 0.123398398622741, 0.138548982660226, 0.16170889185798, 0.414543218677095, 1.84629295875002, 2.24547399004563), .Dim = c(12L, 2L))

The distance is to be calculated by subtracting each value of one column from the corresponding value of the other column, in the following way:
=> The column values are cyclic: after row 12 comes row 1 again. So row 3 is closer to row 12 than to row 8.
=> The peak value is the maximum value of a column. Values that reach at least 80% of the maximum can also be considered peaks, provided they do not fall immediately next to each other.
=> If we plot column 1 and column 2, the peak of column 1 is at x=5 and the peak of column 2 is at x=12. For column 2, the value at x=1 is very close to the value at x=12 (within the 80% range), but it cannot be considered a peak because it falls immediately next to the maximum. Now the peaks are moved towards each other by the shortest possible route until the maximum values line up.
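
The peak rule above can be sketched as follows. This is only one possible reading of the rule, not the method used in the reply: the function name cyclic_peaks and the argument frac are hypothetical, and the comparison uses >= where the reply uses >. Runs of neighbouring candidates, including a run that wraps from row 12 to row 1, are reduced to their largest value.

```r
# Hypothetical helper: peaks of a cyclic series under the 80% rule
cyclic_peaks <- function(x, frac = 0.8) {
  n <- length(x)
  big <- x >= frac * max(x)              # candidate peaks
  if (all(big)) return(which.max(x))     # one run covering the whole cycle
  # Rotate so the series starts at a non-candidate; no run can then wrap around
  shift <- which(!big)[1] - 1L
  ord <- ((seq_len(n) - 1L + shift) %% n) + 1L   # original row numbers, rotated
  runs <- rle(big[ord])
  ends <- cumsum(runs$lengths)
  starts <- ends - runs$lengths + 1L
  keep <- which(runs$values)             # runs made of candidates
  vapply(keep, function(k) {
    idx <- ord[starts[k]:ends[k]]        # rows belonging to this run
    idx[which.max(x[idx])]               # row of the largest value in the run
  }, integer(1))
}

# Column 2 of the first matrix: rows 11, 12 and 1 form one run, peak at row 12
col2_first <- c(2.06195904001605, 1.41493262330451, 1.35748791897328,
                1.19490680241894, 0.702488756183322, 0.338258418490199,
                0.123398398622741, 0.138548982660226, 0.16170889185798,
                0.414543218677095, 1.84629295875002, 2.24547399004563)
# Column 2 of the second matrix: two separate peaks, rows 5 and 10
col2_second <- c(0.766278117577654, 0.751997501086888, 0.836280758630117,
                 1.188156460303, 1.56771616670373, 1.15928168139479,
                 0.522523036011874, 0.561678840701488, 1.11155735914479,
                 1.26467106348848, 1.09378883406298, 1.17607018089421)

cyclic_peaks(col2_first)   # 12
cyclic_peaks(col2_second)  # 5 10
```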
more precisely,
column 1:  1 2 3 4 5(max) 6 7 8 9 10 11 12
column 2:  1 2 3 4 5 6 7 8 9 10 11 12(max)

Now distance is measured in the following way

column 1:  1 2 3 4 5(max) 6 7 8 9 10 11 12
column 2:  12(max) 1 2 3 4 5 6 7 8 9 10 11
a> sum(abs(col1-col2))

column 1:  1 2 3 4 5(max) 6 7 8 9 10 11 12
column 2:  11 12(max) 1 2 3 4 5 6 7 8 9 10
b> sum(abs(col1-col2))

column 1:  1 2 3 4 5(max) 6 7 8 9 10 11 12
column 2:  10 11 12(max) 1 2 3 4 5 6 7 8 9
c> sum(abs(col1-col2))

column 1:  1 2 3 4 5(max) 6 7 8 9 10 11 12
column 2:  9 10 11 12(max) 1 2 3 4 5 6 7 8
d> sum(abs(col1-col2))

column 1:  1 2 3 4 5(max) 6 7 8 9 10 11 12
column 2:  8 9 10 11 12(max) 1 2 3 4 5 6 7
e> sum(abs(col1-col2))

total distance= a+b+c+d+e
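
The five steps a to e can be written out directly as a check. This is a minimal sketch for the first matrix only: the helper name rotate_right is hypothetical, and the number of shifts (5) is taken from the layout above rather than computed.

```r
# Columns of the first matrix, copied from the structure() call above
col1 <- c(0.492096635764151, 0.433332688044914, 0.521585941816778,
          1.66472272302545, 2.61878329527404, 2.19154489521664,
          0.493876245329722, 0.4915787202584, 0.889477365620806,
          0.609135860199222, 0.739201878930367, 0.854663750519518)
col2 <- c(2.06195904001605, 1.41493262330451, 1.35748791897328,
          1.19490680241894, 0.702488756183322, 0.338258418490199,
          0.123398398622741, 0.138548982660226, 0.16170889185798,
          0.414543218677095, 1.84629295875002, 2.24547399004563)

# Hypothetical helper: move every value k rows down, wrapping around the end
rotate_right <- function(x, k) {
  n <- length(x)
  x[((seq_len(n) - 1L - k) %% n) + 1L]
}

rotate_right(1:12, 1)   # 12 1 2 3 4 5 6 7 8 9 10 11, as in step a

# Steps a to e: one sum of absolute differences per shift
shift_sums <- vapply(1:5, function(k) sum(abs(col1 - rotate_right(col2, k))),
                     numeric(1))
total <- sum(shift_sums)
total   # about 40.93609, the value reported for fun2(mat1)
```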
For the following two columns it should work as follows:

structure(c(0.948228727226247, 1.38569091844218, 0.910510759802679, 1.25991218521949, 0.993123416952421, 0.553640392997634, 0.357487763503204, 0.368328033777003, 0.344255688489322, 0.423679560916755, 1.32093576037521, 3.13420679229785, 0.766278117577654, 0.751997501086888, 0.836280758630117, 1.188156460303, 1.56771616670373, 1.15928168139479, 0.522523036011874, 0.561678840701488, 1.11155735914479, 1.26467106348848, 1.09378883406298, 1.17607018089421), .Dim = c(12L, 2L))
column 1:  1 2 3 4 5 6 7 8 9 10 11 12(max)
column 2:  1 2 3 4 5(max) 6 7 8 9 10(max) 11 12

For column 2, the 10th value is the peak closer to the maximum of column 1, therefore the distance is measured in the following way

column 1:  1 2 3 4 5 6 7 8 9 10 11 12(max)
column 2:  12 1 2 3 4 5 6 7 8 9 10(max) 11
a> sum(abs(col1-col2))

column 1:  1 2 3 4 5 6 7 8 9 10 11 12(max)
column 2:  11 12 1 2 3 4 5 6 7 8 9 10(max)
b> sum(abs(col1-col2))
total distance=a+b
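
Choosing between the two peaks of column 2 can be sketched as below. The candidate rows (5 and 10) are taken from the layout above rather than detected, and the variable names are hypothetical. The sketch rotates column 2 downwards only, which is the right direction here because the chosen peak (row 10) lies before the peak of column 1 (row 12); the opposite case is not covered.

```r
# Columns of the second matrix, copied from the structure() call above
col1 <- c(0.948228727226247, 1.38569091844218, 0.910510759802679,
          1.25991218521949, 0.993123416952421, 0.553640392997634,
          0.357487763503204, 0.368328033777003, 0.344255688489322,
          0.423679560916755, 1.32093576037521, 3.13420679229785)
col2 <- c(0.766278117577654, 0.751997501086888, 0.836280758630117,
          1.188156460303, 1.56771616670373, 1.15928168139479,
          0.522523036011874, 0.561678840701488, 1.11155735914479,
          1.26467106348848, 1.09378883406298, 1.17607018089421)

n <- length(col1)
target <- which.max(col1)      # row 12
candidates <- c(5L, 10L)       # peaks of column 2 as marked in the layout

# Cyclic gap between each candidate and the target: the shorter way round
d <- abs(candidates - target)
gap <- pmin(d, n - d)          # 5 and 2
chosen <- candidates[which.min(gap)]   # row 10
shifts <- min(gap)                     # 2 shifts, i.e. steps a and b

# Sum of absolute differences after rotating column 2 down by k rows
shifted_sum <- function(k) {
  rotated <- col2[((seq_len(n) - 1L - k) %% n) + 1L]
  sum(abs(col1 - rotated))
}
total <- sum(vapply(seq_len(shifts), shifted_sum, numeric(1)))
total   # about 12.79153, the value reported for fun2(mat2)
```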
How can I do it?
Thank you very much in advance,
Elisa
                          
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



