[R] problems with subset (misunderstanding somewhere)

(Ted Harding) Ted.Harding at nessie.mcc.ac.uk
Tue Apr 5 15:29:22 CEST 2005


On 05-Apr-05 Wladimir Eremeev wrote:
> Dear r-help,
> 
> I have the following function defined:
> 
> cubic.distance<-function(x1,y1,z1,x2,y2,z2) {
>   max(c(abs(x1-x2),abs(y1-y2),abs(z1-z2)))
> }
> 
> I have a data frame from which I make subsets.
> 
> When I call
>   subset(dataframe,cubic.distance(tb19h,tb37v,tb19v,190,210,227)<=2)
> I have the result with 0 rows.
> 
> However, the data frame contains the row (among others, that suit)
> tb19v tb19h tb37v
> 226.6 189.3 208.4

Did you test the function cubic.distance? As written, I think it
will always return a single value, since max() returns the maximum
of *all* the values, not by rows (even if you use cbind() rather
than c()).
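
A minimal sketch of that collapse, using made-up values for three rows (only the first row is taken from your message; the other two are hypothetical):

```r
# Three rows of measurements; rows 2 and 3 are hypothetical illustration values
x1 <- c(189.3, 195.0, 190.5)   # tb19h
y1 <- c(208.4, 210.0, 211.0)   # tb37v
z1 <- c(226.6, 227.0, 240.0)   # tb19v

# Body of the original function: c() concatenates all nine absolute
# differences and max() reduces them to one number
d.all <- max(c(abs(x1 - 190), abs(y1 - 210), abs(z1 - 227)))

length(d.all)   # 1, not 3
d.all <= 2      # a single logical for the whole data set
```

Here the third row is far from the reference point, so the one value returned is large even though the first row is well within distance 2.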

If you redefine the function as

cubic.distance<-function(x1,y1,z1,x2,y2,z2) {
  apply(cbind(abs(x1-x2),abs(y1-y2),abs(z1-z2)),1,max)
}

I think you will find it does what you want (if I have
understood your problem correctly).
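
As a possible alternative (not tested on your data), base R's pmax() takes the element-wise maximum of several vectors and should give the same row-wise result without apply(). In the sketch below the name cubic.distance.pmax and the two-row sample values are my own, purely for illustration:

```r
# Row-wise version using apply(), as defined above
cubic.distance <- function(x1, y1, z1, x2, y2, z2) {
  apply(cbind(abs(x1 - x2), abs(y1 - y2), abs(z1 - z2)), 1, max)
}

# Hypothetical alternative: pmax() returns the parallel (element-wise) maximum
cubic.distance.pmax <- function(x1, y1, z1, x2, y2, z2) {
  pmax(abs(x1 - x2), abs(y1 - y2), abs(z1 - z2))
}

# Two sample rows; the second is a made-up value
tb19h <- c(189.3, 195.0)
tb37v <- c(208.4, 210.0)
tb19v <- c(226.6, 227.0)

d.apply <- cubic.distance(tb19h, tb37v, tb19v, 190, 210, 227)
d.pmax  <- cubic.distance.pmax(tb19h, tb37v, tb19v, 190, 210, 227)

all.equal(d.apply, d.pmax)   # both give one distance per row
```

Either form returns a vector with one element per row, which is what subset() needs for its condition.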

Example (with the function defined as above; the data are random,
so your output will differ):

 x<-cbind(rnorm(10,190,2),rnorm(10,210,2),rnorm(10,227,2))
 colnames(x)<-c("tb19h","tb37v","tb19v")
 x.df<-as.data.frame(x)
 (cubic.distance(x.df$tb19h,x.df$tb37v,x.df$tb19v,190,210,227)<=2)
#[1] FALSE FALSE  TRUE  TRUE FALSE FALSE FALSE FALSE  TRUE  TRUE
 subset(x.df,cubic.distance(tb19h,tb37v,tb19v,190,210,227)<=2)
#      tb19h    tb37v    tb19v
#3  189.3930 211.4345 226.3436
#4  189.4521 208.8493 228.0324
#9  188.2441 210.4914 226.4521
#10 191.4781 211.5234 226.1837

With your definition, you would have got the *single*
result FALSE, since there is at least one case where
the distance > 2, so the max > 2, so the subset criterion
evaluates to FALSE, so no rows are selected.
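
To make this concrete, here is a small sketch comparing the two definitions inside subset(). The data frame df, its second row, and the function names cubic.distance.orig and cubic.distance.rows are assumptions of mine for illustration; the first row is the one quoted in your message:

```r
# Row 1 is from the original question; row 2 is a hypothetical distant point
df <- data.frame(tb19v = c(226.6, 240.0),
                 tb19h = c(189.3, 189.0),
                 tb37v = c(208.4, 209.0))

# Original definition: one maximum over all rows
cubic.distance.orig <- function(x1, y1, z1, x2, y2, z2) {
  max(c(abs(x1 - x2), abs(y1 - y2), abs(z1 - z2)))
}

# Row-wise definition: one maximum per row
cubic.distance.rows <- function(x1, y1, z1, x2, y2, z2) {
  apply(cbind(abs(x1 - x2), abs(y1 - y2), abs(z1 - z2)), 1, max)
}

# A length-one FALSE is recycled over every row, so nothing is kept
res.orig <- subset(df, cubic.distance.orig(tb19h, tb37v, tb19v, 190, 210, 227) <= 2)

# A logical vector with one element per row keeps only the matching rows
res.rows <- subset(df, cubic.distance.rows(tb19h, tb37v, tb19v, 190, 210, 227) <= 2)

nrow(res.orig)
nrow(res.rows)
```

Note that the converse also holds: if every row happened to lie within distance 2, the original definition would return a single TRUE and subset() would keep all rows, which could hide the problem.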

Hoping this helps,
Ted.


--------------------------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at nessie.mcc.ac.uk>
Fax-to-email: +44 (0)870 094 0861
Date: 05-Apr-05                                       Time: 14:29:22
------------------------------ XFMail ------------------------------



