[R] Subsetting multiple rows of a data frame at once

William Dunlap wdunlap at tibco.com
Fri Jul 5 02:02:18 CEST 2013


> xt<- c(1.05, 2.85, 3.40, 4.25, 0.25, 3.05, 3.70, 0.20, 0.30, 0.70, 1.05, 1.20, 1.40, 1.90,
> 2.70, 3.25, 3.55, 4.60, 2.05, 2.15, 3.70, 4.85, 4.90, 1.60, 2.45, 3.20, 3.90, 4.45)
> 
> yt<- c(0.25, 0.10, 0.90, 0.25, 1.05, 1.70, 2.05, 2.90, 2.35, 2.60, 2.55, 2.15, 2.75, 2.05,
> 2.70, 2.25, 2.55, 2.05, 3.65, 3.05, 3.00, 3.50, 3.75, 4.85, 4.50, 4.50, 3.35, 4.90)
> carbon.fit = expand.grid(list(x=seq(0, 5, 0.01), y=seq(0, 5, 0.01)))
> trees<-do.call(rbind,lapply(seq_along(xt),function(i) subset(carbon.fit,x==xt[i]&y==yt[i])))
> 
> ## xt is 28 integers long and when i run the above code it only returns the values of 18
> ## out of the 28 (xt,yt) pairs that i want.

You are running into the problem that two computational methods that give
the same result when applied to real numbers often give different results when applied
to 64-bit floating point numbers.  (In your case you expect seq(0,5,.01) to contain, e.g.,
the floating point number generated by parsing the string "3.05", but seq() computes its
values arithmetically, so they can differ from the parsed values in the last bits.)  Hence
x==y is not always true when you expect it to be.  Here is where your 18 came from:
   R> table(xt %in% carbon.fit$x, yt %in% carbon.fit$y)
          
           FALSE TRUE
     FALSE     1    6
     TRUE      3   18
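
A small sketch of the same effect, independent of your data.  The names grid,
typed, exact and nearest are made up for illustration, and the grid is a coarse
one so it is easy to inspect.  It compares decimal literals against seq() output
and then shows how small the actual differences are:

```r
# seq() builds its elements by arithmetic, so they need not be
# bit-identical to the doubles obtained by parsing decimal literals.
grid  <- seq(0, 1, by = 0.1)
typed <- c(0.1, 0.3, 0.7)

# Exact matching: can report FALSE for values that "should" be in the grid.
exact <- typed %in% grid
print(exact)

# Distance from each typed value to its nearest grid point:
# either zero or a tiny rounding error, never a real difference.
nearest <- sapply(typed, function(v) grid[which.min(abs(grid - v))])
print(typed - nearest)
```

Any value for which exact is FALSE is still within rounding error of a grid
point, which is why rounding both sides makes the comparison behave.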
Round your numbers to 10 decimal places and you get
   R> table(round(xt,10) %in% round(carbon.fit$x,10), round(yt,10) %in% round(carbon.fit$y,10))
        
         TRUE
    TRUE   28
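
Rounding both sides is one way out.  Another, sketched here on the assumption
that every coordinate is meant to be a multiple of 0.01, is to compare integer
keys instead of doubles.  The helper key() and the names xt_demo, grid_x and idx
are hypothetical, not part of your code:

```r
# Assumption: all coordinates are intended to be multiples of 0.01,
# so each one maps to an integer number of hundredths.
key <- function(v) as.integer(round(v * 100))

grid_x  <- seq(0, 5, by = 0.01)
xt_demo <- c(1.05, 3.05, 0.70)   # a few of the x values from the question

# Integer keys compare exactly, so every demo value is found in the grid.
print(all(key(xt_demo) %in% key(grid_x)))

# Because the grid starts at 0, the key also gives the position directly.
idx <- key(xt_demo) + 1L
print(grid_x[idx])
```

This avoids choosing a number of digits to round to, at the cost of hard-coding
the grid spacing.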

By the way, you may prefer the merge() function to the do.call(rbind,lapply(...))
business.  I think the following call to merge will do about what you want (the row names differ;
if they are important, it is possible to recover them with some minor trickery):
    merge(data.frame(x=xt,y=yt), carbon.fit)
(You still want to round your numbers as before.)
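
Putting the rounding and the merge together might look like the sketch below.
It uses only three of the (x,y) pairs to keep the output short.  The names
xt_demo, yt_demo, grid, pts and found are illustrative, and the choice of 10
digits just follows the rounding used above:

```r
# A few (x, y) pairs from the question, for illustration only.
xt_demo <- c(1.05, 3.05, 0.70)
yt_demo <- c(0.25, 1.70, 2.60)

# Same grid as carbon.fit, with both key columns rounded.
grid <- expand.grid(x = seq(0, 5, by = 0.01), y = seq(0, 5, by = 0.01))
grid$x <- round(grid$x, 10)
grid$y <- round(grid$y, 10)

# Round the lookup points the same way, then let merge() match on x and y.
pts   <- data.frame(x = round(xt_demo, 10), y = round(yt_demo, 10))
found <- merge(pts, grid)

print(nrow(found))   # one row per requested pair
print(found)
```

Note that merge() sorts the result by the key columns by default, so the rows
need not come back in the order of xt_demo.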

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf
> Of arun
> Sent: Wednesday, July 03, 2013 10:15 PM
> To: Shaun ♥ Anika
> Cc: R help
> Subject: Re: [R] Subsetting multiple rows of a data frame at once
> 
> Hi,
> 
> carbon.fit = expand.grid(list(x=seq(0, 5, 0.01), y=seq(0, 5, 0.01)))
>  dim(carbon.fit)
> #[1] 251001      2
> 
> 
>  xtNew<-sprintf("%.2f",xt)
>  ytNew<- sprintf("%.2f",yt)
>  carbon.fit[]<- lapply(carbon.fit,function(x) sprintf("%.2f",x))
> res<-do.call(rbind,lapply(seq_along(xtNew),function(i)
> subset(carbon.fit,x==xtNew[i]&y==ytNew[i])))
>  nrow(res)
> #[1] 28
> res
> #          x    y
> #12631  1.05 0.25
> #5296   2.85 0.10
> #45431  3.40 0.90
> #12951  4.25 0.25
> #52631  0.25 1.05
> #85476  3.05 1.70
> #103076 3.70 2.05
> #145311 0.20 2.90
> #117766 0.30 2.35
> #130331 0.70 2.60
> #127861 1.05 2.55
> #107836 1.20 2.15
> #137916 1.40 2.75
> #102896 1.90 2.05
> #135541 2.70 2.70
> #113051 3.25 2.25
> #128111 3.55 2.55
> #103166 4.60 2.05
> #183071 2.05 3.65
> #153021 2.15 3.05
> #150671 3.70 3.00
> #175836 4.85 3.50
> #188366 4.90 3.75
> #243146 1.60 4.85
> #225696 2.45 4.50
> #225771 3.20 4.50
> #168226 3.90 3.35
> #245936 4.45 4.90
> A.K.
> 
> 
> ________________________________
> From: Shaun ♥ Anika <pro_patto at hotmail.com>
> To: "smartpink111 at yahoo.com" <smartpink111 at yahoo.com>
> Sent: Thursday, July 4, 2013 12:08 AM
> Subject: RE: Subsetting multiple rows of a data frame at once
> 
> 
> 
> 
> Hi There,
> i can give you the data needed to perform this task...
> 
> library(akima)
> library(fields)
> 
> xt<- c(1.05, 2.85, 3.40, 4.25, 0.25, 3.05, 3.70, 0.20, 0.30, 0.70, 1.05, 1.20, 1.40, 1.90,
> 2.70, 3.25, 3.55, 4.60, 2.05, 2.15, 3.70, 4.85, 4.90, 1.60, 2.45, 3.20, 3.90, 4.45)
> 
> yt<- c(0.25, 0.10, 0.90, 0.25, 1.05, 1.70, 2.05, 2.90, 2.35, 2.60, 2.55, 2.15, 2.75, 2.05,
> 2.70, 2.25, 2.55, 2.05, 3.65, 3.05, 3.00, 3.50, 3.75, 4.85, 4.50, 4.50, 3.35, 4.90)
> 
> xs<- c(0.45, 1.05, 2.75, 3.30, 4.95, 0.40, 1.05, 2.30, 3.45, 4.60, 0.05, 1.95, 2.95, 3.70,
> 4.55, 0.75, 1.60, 2.10, 3.60, 4.90, 0.05, 1.35, 2.60, 3.40, 4.25)
> 
> ys<- c(0.45, 0.95, 0.75, 0.95, 0.10, 1.90, 1.45, 1.25, 1.45, 1.05, 2.85, 2.60, 2.05, 2.60,
> 2.55, 3.75, 3.30, 3.95, 3.45, 3.70, 4.95, 4.35, 4.55, 4.40, 4.95)
> 
> carbon<- c(1.43, 1.82, 1.40, 1.43, 1.96, 1.61, 1.91, 1.53, 1.17, 1.83, 2.43, 2.02, 1.66,
> 2.45, 2.46, 1.39, 1.10, 1.38, 1.91, 2.13, 1.88, 1.26, 2.15, 1.89, 1.69)
> 
> carbon.df=data.frame(x=xs,y=ys,z=carbon)
> carbon.loess= loess(z~x*y, data= carbon.df, degree= 2)
> carbon.fit = expand.grid(list(x=seq(0, 5, 0.01), y=seq(0, 5, 0.01)))
> z=predict(carbon.loess, newdata= carbon.fit)
> carbon.fit$Height=as.numeric(z)
> image.plot(seq(0,5,0.01), seq(0,5,0.01), z, xlab = "", ylab="",main = "Carbon")
> 
> trees<-do.call(rbind,lapply(seq_along(xt),function(i) subset(carbon.fit,x==xt[i]&y==yt[i])))
> 
> ## xt is 28 integers long and when i run the above code it only returns the values of 18
> ## out of the 28 (xt,yt) pairs that i want.
> 
> thanks for your help!!
> 
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

