[R] indexing a subset dataframe

Jim Lemon jim at bitwrit.com.au
Mon Apr 16 11:37:10 CEST 2007


Mandy Barron wrote:
> Hello
> I am having problems indexing a subset dataframe, which was created
> as:
> 
>>waspsNoGV<-subset(wasps,site!="GV")
> 
> 
> Fitting a linear model revealed some data points which had high
> leverage, so I attempted to redo the regression without these data
> points:
> 
>>wasps.lm<-lm(r~Nt,data=waspsNoGV[-c(61,69,142),])
> 
> which resulted in a "subscript out of bounds" error.
> 
> I'm pretty sure the problem is that the data points identified in the
> regression as having high leverage were the row names carried over from
> the original dataframe which had 150 rows, but when I try to remove data
> point #142 from the subset dataframe this tries to reference by a
> numerical index but there are only 130 data points in the subset
> dataframe hence the "subscript out of bounds" message.  So I guess my
> question is how do I reference the data points to drop from the
> regression by name?
> 
Hi Mandy,
You're correct that the old row indices are no longer valid positions in 
the new dataframe. If you want to keep using the original indices (i.e. 
you can't easily identify the matching rows in the new dataframe), you 
can store them in a column and exclude on that:

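# record each retained row's position in the original dataframe
waspsNoGV$oldindices<-which(wasps$site != "GV")
# keep the rows whose original index is not flagged; note the trailing
# comma, which selects rows rather than columns
wasps.lm<-lm(r~Nt,
  data=waspsNoGV[!(waspsNoGV$oldindices %in% c(61,69,142)),])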
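
To reference the rows by name, as asked, something along these lines may also work. It is only a sketch: the toy data below is made up (only the names wasps, waspsNoGV, site, r and Nt come from the thread), and it assumes the original dataframe has the default row names "1", "2", ..., which subset() carries over unchanged.

```r
# Toy stand-in for the real data: hypothetical values, 150 rows as in the thread
set.seed(1)
wasps <- data.frame(site = rep(c("A", "GV", "B"), length.out = 150),
                    Nt = rnorm(150), r = rnorm(150))
waspsNoGV <- subset(wasps, site != "GV")

# Row names are character strings, so match on them rather than on position
drop <- c("61", "69", "142")
keep <- !(rownames(waspsNoGV) %in% drop)

# Refit without the flagged points
wasps.lm <- lm(r ~ Nt, data = waspsNoGV[keep, ])
```

Matching on rownames() avoids the extra column, but it only behaves as intended while the row names still correspond to the original row numbers.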

Jim


