[R] Variable passed to function not used in function in select=... in subset

Gavin Simpson gavin.simpson at ucl.ac.uk
Tue Nov 11 10:13:23 CET 2008


On Tue, 2008-11-11 at 09:49 +0100, Wacek Kusnierczyk wrote:
> Gabor Grothendieck wrote:
> >
> > Regarding the convenience it occurs in expressions like this:
> >
> >    iris2 <- subset(iris, select = - Species)
> >
> > to create a data frame without the Species column.
> >   
> 
> aha!  so what's your best guess about the result here:

I'm not sure I see too much of a problem here.

> 
> d = data.frame(a = 1)
> d$`-b` = 2
> names(d)
> # here we go
> 
> subset(d, select = -b)
> # to b or not to b?

But -b is not the name of the column; you explicitly named it `-b`, so
you should refer to it as such, backticks included. If you use
"non-standard" names then expect to do a bit more work.

> subset(d, select = `-b`)
  -b
1  2
> subset(d, select = - `-b`)
  a
1 1

> 
> b = "a"
> subset(d, select = -b)
> # tragedy

For this, my interpretation is that subset() does not find a column
named b, so it falls back to your variable b and tries to evaluate:

> b = "a"
> `-`(b)
Error in -b : invalid argument to unary operator

`-` is a function, remember.

If you want this to work, you can use get():

> subset(d, select = - get(b))
  -b
1  2
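
If the column name really is held in a variable, ordinary indexing
sidesteps the evaluation rules of select altogether. A minimal sketch,
reusing d and b from above; res1 and res2 are just names I picked for
the illustration:

```r
d <- data.frame(a = 1)
d$`-b` <- 2
b <- "a"

# Drop the column whose name is stored in b, keeping a data frame.
res1 <- d[, setdiff(names(d), b), drop = FALSE]

# The same thing with a logical mask over the column names.
res2 <- d[, names(d) != b, drop = FALSE]

print(names(res1))
print(names(res2))
```

Both forms treat b as a plain character string, so nothing depends on
whether a column called b happens to exist in d.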

> 
> d$b = 3
> subset(d, select = -b)
> # catharsis
> 
> (for whatever reason a user may choose to have a column named '-b')

Yes, but the user is warned about not using standard naming conventions
in the "An Introduction to R" manual. You aren't stopped from using
names like `-b`, but if you use them, you have to expect to work a
little harder.
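
As a small illustration of what "non-standard" means in practice:
data.frame() by default passes column names through make.names(), so a
column literally called `-b` only arises if you go around that check,
as with d$`-b` above or with check.names = FALSE. The object names
valid, d1 and d2 below are mine, chosen for the sketch:

```r
# make.names() converts an invalid name into a syntactically valid one.
valid <- make.names("-b")
print(valid)

# data.frame() applies the same repair by default ...
d1 <- data.frame(`-b` = 2)
print(names(d1))

# ... unless the check is switched off explicitly.
d2 <- data.frame(`-b` = 2, check.names = FALSE)
print(names(d2))
```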

Reading ?subset we have:

  select: expression, indicating columns to select from a data frame.

....

     For data frames, the 'subset' argument works on the rows.  Note
     that 'subset' will be evaluated in the data frame, so columns can
     be referred to (by name) as variables in the expression (see the
     examples).

which I think is reasonably explicit, is it not? It explains why your
second example fails and why '- get(b)' doesn't, and also why your other
examples don't give you what you want. You aren't using the appropriate
'name'.
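
To make that concrete, here is a rough sketch of how I understand the
select expression to be evaluated: column names stand for column
positions, and any symbol not found among them is looked up in the
calling environment. It imitates the idea rather than reproducing the
source of subset(); nl, idx_keep, idx_drop and err are names chosen
for the illustration.

```r
d <- data.frame(a = 1)
d$`-b` <- 2

# Map each column name to its position: a -> 1, `-b` -> 2.
nl <- as.list(seq_along(d))
names(nl) <- names(d)

# With backticks, `-b` is a single name and is found among the columns.
idx_keep <- eval(quote(`-b`), nl)

# A leading minus negates that position, so the column is dropped.
idx_drop <- eval(quote(-`-b`), nl)
print(names(d[, idx_drop, drop = FALSE]))

# Without backticks, -b is unary minus applied to the symbol b. There is
# no column b, so the variable b from the calling environment is used.
b <- "a"
err <- tryCatch(eval(quote(-b), nl),
                error = function(e) conditionMessage(e))
print(err)

# Once a column b exists, it takes precedence over the variable b.
d$b <- 3
nl <- as.list(seq_along(d))
names(nl) <- names(d)
print(names(d[, eval(quote(-b), nl), drop = FALSE]))
```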

I'm sure we could all find aspects of R that don't work in exactly the
way we might expect or think of as intuitive. But if it works as
documented then I don't see what the problem is, unless i) you are
offering to rewrite the code to make it "work better", ii) R Core
agrees that your proposal "works better", and iii) the change doesn't
break most of the R code out there, in R itself or in add-on packages.

G

> 
> 
> vQ
> 
-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Dr. Gavin Simpson             [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
 Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%


