[R] FW: subset using noncontiguous variables by name (not index)

Muenchen, Robert A (Bob) muenchen at utk.edu
Mon Aug 27 17:54:44 CEST 2007


Thomas, that's a good point. I was thinking of anscombe[x1::y1] making
it clear which data frame is meant, but you would then want just x1::y1
to have an unambiguous meaning on its own, which is impossible.
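
For comparison, here is a minimal sketch of how subset() already handles
this, using the built-in anscombe data (columns x1..x4, then y1..y4). The
range x1:y1 only has a meaning because subset() evaluates it against the
data frame it is given. The variable names by_position and standalone are
just for illustration.

```r
# Inside subset(), select = x1:y1 is resolved against anscombe's column
# order, so it picks every column from x1 through y1 by position.
by_position <- subset(anscombe, select = x1:y1)
names(by_position)

# On its own, x1:y1 is an ordinary expression: it fails unless objects
# named x1 and y1 happen to exist in the calling environment.
standalone <- try(x1:y1, silent = TRUE)
inherits(standalone, "try-error")
```

So the data frame has to come from somewhere, either as an argument or
from the surrounding call.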

As for x1:xN, it's unambiguous on its own. I thought one of the great
advantages of R was that it could use different methods, so that a new
operator would not be needed. The colon operator would just have a new
method for when its operands have the form stringN (a name ending in a
number), one that would be very useful and have an obvious meaning.
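
A rough sketch of the numerical-order idea, written as an ordinary
function rather than a new method for the colon operator. The helper
name_seq() is hypothetical (it is not part of base R), and it assumes both
names share the same prefix and end in digits.

```r
# Hypothetical helper: expand "x1", "x4" into c("x1", "x2", "x3", "x4").
name_seq <- function(from, to) {
  # Pull the trailing digits off each name.
  num_from <- regmatches(from, regexpr("[0-9]+$", from))
  num_to   <- regmatches(to,   regexpr("[0-9]+$", to))
  # Everything before the trailing digits is the shared prefix.
  prefix_from <- sub("[0-9]+$", "", from)
  prefix_to   <- sub("[0-9]+$", "", to)
  stopifnot(length(num_from) == 1, length(num_to) == 1,
            identical(prefix_from, prefix_to))
  paste0(prefix_from, seq(as.integer(num_from), as.integer(num_to)))
}

# The result is a plain character vector, so it needs no data frame
# until it is actually used for indexing.
x_vars <- anscombe[name_seq("x1", "x4")]
names(x_vars)
```

Because the names are generated from the numbers alone, this works no
matter how the columns are ordered in the data frame.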

Thanks,
Bob

> -----Original Message-----
> From: Thomas Lumley [mailto:tlumley at u.washington.edu]
> Sent: Monday, August 27, 2007 10:25 AM
> To: Muenchen, Robert A (Bob)
> Cc: r-help at stat.math.ethz.ch
> Subject: Re: [R] subset using noncontiguous variables by name (not index)
> 
> On Mon, 27 Aug 2007, Muenchen, Robert A (Bob) wrote:
> 
> > Gabor, That works great!
> >
> > I think this would be a very helpful addition to the main R
> > distribution. Perhaps with a single colon representing numerical order
> > (exactly as you have written it) and two colons representing the order
> > of the variables as they appear in the data frame (your first example).
> > That's analogous to SAS' x1-xN, which you know gets those N variables,
> > and a--z, which selects an unknown number of variables a through z. How
> > many that is depends upon their order in the data frame. That would not
> > only be very useful in general, but it would also make transitioning to
> > R from SAS or SPSS less confusing.
> >
> > Is R still being extended in such basic ways, or does that muck up
> > existing programs too much?
> >
> 
> In principle base R can be extended like that, but a strong case is
> needed for non-standard evaluation rules and for depleting the
> restricted supply of short binary operator names.
> 
> The reason for subset() and its behaviour is that 'variables as they
> appear in the data frame' is typically ambiguous -- which data frame?
> In SPSS you have only one and in SAS there is a default one, so there is
> no ambiguity in X1--Y2, but in R it needs another argument specifying
> the data frame, so it can't really be a binary operator.
> 
> The double colon :: and triple colon ::: are already used for
> namespaces, and a search of r-help reveals two previous, different,
> suggestions for %:%.
> 
> 
>  	-thomas
> 
> Thomas Lumley			Assoc. Professor, Biostatistics
> tlumley at u.washington.edu	University of Washington, Seattle
