[R] Base within reverses column order

Duncan Murdoch murdoch.duncan at gmail.com
Mon Apr 7 20:33:08 CEST 2014


On 05/04/2014 1:10 PM, Dan Murphy wrote:
> Thanks, Duncan. Using names is certainly the most reliable solution,
> but requires remembering to modify the "surrounding code" when
> enhancing what's within -- a bug risk source when passing on the code.
> Are you saying that an automatic reversal as follows may break in the
> future because R-devel may change the current behavior of 'within'?

I'm saying that there might already be cases where that code breaks.

> (If so, then that's the greater source of bug risk .. so back to
> 'names'.)
>
> ln <- length(foo)
> foo <- within(foo, {
>    bar <- whatever
>    other()
>    })
> # to reverse within's assumed backwards column order
> foo <- foo[c(1:ln, length(foo):(ln+1))]
>
> Is there a way to capture the names of new objects created within?

Sure, something like this should work:

oldvars <- names(foo)
foo <- within(foo, {...})
newvars <- setdiff(names(foo), oldvars)

There might also be changes to the oldvars; within doesn't just add 
columns, it can modify existing ones.
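
As a minimal sketch of the idea above, here is a complete version that can be run as is. The toy data frame and the column names a, b, bar and baz are made up for illustration, and sorting the new names alphabetically is just one arbitrary way of fixing their order; an explicit vector of names would do equally well.

```r
# Toy data frame; all names here are hypothetical.
foo <- data.frame(a = 1:3, b = c(2, 4, 6))

oldvars <- names(foo)
foo <- within(foo, {
  bar <- a + b     # new column
  baz <- bar * 2   # new column
  b   <- b / 2     # modifies an existing column, so b is not in newvars
})
newvars <- setdiff(names(foo), oldvars)

# Original columns first, then the new ones in a chosen order
# (alphabetical here), regardless of how within() arranged them.
foo <- foo[c(oldvars, sort(newvars))]
names(foo)
```

Because the final subsetting is done by name, it does not depend on the order in which within() happens to return the new columns.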

Duncan Murdoch

>
>
> On Fri, Apr 4, 2014 at 10:55 AM, Duncan Murdoch
> <murdoch.duncan at gmail.com> wrote:
> > On 04/04/2014 1:32 PM, Dan Murphy wrote:
> >>
> >> I just noticed this annoyance, but I'm not the first one, apparently
> >> -- see
> >> http://lists.r-forge.r-project.org/pipermail/datatable-help/2012-May/001176.html
> >>
> >> The thread never answered the OP's question "Is this a bug?" so I
> >> assume the answer, unfortunately, is No.
> >>
> >> If not a bug, do users of within have a workaround to produce a result
> >> with columns as ordered within 'within'? I can think of a way using
> >> names and subset-with-select, but that seems unduly kludgy.
> >
> >
> > I wouldn't be surprised if it is not consistent about that.  It uses as.list
> > to convert an environment to a list, and that's where the reversal occurs:
> > but since environments are unordered collections of objects, you just happen
> > to be seeing an undocumented and unpromised property of the internal
> > implementation.
> >
> > If the order matters to you, then create your initial dataframe with the new
> > variables (set to NA, for example), or reorder it afterwards.  But generally
> > speaking even in a dataframe (which is an ordered collection of objects),
> > it's better to program in a way that doesn't make assumptions about the
> > order.  Columns have names, and you should use those.
> >
> > Duncan Murdoch
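
The suggestion quoted above, creating the new variables in the initial data frame (set to NA) so that their positions are fixed before within() runs, might look roughly like this. It is only a sketch: the data frame and the column names a, b, bar and baz are hypothetical.

```r
# Toy data frame; all names here are hypothetical.
foo <- data.frame(a = 1:3, b = c(2, 4, 6))

# Reserve the new columns up front, in the order wanted.
foo$bar <- NA_real_
foo$baz <- NA_real_

foo <- within(foo, {
  baz <- a * 2   # assigned first inside within() ...
  bar <- a + b   # ... but the column positions were fixed above
})
names(foo)
```

Since bar and baz already exist when within() is called, they are overwritten in place rather than appended, so the order of assignment inside the braces should not affect the column order.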



