[R] why data.frame, mutate package and not lists

jeremiah rounds roundsjeremiah at gmail.com
Wed Sep 14 20:40:08 CEST 2016


"If you want to add variable to data.frame you have to use attach, detach.
Right?"

Not quite.  You can add a variable to a data.frame the same way you add an
element to a list.

e.g.
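df = list()
df$var1 = 1:10
df = as.data.frame(df)   # a list of equal-length vectors converts directly
df$var2 = 1:10           # add a column with $, as for a list element
df[["var3"]] = 1:10      # or with [[ ]]
df                       # print: 10 rows, columns var1..var3
df = as.list(df)         # and back to a plain list
df$var4 = 1:10
as.data.frame(df)        # now has columns var1..var4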
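
If you prefer something closer to the attach/detach feel, base R also has
transform(), within() and cbind().  A rough sketch follows; the data and the
names scores, x, y, z and w are made up for illustration and are not from
your code.

```r
# Hypothetical example data; the column names are illustrative only.
scores <- data.frame(x = 1:5)

# transform() evaluates the expression with the existing columns in scope
scores <- transform(scores, y = x * 2)

# within() does the same using ordinary assignment syntax
scores <- within(scores, z <- x + y)

# cbind() appends a column that was computed elsewhere
scores <- cbind(scores, w = letters[1:5])

str(scores)
```

None of these touch the search path, so there is nothing to detach
afterwards.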

Ironically, the primary reason to use a data.frame, in my head, is to signal
that you are thinking of your data as row-oriented tabular storage.
"Ironic" because, technically, that is not a requirement for being a
data.frame, but when I reflect on how a seasoned R programmer typically
approaches lists and data.frames, that is basically what they are
communicating.
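
To make the "technical detail" concrete, here is a small sketch showing that
a data.frame is stored as a list of columns.  The object tbl and its columns
are invented for the example.

```r
# Hypothetical two-column table
tbl <- data.frame(id = 1:3, label = c("a", "b", "c"))

is.list(tbl)    # a data.frame passes the list test
typeof(tbl)     # the underlying storage type
class(tbl)      # the class attribute is what makes it a data.frame
length(tbl)     # length counts columns (list elements), not rows

# List tools work column by column
col_classes <- sapply(tbl, class)
col_classes
```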

I was going to post that a reason to use data.frames is to take advantage
of the optimizations and syntactic sugar written for them, but in reality,
if code does not assume a row-oriented data structure in a data.frame, there
is not much optimization I can think of.  For example, we could point to
"subset" and say that is a reason to use data.frames and not lists, but
that only works if you use the data.frame in the conventional way.
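
A quick sketch of the subset point, assuming a made-up table called people:
on a data.frame, subset() filters rows in one call, while on the equivalent
plain list you have to apply the row filter to every element yourself.

```r
# Hypothetical data; stringsAsFactors = FALSE keeps names as character
# on older R versions too.
people <- data.frame(name = c("Ann", "Bob", "Cy"),
                     age  = c(31, 25, 47),
                     stringsAsFactors = FALSE)

# Row-oriented use: one call, columns referenced by name
adults <- subset(people, age > 30, select = name)

# Same data as a plain list: the filter is applied by hand to each element
people_list <- as.list(people)
keep <- people_list$age > 30
adults_list <- lapply(people_list, function(col) col[keep])
```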

In the end, my advice to you is: if it is a table, make it a data.frame, and
if it is not easily thought of as a table or row-oriented data structure,
keep it as a list.
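
As a rough illustration of that rule of thumb (readings and run are
hypothetical objects, not anything from your data):

```r
# Tabular: every field has one value per record, so a data.frame fits
readings <- data.frame(sensor = c("a", "b"), value = c(0.5, 1.2))

# Not tabular: elements differ in length and meaning, so a list fits
run <- list(id     = "run-01",
            params = c(alpha = 0.1, beta = 2),
            notes  = c("first", "retry", "ok"))

# Forcing the list into a data.frame fails because the lengths
# (1, 2 and 3) cannot be lined up as rows; the error is caught here.
forced <- tryCatch(as.data.frame(run), error = function(e) NULL)
is.null(forced)
```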

Thanks,
Jeremiah





On Wed, Sep 14, 2016 at 11:15 AM, Alaios via R-help <r-help at r-project.org>
wrote:

> thanks for all the answers. I think also ggplot2 requires data.frames. If
> you want to add variable to data.frame you have to use attach, detach.
> Right? Any more links that discuss those two different approaches? Alex
>
>     On Wednesday, September 14, 2016 5:34 PM, Bert Gunter <
> bgunter.4567 at gmail.com> wrote:
>
>
>  This is partially a matter of subjective opinion, and so pointless; but
> I would point out that data frames are the canonical structure for a
> great many of R's modeling and graphics functions, e.g. lm, xyplot,
> etc.
>
> As for mutate() etc., that's about UI's and user friendliness, and
> imho my ho is meaningless.
>
> Best,
> Bert
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along
> and sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>
>
> On Wed, Sep 14, 2016 at 6:01 AM, Alaios via R-help <r-help at r-project.org>
> wrote:
> > Hi all, I have seen data.frames and operations from the mutate package
> > getting really popular. In the last years I have been using lists
> > extensively. Is there any reason to not use lists and use other data
> > types for data manipulation and storage?
> > Any article that describes their differences? I would like to thank you
> > for your reply. Regards, Alex
> >        [[alternative HTML version deleted]]
> >
> > ______________________________________________
> > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>