[R] why data.frame, mutate package and not lists

jeremiah rounds roundsjeremiah at gmail.com
Wed Sep 14 20:41:29 CEST 2016


There is also this syntax for adding variables:
df[, "var5"] = 1:10

and the syntax sugar for row-oriented storage:
df[1:5,]
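
A minimal sketch pulling the above together, assuming a small made-up
data.frame called df with ten rows (the column names are just the ones used
in this thread, and head_rows, big and lst are names I picked for
illustration). It shows three ways to add a column without attach/detach,
row-oriented indexing, and subset(), next to the same data held as a plain
list:

```r
# Hypothetical example data; df and var1..var5 follow the names in the thread.
df <- data.frame(var1 = 1:10)

# Three equivalent ways to add a column; no attach()/detach() needed.
df$var2 <- 1:10
df[["var3"]] <- 1:10
df[, "var5"] <- 1:10

# Row-oriented access: the first five rows, all columns.
head_rows <- df[1:5, ]

# subset() filters rows by a condition and can also pick columns.
big <- subset(df, var1 > 7, select = c(var1, var5))

# A plain list takes new elements the same way, but has no row dimension,
# so the df[1:5, ] style of indexing is not available on it.
lst <- as.list(df)
lst$var4 <- 1:10
```

The row indexing and subset() calls are the parts that only make sense when
the object is treated as a table, which is the distinction made in the
quoted message below.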

On Wed, Sep 14, 2016 at 11:40 AM, jeremiah rounds <roundsjeremiah at gmail.com>
wrote:

> "If you want to add variable to data.frame you have to use attach, detach.
> Right?"
>
> Not quite.  Treat the data.frame like a list to add a variable to it.
>
> e.g.
> df = list()
> df$var1 = 1:10
> df = as.data.frame(df)
> df$var2 = 1:10
> df[["var3"]] = 1:10
> df
> df = as.list(df)
> df$var4 = 1:10
> as.data.frame(df)
>
> Ironically, the primary reason to use a data.frame, in my head, is to signal
> that you are thinking of your data as row-oriented tabular storage.
> "Ironic" because technically that is not a requirement for being a
> data.frame, but when I reflect on the typical way a seasoned R programmer
> approaches lists and data.frames, that is basically what they are
> communicating.
>
> I was going to post that a reason to use data.frames is to take advantage
> of optimizations and syntax sugar for data.frames, but in reality, if code
> does not assume a row-oriented data structure in a data.frame, there is not
> much I can think of that exists in the way of optimization.  For example,
> we could point to "subset" and say that is a reason to use data.frames and
> not lists, but that only works if you use the data.frame in a conventional
> way.
>
> In the end, my advice to you is: if it is a table, make it a data.frame, and
> if it is not easily thought of as a table or row-oriented data structure,
> keep it as a list.
>
> Thanks,
> Jeremiah
>
>
>
>
>
> On Wed, Sep 14, 2016 at 11:15 AM, Alaios via R-help <r-help at r-project.org>
> wrote:
>
>> Thanks for all the answers. I think ggplot2 also requires data.frames. If
>> you want to add variable to data.frame you have to use attach, detach.
>> Right? Any more links that discuss those two different approaches? Alex
>>
>>     On Wednesday, September 14, 2016 5:34 PM, Bert Gunter <
>> bgunter.4567 at gmail.com> wrote:
>>
>>
>>  This is partially a matter of subjective opinion, and so pointless; but
>> I would point out that data frames are the canonical structure for a
>> great many of R's modeling and graphics functions, e.g. lm, xyplot,
>> etc.
>>
>> As for mutate() etc., that's about UI's and user friendliness, and
>> imho my ho is meaningless.
>>
>> Best,
>> Bert
>> Bert Gunter
>>
>> "The trouble with having an open mind is that people keep coming along
>> and sticking things into it."
>> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>>
>>
>> On Wed, Sep 14, 2016 at 6:01 AM, Alaios via R-help <r-help at r-project.org>
>> wrote:
>> > Hi all, I have seen data.frames and operations from the mutate package
>> getting really popular. In recent years I have been using lists
>> extensively. Is there any reason not to use lists and to use other data
>> types for data manipulation and storage?
>> > Any article that describes their differences? I would like to thank you
>> for your reply. Regards, Alex
>> >        [[alternative HTML version deleted]]
>> >
>> > ______________________________________________
>> > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> > https://stat.ethz.ch/mailman/listinfo/r-help
>> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> > and provide commented, minimal, self-contained, reproducible code.
>>
>>
>
>
