[R] is this a bug?

William Dunlap wdunlap at tibco.com
Sat Jun 18 17:48:48 CEST 2011


> -----Original Message-----
> From: r-help-bounces at r-project.org 
> [mailto:r-help-bounces at r-project.org] On Behalf Of Albert-Jan Roskam
> Sent: Saturday, June 18, 2011 2:44 AM
> To: Brian Diggs; R-help at r-project.org
> Subject: Re: [R] is this a bug?
> 
> Thanks a lot to all who responded. This is a little less confusing now,
> although it's hard for me to fathom the (practical) use of a dataframe
> within a dataframe. If one mixes different notations, or, put in a
> different way, different underlying classes (data.frame vs. numeric),
> these rather unintuitive results appear.
> So I'll use any of these:
> df$pct <- df$weight / ave(df$weight, df$sex, FUN=sum)*100
> df["pct"] <- df["weight"] / ave(df["weight"], df["sex"], FUN=sum)*100
> 
> using str() is very insightful, as is using class()
> 
> I'd prefer it if R simply generated an error when one attempts to nest a
> data.frame within a data.frame.

model.frame(), a workhorse function called from lm() and many
other modelling functions, commonly puts matrices inside of its
data.frame output.  This helps keep together related columns of contrasts of
factor variables or things like the output of poly() or Surv().
Offhand, I cannot think of functions used in formulae that produce
data.frames, but once you allow matrices you may as well allow any
potentially multicolumned object in a data.frame.
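
A minimal sketch of the matrix-in-a-data.frame case, assuming a toy
data set (the names d, x, y and mf are made up for this illustration):

```r
# Toy data; the names d, x and y are hypothetical.
d <- data.frame(x = 1:10, y = (1:10)^2)

# model.frame() evaluates each term of the formula and stores the result
# as one component of a data.frame, even when that result is a matrix.
mf <- model.frame(y ~ poly(x, 2), data = d)

names(mf)                        # "y" and "poly(x, 2)"
is.matrix(mf[["poly(x, 2)"]])    # TRUE: poly() returned a 2-column matrix
dim(mf[["poly(x, 2)"]])          # 10 rows, 2 columns
ncol(mf)                         # 2 components, although 3 numeric columns
```

The two polynomial columns travel together as a single component, which
is the convenience described above.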

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com  

> 
> Thanks again!
> 
>  Cheers!!
> Albert-Jan
> 
> 
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> All right, but apart from the sanitation, the medicine, 
> education, wine, public 
> order, irrigation, roads, a fresh water system, and public 
> health, what have the 
> Romans ever done for us?
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> 
> 
> 
> 
> ________________________________
> From: Brian Diggs <diggsb at ohsu.edu>
> To: R-help at r-project.org
> Sent: Fri, June 17, 2011 11:58:44 PM
> Subject: Re: [R] is this a bug?
> 
> On 6/17/2011 2:24 PM, (Ted Harding) wrote:
> > And the extra twist in the tale is exemplified by this
> > mini-version of Albert-Jan's first example:
> >
> >    DF<- data.frame(A=c(1,2,3))
> >    DF$B<- c(4,5,6)
> >    DF$C<- c(7,8,9)
> >    DF
> >    #   A B C
> >    # 1 1 4 7
> >    # 2 2 5 8
> >    # 3 3 6 9
> >
> >    DF$D<- DF["A"]/DF["B"]
> >    DF
> >    #   A B C    A
> >    # 1 1 4 7 0.25
> >    # 2 2 5 8 0.40
> >    # 3 3 6 9 0.50
> >
> > ##And why:
> >
> >    DF["A"]/DF["B"]
> >    #      A
> >    # 1 0.25
> >    # 2 0.40
> >    # 3 0.50
> >
> > ##So the ratio DF["A"]/DF["B"] comes out with the name of
> > ##the numerator, "A". This is then the name given to DF$D
> 
> It's even slightly weirder than that:
> 
> str(DF)
> #'data.frame':   3 obs. of  4 variables:
> # $ A: num  1 2 3
> # $ B: num  4 5 6
> # $ C: num  7 8 9
> # $ D:'data.frame':      3 obs. of  1 variable:
> #  ..$ A: num  0.25 0.4 0.5
> 
> There is a column D in DF which is itself a data frame with a single
> column whose name is A (because of what Ted said).  When formatted for
> printing out, the column name of the inner data frame is used (as a
> result of how data.frame() itself handles named arguments when the
> argument is itself a data.frame: "If a list or data frame or matrix is
> passed to data.frame it is as if each component or column had been
> passed as a separate argument...").
> 
> So not a bug, but a convoluted set of circumstances that can happen when
> non-atomic vectors are assigned to columns of a data.frame.  That's one
> of those /you shouldn't do that even though it is technically legal or
> at least you shouldn't be surprised when things don't work the way you
> thought they would/ things.
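
The nesting described above, and one way around it, can be sketched as
follows. This reuses Ted's small DF; the column E and the repair step at
the end are additions made for illustration only:

```r
DF <- data.frame(A = c(1, 2, 3), B = c(4, 5, 6))

# Single brackets keep the data.frame wrapper, so D becomes a
# one-column data.frame nested inside DF.
DF$D <- DF["A"] / DF["B"]
class(DF$D)     # "data.frame"
names(DF$D)     # "A", inherited from the numerator

# Double brackets (or $) extract plain numeric vectors instead.
DF$E <- DF[["A"]] / DF[["B"]]
class(DF$E)     # "numeric"

# One possible repair after the fact: pull the single inner column out.
DF$D <- DF$D[[1]]
class(DF$D)     # "numeric"
```

Checking class() or str() right after the assignment is usually enough
to catch the nested case early.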
> 
> > Thus Albert-Jan's
> >    df["weight"] / ave(df["weight"], df["sex"], FUN=sum)*100
> > comes through with name "weight".
> >
> > Ted.
> >
> >
> > On 17-Jun-11 21:06:42, William Dunlap wrote:
> >> df$varname is a column of df.
> >>
> >> df["varname"] is a one-column df containing that column.
> >>
> >> df[["varname"]] is a column of df (same as df$varname).
> >>
> >> df[,"varname"] is a column of df (same as df$varname).
> >>
> >> df[,"varname",drop=FALSE] is a one-column df (same as df["varname"]).
> >>
> >> df$newVarname<- df["varname"] inserts a new component
> >> into df, the component being a one-column data.frame,
> >> not the column in that data.frame.
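
The forms listed above can be checked with class(). A small sketch,
assuming a made-up one-column data.frame (the values are hypothetical):

```r
# Hypothetical data; only the column name follows the list above.
df <- data.frame(varname = c(10, 20, 30))

class(df$varname)                      # "numeric"
class(df["varname"])                   # "data.frame"
class(df[["varname"]])                 # "numeric"
class(df[, "varname"])                 # "numeric"
class(df[, "varname", drop = FALSE])   # "data.frame"

# Assigning the one-column data.frame nests it rather than copying the column.
df$newVarname <- df["varname"]
class(df$newVarname)                   # "data.frame"
```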
> >>
> >> Bill Dunlap
> >> Spotfire, TIBCO Software
> >> wdunlap tibco.com
> >>
> >>> -----Original Message-----
> >>> From: r-help-bounces at r-project.org
> >>> [mailto:r-help-bounces at r-project.org] On Behalf Of Albert-Jan Roskam
> >>> Sent: Friday, June 17, 2011 1:49 PM
> >>> To: R Mailing List
> >>> Subject: [R] is this a bug?
> >>>
> >>> Hello,
> >>>
> >>> Is the following a bug? I always thought that df$varname<-
> >>> does the same as
> >>> df["varname"]<-
> >>>
> >>>> df<- data.frame(weight=round(runif(10, 10, 100)),
> >>> sex=round(runif(100, 0,
> >>> 1)))
> >>>> df$pct<- df["weight"] / ave(df["weight"], df["sex"], FUN=sum)*100
> >>>> names(df)
> >>> [1] "weight" "sex"    "pct"     ### ---------->  ok
> >>>> head(df)
> [[elided Yahoo spam]]
> >>> 1     86   0 2.4002233
> >>> 2     19   1 0.5643006
> >>> 3     32   0 0.8931063
> >>> 4     87   0 2.4281328
> >>> 5     45   0 1.2559308
> >>> 6     95   0 2.6514094
> >>>> rm(df)
> >>>> df<- data.frame(weight=round(runif(10, 10, 100)),
> >>> sex=round(runif(100, 0,
> >>> 1)))
> >>>> df["pct"]<- df["weight"] / ave(df["weight"], df["sex"],
> >>> FUN=sum)*100 ### ----->  this does work
> >>>> names(df)
> >>> [1] "weight" "sex"    "pct"
> >>>> head(df)
> >>>    weight sex       pct
> >>> 1     15   0 0.5246590
> >>> 2     43   0 1.5040224
> >>> 3     17   1 0.9284544
> >>> 4     44   1 2.4030584
> >>> 5     76   1 4.1507373
> >>> 6     59   0 2.0636586
> >>>> do.call(c, R.Version())
> >>>                         platform                            arch
> >>>              "i686-pc-linux-gnu"                          "i686"
> >>>                               os                          system
> >>>                      "linux-gnu"               "i686, linux-gnu"
> >>>                           status                           major
> >>>                               ""                             "2"
> >>>                            minor                            year
> >>>                           "11.1"                          "2010"
> >>>                            month                             day
> >>>                             "05"                            "31"
> >>>                          svn rev                        language
> >>>                          "52157"                             "R"
> >>>                   version.string
> >>> "R version 2.11.1 (2010-05-31)"
> >>>> # Thanks!
> >>>
> >>> Cheers!!
> >>> Albert-Jan
> >>>
> >>>
> >>> 
> >>>
> >>>       [[alternative HTML version deleted]]
> >>>
> >>> ______________________________________________
> >>> R-help at r-project.org mailing list
> >>> https://stat.ethz.ch/mailman/listinfo/r-help
> >>> PLEASE do read the posting guide
> >>> http://www.R-project.org/posting-guide.html
> >>> and provide commented, minimal, self-contained, reproducible code.
> >>>
> >>
> >
> > --------------------------------------------------------------------
> > E-Mail: (Ted Harding)<ted.harding at wlandres.net>
> > Fax-to-email: +44 (0)870 094 0861
> > Date: 17-Jun-11                                       Time: 22:24:41
> > ------------------------------ XFMail ------------------------------
> >
> 
> 
> -- 
> Brian S. Diggs, PhD
> Senior Research Associate, Department of Surgery
> Oregon Health & Science University
> 


