[R] data frame with nested data frame

David Winsemius dwinsemius at comcast.net
Fri Dec 24 00:11:56 CET 2010


On Dec 23, 2010, at 5:06 PM, Vadim Ogranovich wrote:

> Dear R-users,
>
> I am somewhat puzzled by how R treats data frames with nested data  
> frames.

Speaking as a fellow user: why? Why would we want data frames inside  
data frames? Wouldn't a list of data frames be more appropriate if you  
were hoping to use apply or <some other function>?
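
For illustration only, here is a minimal sketch of the list-of-data-frames approach. The data and the names (dfs, row_counts, col_means, stacked) are made up for this example and are not from the original question.

```r
# Hypothetical example data: two small data frames kept in a named list
dfs <- list(
  first  = data.frame(a = 1:3, b = c(2, 4, 6)),
  second = data.frame(a = 4:5, b = c(8, 10))
)

# Apply a function to each data frame in turn
row_counts <- sapply(dfs, nrow)      # named integer vector, one entry per data frame
col_means  <- lapply(dfs, colMeans)  # list of named numeric vectors

# Stack them into a single data frame when they share the same columns
stacked <- do.call(rbind, dfs)
```

Keeping the pieces in a list means each one stays an ordinary data frame, so the usual lapply/sapply/do.call idioms apply without any nesting.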


> Below are a couple of examples, maybe someone could help explain  
> what the guiding logic here is.
>
> ## construct plain data frame
>> z <- data.frame(x=1)
>
> ## add a data frame member
>> z$y <- data.frame(a=1,b=2)

cbind.data.frame (dispatched when at least one argument to cbind is a  
data frame) would give you another data frame without the mess of  
nesting.
 > cbind(z, b=2)
   x b
1 1 2

This is also the time to ask .... what is it that you are _really_  
trying to accomplish?

>
> ## puzzle 1: z is apparently different from a straightforward  
> construction of the 'same' object
>> all.equal(z, data.frame(x=1,y=data.frame(a=1,b=2)))
> [1] "Names: 1 string mismatch"
> [2] "Length mismatch: comparison on first 2 components"
> [3] "Component 2: Modes: list, numeric"
> [4] "Component 2: names for target but not for current"
> [5] "Component 2: Attributes: < Modes: list, NULL >"
> [6] "Component 2: Attributes: < names for target but not for current >"
> [7] "Component 2: Attributes: < Length mismatch: comparison on first 0 components >"
> [8] "Component 2: Length mismatch: comparison on first 1 components"

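Yes. In the second construction, data.frame() takes just the list  
portion of the inner data frame and splices its columns in as ordinary  
columns (named y.a and y.b), so the result has three columns rather  
than a single nested component y.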
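
A small sketch of the difference, using the same objects as in the question (the name w is introduced here only for the comparison):

```r
# Assigning with $<- keeps the inner data frame as ONE component named y
z <- data.frame(x = 1)
z$y <- data.frame(a = 1, b = 2)
names(z)      # "x" "y"
class(z$y)    # "data.frame"

# data.frame() instead splices the inner columns in, prefixing their names
w <- data.frame(x = 1, y = data.frame(a = 1, b = 2))
names(w)      # "x" "y.a" "y.b"
```

So z has two components and w has three, which is what the "Length mismatch" and "Modes: list, numeric" messages from all.equal are reporting.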

>
> ## puzzle 2: could not rbind z
>> rbind.data.frame(z, z)
> Error in `row.names<-.data.frame`(`*tmp*`, value = c("1", "1")) :
>  duplicate 'row.names' are not allowed
> In addition: Warning message:
> non-unique value when setting 'row.names': '1'

That is a puzzle, I agree.
This succeeds:
 > z <- data.frame(x=1, y=2)
 > rbind(z, z)
   x y
1 1 2
2 1 2

Perhaps a bug. Trying to add drop=FALSE had an amusing result (rbind  
has no 'drop' argument, so it is bound as one more row named "drop",  
with FALSE coerced to 0):
 > rbind(z,z, drop=FALSE)
      x
1    1
2    1
drop 0
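
One possible workaround, offered as a sketch and not as a statement about what rbind ought to do: flatten the nested data frame into ordinary columns first, then rbind. The names flat and stacked are made up for this example.

```r
# Rebuild the nested object from the question
z <- data.frame(x = 1)
z$y <- data.frame(a = 1, b = 2)

# Flatten: passing the components to data.frame() yields columns x, y.a, y.b
flat <- do.call(data.frame, z)

# rbind now only sees ordinary atomic columns
stacked <- rbind(flat, flat)
```

This gives up the nesting, which may or may not matter depending on what the nested structure was meant to achieve.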

-- 
David
>
>> version
>               _
> platform       i386-pc-mingw32
> arch           i386
> os             mingw32
> system         i386, mingw32
> status
> major          2
> minor          9.1
> year           2009
> month          06
> day            26
> svn rev        48839
> language       R
> version.string R version 2.9.1 (2009-06-26)
>
>
> Thanks,
> Vadim
-- 

David Winsemius, MD
West Hartford, CT

 > sessionInfo()
R version 2.12.1 (2010-12-16)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)

locale:
[1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] splines   stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] sos_1.3-0       brew_1.0-4      lattice_0.19-13

loaded via a namespace (and not attached):
[1] grid_2.12.1  tools_2.12.1


