[R] str(data.frame) after subsetting reflects original structure, not subsetted structure?

Ben Bolker bolker at ufl.edu
Fri Jul 24 15:40:15 CEST 2009




Bryan Hanson wrote:
> 
> I find that after subsetting (you may prefer "conditional selection") a
> data frame and assigning it to a new object, the str(new object) reflects
> the original data frame, not the new one:
> 
> A <- rnorm(20)
> B <- factor(rep(c("t", "g"), 10))
> C <- factor(rep(c("h", "l"), 10))
> D <- data.frame(A, B, C)
> 
> str(D) # reports correctly
> 
> E <- D[D$C == "h",]
> 
> str(E) # reports that D$C still has 2 levels, but
> E # or E$C shows that subsetting worked properly
> summary(E) # shows the original structure and that subsetting worked
> 
> Is this the expected behavior, and if so, is there a particular rationale?
> I would be pretty certain that the information about E was inherited from
> D, but why wasn't it updated to reflect the revised object?  Is there an
> argument that I can use to force the updating?
> 
> For better or worse, I use str() a lot to check my work, and in this case,
> it seems to have misled me.
> 
> 

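This is a FAQ, but not one that's documented (I think).

Subsetting a data frame, whether with [ or with subset(), does not drop
unused factor levels. E$C therefore still carries both levels of D$C,
and that is what str(E) reports.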

If you try table(E$C) you will see that there are no "l"
values left:

 h  l 
10  0 
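To see the distinction between the levels a factor carries and the values it actually contains, here is a minimal sketch that rebuilds the objects from the question. The set.seed() call is an addition for reproducibility only and is not part of the original example.

```r
# Rebuild the data frame from the question.
set.seed(1)  # assumption: added only so the random column is reproducible
A <- rnorm(20)
B <- factor(rep(c("t", "g"), 10))
C <- factor(rep(c("h", "l"), 10))
D <- data.frame(A, B, C)

# Keep only the rows where C is "h".
E <- D[D$C == "h", ]

levels(E$C)                # the level set is inherited from D: "h" "l"
nlevels(E$C)               # 2, which is what str(E) reports
unique(as.character(E$C))  # the values actually present: "h"
table(E$C)                 # counts per level; "l" has a count of 0
```

The rows were selected correctly; only the factor's level set still records "l".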

Any of the following will drop the unused level:

  E$C <- factor(E$C)

or

  E$C <- E$C[drop=TRUE]

or

  library(gdata)
  E <- drop.levels(E)
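As a rough check that the base-R approaches above behave as described, the sketch below applies them to copies of E and compares the level counts. The helper drop_unused_levels() is a hypothetical name introduced here for illustration; it is not part of base R or gdata, and simply re-creates every factor column with factor().

```r
# Rebuild a small version of the example (only the columns needed here).
D <- data.frame(A = rnorm(20), C = factor(rep(c("h", "l"), 10)))
E <- D[D$C == "h", ]

# Option 1: re-create the factor from its current values.
E1 <- E
E1$C <- factor(E1$C)

# Option 2: subset the factor with drop = TRUE.
E2 <- E
E2$C <- E2$C[drop = TRUE]

# Hypothetical helper: drop unused levels in every factor column
# of a data frame, leaving non-factor columns untouched.
drop_unused_levels <- function(df) {
  df[] <- lapply(df, function(x) if (is.factor(x)) factor(x) else x)
  df
}
E3 <- drop_unused_levels(E)

nlevels(E$C)   # still 2 before any fix
nlevels(E1$C)  # 1
nlevels(E2$C)  # 1
nlevels(E3$C)  # 1
str(E1)        # C is now reported as a factor with a single level
```

The data values are the same in all cases; only the stored level set changes.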

RSiteSearch("subset drop",restrict=c("Rhelp02","Rhelp08"))

will get you lots of information (perhaps more than you want) on
the pros and cons of this design decision ...

  Ben Bolker

