[R] "[.data.frame" and lapply

Romain Francois romain.francois at dbmail.com
Thu Mar 26 08:46:08 CET 2009


Hi,

I think this is a bug. `[.data.frame` treats its arguments differently 
depending on how many of them are supplied.

 > d <- data.frame(x = rnorm(5), y = rnorm(5), z = rnorm(5) )
 > d[, 1:2]
             x           y
1   0.45141341  0.03943654
2  -0.87954548  1.83690210
3  -0.91083710  0.22758584
4   0.06924279  1.26799176
5  -0.20477052 -0.25873225
 > base:::`[.data.frame`( d, j=1:2)
             x           y          z
1   0.45141341  0.03943654 -0.8971957
2  -0.87954548  1.83690210  0.9083281
3  -0.91083710  0.22758584 -0.3104906
4   0.06924279  1.26799176  1.2625699
5  -0.20477052 -0.25873225  0.5228342
and likewise:
 > d[ j=1:2]
            x           y          z
1  0.45141341  0.03943654 -0.8971957
2 -0.87954548  1.83690210  0.9083281
3 -0.91083710  0.22758584 -0.3104906
4  0.06924279  1.26799176  1.2625699
5 -0.20477052 -0.25873225  0.5228342

`[.data.frame` is called with only two arguments in the second case, so 
the following condition is true:

if(Narg < 3L) {  # list-like indexing or matrix indexing

The function then assumes that the single index it was given is i, and 
eventually calls NextMethod("["), which I think dispatches to 
`[.listof`(x,i,...). Since i is missing in `[.data.frame`, it is not 
passed on, so the result is equivalent to as.list(d)[].

I think we can replace the condition with this one:

if(Narg < 3L && !has.j) {  # list-like indexing or matrix indexing
 
or this:

if(Narg < 3L) {  # list-like indexing or matrix indexing
        if(has.j) i <- j

With this change:
  
 > `[.data.frame`(d, j=1:2)
            x           y
1  0.45141341  0.03943654
2 -0.87954548  1.83690210
3 -0.91083710  0.22758584
4  0.06924279  1.26799176
5 -0.20477052 -0.25873225
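
For anyone who wants to try the idea without patching base, the second 
variant can be sketched as a stand-alone function. pick is a hypothetical 
name used for illustration only; it mimics just the argument-count test 
and hands the actual subsetting to the existing `[` methods, so it is not 
the real `[.data.frame` code.

```r
# Hypothetical stand-in for the proposed dispatch logic (not base R code).
pick <- function(x, i, j) {
  Narg  <- nargs()
  has.j <- !missing(j)
  if (Narg < 3L) {              # list-like indexing
    if (has.j) i <- j           # proposed change: a lone j is used as i
    if (missing(i)) return(x)   # no index at all: return x unchanged
    return(x[i])
  }
  # three arguments: matrix-like indexing, delegated to the usual method
  if (missing(i)) x[, j, drop = FALSE] else x[i, j, drop = FALSE]
}

d <- data.frame(x = 1:5, y = 6:10, z = 11:15)
by_j <- pick(d, j = 1:2)   # columns x and y
by_i <- pick(d, i = 1:2)   # also columns x and y, same as d[1:2]
```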

We would still get the following, which is expected (same as d[1:2]):

 > `[.data.frame`(d, i=1:2)
            x           y
1  0.45141341  0.03943654
2 -0.87954548  1.83690210
3 -0.91083710  0.22758584
4  0.06924279  1.26799176
5 -0.20477052 -0.25873225
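
In the meantime, a possible workaround for the lapply() case in the 
original question is to avoid the named j and write the comma 
explicitly inside an anonymous function. This is only a sketch; dfs and 
first_cols are names made up for the example.

```r
set.seed(42)
dfs <- lapply(1:4, function(k) data.frame(x = rnorm(5), y = rnorm(5)))

# The comma makes this a matrix-like call, so the index is treated as j.
# drop = FALSE keeps each result a one-column data frame.
first_cols <- lapply(dfs, function(df) df[, 1, drop = FALSE])
```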

Romain

baptiste auguie wrote:
> Dear all,
>
>
> Trying to extract a few rows for each element of a list of 
> data.frames, I'm puzzled by the following behaviour,
>
>
>> d <- lapply(1:4,  function(i) data.frame(x=rnorm(5), y=rnorm(5)))
>> str(d)
>>
>> lapply(d, "[", i= c(1)) # fine, this extracts the first column
>> lapply(d, "[", j= c(1, 3)) # does nothing?!
>>
>> library(plyr)
>>
>> llply(d, "[", j= c(1, 3)) # same
>
>
> Am I misinterpreting the meaning of "j", which I thought was an 
> argument of the method "[.data.frame"?
>
>
>> args(`[.data.frame`)
>> function (x, i, j, drop = if (missing(i)) TRUE else length(cols) ==
>>    1)
>>
>
> Many thanks,
>
> baptiste
>
> _____________________________
>
> Baptiste Auguié
>
> School of Physics
> University of Exeter
> Stocker Road,
> Exeter, Devon,
> EX4 4QL, UK
>
> Phone: +44 1392 264187
>
> http://newton.ex.ac.uk/research/emag
>


-- 
Romain Francois
Independent R Consultant
+33(0) 6 28 91 30 30
http://romainfrancois.blog.free.fr



