[Rd] Help with "row.names = as.integer(c(NA, 5))" in file from dput

Peter Dalgaard p.dalgaard at biostat.ku.dk
Wed Feb 28 21:52:02 CET 2007


Mike Prager wrote:
> I am trying to understand why syntax used by dput() to write
> rownames is valid (say, when read by dget()).  I ask this
> because I desire to emulate its actions *reliably* in my For2R
> routines, and I won't be comfortable until I understand what R
> is doing.
>
> Given data set "fred":
>
>   
>> fred
>>     
>     id      var1
> 1 1991 0.4388587
> 2 1992 0.8772471
> 3 1993 0.6230486
> 4 1994 0.2340929
> 5 1995 0.5005605
>
> we can try this--
>
>   
>> dput(fred, control = "all")
>>     
> structure(list(id = c(1991, 1992, 1993, 1994, 1995), var1 =
> c(0.4388587, 0.8772471, 0.6230486, 0.2340929, 0.5005605)),
> .Names = c("id", "var1"), row.names = as.integer(c(NA, 5)),
> class = "data.frame")
>
> In the above result, why is the following part valid?
>
> row.names = as.integer(c(NA, 5))
>
> given that the length of the RHS expression is 2, while the
> needed length is 5.
>
> Moreover, the following doesn't work:
>
>   
>> row.names(fred) <- as.integer(c(NA,5))
>>     
> Error in `row.names<-.data.frame`(`*tmp*`, value = c(NA, 5)) : 
>         invalid 'row.names' length
>
> Is there any reason why the expression
>
> c(NA,5) 
>
> is better than the more natural
>
> 1:5 
>
> here?
>
>   
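It's mainly a space-saving device. Originally, row.names was always a
character vector, but storage of character vectors is quite inefficient,
so we now also allow integer row names, plus a very short form in which
the row names 1:n are stored using just the single value n. To
distinguish the short form from ordinary integer row names, it is
written as c(NA, n): real row names are not allowed to be missing, so a
leading NA cannot be mistaken for one. This is an internal form of the
attribute, which structure() sets directly; row.names<- is the
user-level interface and expects one name per row, hence the length
error you saw.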
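
As a minimal sketch of the difference (assuming a reasonably recent R;
.row_names_info() is an internal helper in base R, and the variable names
below are only for illustration), structure() installs the attribute as
given, whereas the replacement function checks its argument against the
number of rows:

```r
# Rebuild the data frame exactly as dput() wrote it. structure() sets the
# "row.names" attribute directly, so the short c(NA, n) form is accepted.
fred <- structure(
  list(id   = c(1991, 1992, 1993, 1994, 1995),
       var1 = c(0.4388587, 0.8772471, 0.6230486, 0.2340929, 0.5005605)),
  .Names    = c("id", "var1"),
  row.names = as.integer(c(NA, 5)),
  class     = "data.frame")

n_rows    <- nrow(fred)        # row count implied by the short form
row_names <- row.names(fred)   # the short form is expanded on access

# Stored (unexpanded) representation, for inspection only.
internal <- .row_names_info(fred, 0L)

# The user-level replacement function validates its argument instead, so
# the same value is rejected; capture the error rather than stopping.
res <- tryCatch({
  row.names(fred) <- as.integer(c(NA, 5))
  "accepted"
}, error = function(e) e)
```

For a generator that writes such files, putting row.names = 1:5 in the
structure() call should be read back equally well; it is simply longer
for large n.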

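Consider the following and notice how the string row names take up
roughly 36 bytes per record, where the actual data are only 8 bytes per
record. Integer row names cost about 4 bytes per record, and the short
form (the default, restored by row.names(d) <- NULL) adds essentially
nothing.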

 > d<-data.frame(x=rnorm(1000))
 > object.size(d)
[1] 8392
 > row.names(d)<-as.character(1:1000)
 > object.size(d)
[1] 44384
 > row.names(d)<-1000:1
 > object.size(d)
[1] 12384
 > row.names(d)<-NULL
 > object.size(d)
[1] 8392
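
To check which of the three representations a given data frame is using,
one can look at the attribute as stored rather than as expanded. This is
only a sketch: it relies on the internal base function .row_names_info()
(type 0L returns the stored attribute), and the names d_auto, d_chr,
d_int and stored() are made up for illustration.

```r
d_auto <- data.frame(x = rnorm(1000))      # default: short form
d_chr  <- d_auto
row.names(d_chr) <- as.character(1:1000)   # one string per row
d_int  <- d_auto
row.names(d_int) <- 1000:1                 # one integer per row

# Report the type and length of the stored row.names attribute.
stored <- function(df) {
  rn <- .row_names_info(df, 0L)
  c(type = typeof(rn), length = length(rn))
}

forms <- rbind(auto      = stored(d_auto),
               character = stored(d_chr),
               integer   = stored(d_int))
print(forms)
```

The short form shows up as an integer vector of length 2, while the other
two store one element per row.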




> I will appreciate help from anyone with time to reply.
>
> MHP
>
>


