model.frame mangles time series (PR#121)

Thomas Lumley thomas@biostat.washington.edu
Thu, 18 Feb 1999 08:24:37 -0800 (PST)


On Thu, 18 Feb 1999 pd@biostat.ku.dk wrote:

> This one showed up while looking at one of Ripley's other reports:

> The upshot of this is that glm(...,subset=...) fails on the freeny data.
> 
> The cause is seen by 
> 
> > model.frame(y~1,data=freeny[1:10,])$y
>          Qtr1    Qtr2    Qtr3    Qtr4
> 1962:      NA 8.79236 8.79137 8.81486
> 1963: 8.81301 8.90751 8.93673 8.96161
> 1964: 8.96044 9.00868 9.03049      NA
> > dput(model.frame(y~1,data=freeny,subset=1:10)$y)
> structure(c(8.79236, 8.79137, 8.81486, 8.81301, 8.90751, 8.93673, 
> 8.96161, 8.96044, 9.00868, 9.03049), .Tsp = c(1962.25, 1971.75, 
> 4), class = "ts")
> 
> Notice that the .Tsp attribute doesn't reflect the shorter time series
> (1971.75 should be 1964.5).


This is the attributes/subsetting problem rearing its ugly head again.
model.frame(,subset) has to copy *some* attributes over (e.g. contrasts)
and can't copy *all* of them (e.g. dim, which is no longer valid once rows
have been dropped).

Currently we use copyMostAttributes to drop the dangerous ones. It
clearly doesn't know about tsp, which describes the full-length series and
is wrong for any subset.
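
To make the idea concrete, here is a rough R-level sketch of "copy most
attributes, but not the length-dependent ones". The helper name
copy_safe_attributes, its `unsafe` argument, and the "label" attribute are
made up for illustration; this is not what copyMostAttributes actually
does internally, only what one might want it to do for a ts.

```r
# Hypothetical helper: copy attributes from `from` onto `to`, skipping
# those whose validity depends on the length or shape of the object.
copy_safe_attributes <- function(from, to,
                                 unsafe = c("names", "dim", "dimnames", "tsp")) {
  src <- attributes(from)
  for (nm in setdiff(names(src), unsafe)) {
    if (nm == "class") {
      # A "ts" class without a tsp attribute is meaningless, so drop it too.
      cls <- setdiff(src[["class"]], "ts")
      if (length(cls) > 0) class(to) <- cls
    } else {
      attr(to, nm) <- src[[nm]]
    }
  }
  to
}

# The ten values from the report above, as a quarterly series from 1962 Q2.
y <- ts(c(8.79236, 8.79137, 8.81486, 8.81301, 8.90751,
          8.93673, 8.96161, 8.96044, 9.00868, 9.03049),
        start = c(1962, 2), frequency = 4)
attr(y, "label") <- "example label"   # made-up attribute, for illustration

# as.numeric() strips all attributes, so the subset starts out bare.
out <- copy_safe_attributes(y, as.numeric(y)[1:5])
attributes(out)   # "label" survives; tsp and the "ts" class do not
```

The price of this approach is that the result is no longer a time series
at all, which is safe but loses information.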

Questions:
(1) How do we know which attributes to copy?
(2) For an unknown attribute what should the default be?
(3) Is there just one set of attributes that needs special treatment (in
which case copyMostAttributes is broken) or are there different sets in
different circumstances (in which case we need a new function)?
(4) Could we just handle this by having the subset operator retain
attributes?
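
For (4), restricted to time series, one possibility is a subset that
recomputes tsp instead of copying it. The sketch below is only an
illustration under assumptions: subset_ts is a made-up name, the data
values are stand-ins (only the time parameters match freeny), and it
keeps the ts structure only when the selected indices are contiguous.

```r
# Hypothetical helper: subset a univariate ts by integer index, rebuilding
# the time attributes when the selection is a contiguous run of
# observations and returning a plain vector otherwise.
subset_ts <- function(x, idx) {
  stopifnot(is.ts(x), is.numeric(idx), length(idx) > 0)
  p <- tsp(x)                       # c(start, end, frequency)
  vals <- as.numeric(x)[idx]
  if (!all(diff(idx) == 1)) {
    return(vals)                    # gaps: no valid tsp exists
  }
  # The new start is the time of the first selected observation.
  ts(vals, start = p[1] + (idx[1] - 1) / p[3], frequency = p[3])
}

# Stand-in values with the same time parameters as the series above:
# 39 quarterly observations, 1962 Q2 to 1971 Q4.
y <- ts(seq_len(39), start = c(1962, 2), frequency = 4)

first10 <- subset_ts(y, 1:10)
tsp(first10)                        # end is recomputed from the new length
odd <- subset_ts(y, c(1, 3, 5))     # non-contiguous: plain numeric vector
```

This leaves open what to do for non-contiguous subsets and for attributes
other than tsp, so it answers only a small part of the question.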

	-thomas
