[R] Re: glm and data.frames (PR#108)

Peter Dalgaard BSA p.dalgaard at biostat.ku.dk
Tue Feb 2 15:50:45 CET 1999


jlindsey at alpha.luc.ac.be writes:

> The following is a bug or feature in recent versions. Try the
> following with a clean workspace:
> 
> y <- cbind(rpois(20,4),rpois(20,4))
> df <- data.frame(y=y,x=rnorm(20))
> colnames(df)
> colnames(model.frame(y~x,data=df))
> 
> Now start over with a clean workspace and try
> 
> df <- data.frame(y=cbind(rpois(20,4),rpois(20,4)),x=rnorm(20))
> model.frame(y~x,data=df)
> glm(y~x,data=df,family=binomial)
> 
> I have hacked a fix, given in the patch below. It works for me but may
> break other code. I avoid data.frames like the plague so have little
> to test it on. I had to use them in glmm to communicate with glm in
> certain contexts but I now think I would have been much better off
> rewriting IWLS from scratch.

No patch is needed; data.frame just doesn't work like that:

> df <- data.frame(y=cbind(rpois(20,4),rpois(20,4)),x=rnorm(20))
> names(df)
[1] "y.1" "y.2" "x"  

> df <- data.frame(y=I(cbind(rpois(20,4),rpois(20,4))),x=rnorm(20))
> names(df)
[1] "y" "x"

I.e. data.frame normally expands matrix arguments into their constituent
columns. This is so that data.frame(m) works "as expected" when m is a
matrix. (Of course you're free to expect otherwise...) If df has no
component named "y", glm() and friends won't find it.
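
To make the difference concrete, here is a minimal sketch of the two
constructions side by side. The names m, df_split, df_whole and mf are only
illustrative, and set.seed() is there just to make the run reproducible. It
assumes model.frame() accepts a matrix column protected with I(), which is
what the names() output above suggests.

```r
set.seed(1)
m <- cbind(rpois(20, 4), rpois(20, 4))

# Without I(): the matrix is expanded into the columns y.1 and y.2,
# so the data frame has no component called "y".
df_split <- data.frame(y = m, x = rnorm(20))

# With I(): the matrix is kept whole as a single component named "y".
df_whole <- data.frame(y = I(m), x = rnorm(20))

names(df_split)
names(df_whole)
dim(df_whole$y)

# The formula's response is now found inside the data frame itself.
mf <- model.frame(y ~ x, data = df_whole)
names(mf)
```

With df_split, a formula such as y ~ x can only succeed if an object called
y happens to exist outside the data frame, which would explain why the first
example in the report behaves differently in a clean workspace.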

> I think it is extremely dangerous that glm with the data option looks
> for variables in the environment if it does not find them in the
> dataframe supplied instead of giving an error (although not really my
> problem because I never use that option except in glmm).

Yes, but... How else would you do things like the following?

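dfr<-data.frame(y=exp(rnorm(100)))
# grid of lambda values, offset by 0.01 so that lambda is never exactly 0
try<-seq(-2,2,.1)+0.01
# residual standard error of the Box-Cox transformed response for each lambda;
# lambda is not a column of dfr, so lm() has to find it outside the data frame
rss<-sapply(try,
    function(lambda)
        summary(lm((y^lambda - 1)/lambda ~ 1,dfr))$sigma
)
plot(try,rss)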
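
As a rough check of what the lookup buys you, the sketch below fits one value
of lambda and compares the result with the same quantity computed by hand.
The names box_cox_sigma, s_half, z and missing_var are hypothetical, chosen
only for this illustration. It also shows that a variable found neither in
the data frame nor in the environment of the formula still gives an error,
assuming no object called no_such_variable exists in your workspace.

```r
set.seed(42)
dfr <- data.frame(y = exp(rnorm(100)))

# y comes from dfr; lambda is not a column of dfr, so it is looked up
# in the environment of the formula, i.e. inside this function.
box_cox_sigma <- function(lambda) {
  fit <- lm((y^lambda - 1) / lambda ~ 1, data = dfr)
  summary(fit)$sigma
}

s_half <- box_cox_sigma(0.5)

# For an intercept-only model the residual standard error is just the
# sample standard deviation of the transformed response.
z <- (dfr$y^0.5 - 1) / 0.5
all.equal(s_half, sd(z))

# A variable that exists nowhere is still reported as an error.
missing_var <- tryCatch(
  lm(y ~ no_such_variable, data = dfr),
  error = function(e) "error"
)
missing_var
```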
-- 
   O__  ---- Peter Dalgaard             Blegdamsvej 3  
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N   
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)             FAX: (+45) 35327907
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._


