[R] meaning of glm(value ~ .,

David Winsemius dwinsemius at comcast.net
Fri Jun 19 16:20:50 CEST 2009


All of your points are accepted, and I also give you credit for  
reading the "formula" page better than I.


On Jun 19, 2009, at 10:08 AM, Gavin Simpson wrote:

> On Fri, 2009-06-19 at 09:24 -0400, David Winsemius wrote:
>> On Jun 19, 2009, at 9:00 AM, onyourmark wrote:
> <snip />
>>> means and also, I see
>>>
>>> data=crs$dataset[,c(1:59,922)]
>>>
>>> I have read that the data argument is optional here
>>> "an optional data frame, list or environment (or object coercible by
>>> as.data.frame to a data frame) containing the variables in the model.
>>> If not found in data, the variables are taken from
>>> environment(formula), typically the environment from which glm is
>>> called"
>>>
>>> When they say "data", is that meant to include the dependent
>>> variable as well?
>>
>> Yes.
>
> It has to be defined in 'data' or the environment of 'formula', so it
> depends on what the OP meant by "meant to include". You can include it
> in 'data' but don't have to.
>
>>
>>> In other words, in the above statement 'value' is the dependent
>>> variable and it is also column 922 in the data set.
>>> Is this correct?
>>
>> Yes.
>
> No - you can't say that it is variable 922, or even any of 1:59 or 922
> for the reasons mentioned above.
>
> set.seed(123)
> dat <- data.frame(A = rnorm(100), B = rnorm(100), C = rnorm(100))
> Y <- rpois(100, 2)
> mod <- glm(Y ~ ., data = dat[,c(1,3)], family = poisson)
> mod
>
> If all you have is this:
>
> mod <- glm(Y ~ ., data = dat[,c(1,3)], family = poisson)
>
> You can't say anything more about Y than that it is either in 'dat' or
> in the environment of 'formula', which in this case is the global
> workspace.
> G
>
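
For what it's worth, here is a minimal sketch of the two cases Gavin
describes, reusing his simulated data. The names dat2, mod_env and
mod_dat are made up for illustration and appear nowhere in the original
call. It only assumes base R's glm(); nothing here inspects the
poster's crs$dataset, so it cannot say which case applies there.

```r
set.seed(123)
dat <- data.frame(A = rnorm(100), B = rnorm(100), C = rnorm(100))
Y <- rpois(100, 2)

# Case 1: the response is NOT a column of 'data'; glm() falls back to
# the environment of the formula to find Y.
mod_env <- glm(Y ~ ., data = dat[, c(1, 3)], family = poisson)

# Case 2: the response IS a column of 'data' (hypothetical dat2);
# '.' then stands for every column other than the response.
dat2 <- cbind(dat[, c(1, 3)], Y = Y)
mod_dat <- glm(Y ~ ., data = dat2, family = poisson)

# In both cases '.' expands to the same predictors ...
attr(terms(mod_env), "term.labels")
attr(terms(mod_dat), "term.labels")

# ... and the fitted coefficients agree.
all.equal(coef(mod_env), coef(mod_dat))

# The call alone does not reveal where Y came from; checking the column
# names of whatever was passed as 'data' does.
"Y" %in% names(dat[, c(1, 3)])
"Y" %in% names(dat2)
```

So to settle whether 'value' really is column 922, one would have to
look at the column names of the data frame itself rather than at the
glm() call.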

David Winsemius, MD
Heritage Laboratories
West Hartford, CT



