[R] question re: "summarry.lm" and NA values

Tue Aug 15 17:49:40 CEST 2006

"Is there a way to..." always has the answer "yes" in R (or C or any
language for that matter). The question is: "Is there a GOOD way...?" where
"good" depends on the specifics of the situation. So, after that polemic,
below is an effort to answer (adding to what Petr Pikal has already said):

-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA

"The business of the statistician is to catalyze the scientific learning
process."  - George E. P. Box

> -----Original Message-----
> From: r-help-bounces at stat.math.ethz.ch 
> [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of r user
> Sent: Tuesday, August 15, 2006 7:01 AM
> To: rhelp
> Subject: [R] question re: "summarry.lm" and NA values
> 
> Is there a way to get the following code to include
> NA values where the coefficients are "NA"?
> 
> ((summary(reg))$coefficients)
BAAAD! Don't do this. Use the extractor on the object: coef(reg) 
This suggests that you haven't read the documentation carefully, which tends
to arouse the ire of would-be helpers.
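
To illustrate why the extractor matters here, a minimal sketch with made-up data (the names d, x1, x2, and fit are mine, not the poster's): when a term cannot be estimated, coef() on the fitted object keeps it as NA, while the coefficient table from summary() omits that row.

```r
# Made-up data: x2 is an exact multiple of x1, so its coefficient is aliased.
set.seed(42)
d <- data.frame(x1 = rnorm(10))
d$x2 <- 2 * d$x1
d$y <- 1 + d$x1 + rnorm(10)

fit <- lm(y ~ x1 + x2, data = d)

b <- coef(fit)              # named vector of length 3; the x2 entry is NA
tab <- coef(summary(fit))   # coefficient table; the aliased x2 row is omitted

names(b)
rownames(tab)
```

So the vector from coef(fit) always has one entry per model term, which is what a fixed-width table needs.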

> 
> explanation:
> 
> Using a loop, I am running regressions on several
> "subsets" of "data1".
> 
> "reg <- ( lm(lm(data1[,1] ~., data1[,2:l])) )"
??? There's an error here I think. Do you mean update()? Do you have your
subscripting correct?
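
For what it's worth, here is one guess at what was intended, assuming the response is the first column of data1 and columns 2 to l are the predictors. The data and column names below are invented for illustration only.

```r
# Made-up stand-in for the poster's data1: response first, predictors after it.
set.seed(1)
data1 <- data.frame(y = rnorm(30), x1 = rnorm(30), x2 = rnorm(30), x3 = rnorm(30))
l <- 3   # use columns 2..l as predictors

resp <- names(data1)[1]
# "y ~ ." regresses y on every other column of the data frame passed as data
reg <- lm(as.formula(paste(resp, "~ .")), data = data1[, 1:l])

coef(reg)   # intercept plus l - 1 slopes
```

A single lm() call with a named response keeps the coefficient names readable, which helps later when matching them up across subsets.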

> 
> My regression has 10 independent variables, and I
> therefore expect 11 coefficients.
> After each regression, I wish to save the coefficients
> and standard errors of the coefficients in a table
> with 22 columns.
> 
> I successfully extract the coefficients using the
> following code:
> "reg$coefficients"
Use the extractor, coef()

> 
> I attempt to extract the standard errors using :
> 
> aperm((summary(reg))$coefficients)[2,]

BAAAD! Use the extractor vcov(): sqrt(diag(vcov(reg)))
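
A quick check on made-up data (d and fit are illustrative names) that the vcov() route reproduces the standard errors shown in the summary table for a full-rank fit:

```r
set.seed(2)
d <- data.frame(x = rnorm(25))
d$y <- 3 - 2 * d$x + rnorm(25)
fit <- lm(y ~ x, data = d)

se <- sqrt(diag(vcov(fit)))                    # named vector of standard errors
se_tab <- coef(summary(fit))[, "Std. Error"]   # same quantities from the table

all.equal(se, se_tab)
```

The result is a named vector, so it can be matched to the coefficients by name rather than by position.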
> 
> ((summary(reg))$coefficients)
> 
> My problem:
> For some of my subsets, I am missing data for one or
> more of the independent variables.  This of course
> causes the coefficients and standard errors for this
> variable to be "NA".
No, it doesn't, as Petr said.

One possible approach: Assuming that a variable is actually missing (all
NA's), note that coef(reg) is a named vector, so that the character string
names of the regressors actually used are available. You can thus check for
what's missing and add them as NA's at each return. Though I confess that I
see no reason to put things in a matrix rather than just using a list. But
that's a matter of personal taste I suppose.
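
A rough sketch of that approach, under several assumptions of mine: the helper fit_padded() is hypothetical, the predictors are numeric with syntactic names, and at least one predictor has data in every subset. It fits on whatever predictors are usable, then fills a fixed-length named vector by name, leaving NA where a term was not estimated.

```r
# Hypothetical helper (not from the original thread).
fit_padded <- function(dat, response, predictors) {
  # Drop predictors that are entirely NA in this subset; lm() cannot use them.
  has_data <- vapply(dat[predictors], function(x) !all(is.na(x)), logical(1))
  usable <- predictors[has_data]
  fit <- lm(reformulate(usable, response = response), data = dat)

  # Fixed layout: intercept plus every predictor, initially all NA.
  all_terms <- c("(Intercept)", predictors)
  est <- setNames(rep(NA_real_, length(all_terms)), all_terms)
  se <- est

  # Fill in, by name, whatever this fit actually estimated.
  b <- coef(fit)
  est[names(b)] <- b
  s <- sqrt(diag(vcov(fit)))
  se[names(s)] <- s

  c(est, setNames(se, paste0("se.", all_terms)))
}

# Made-up data: x3 is entirely missing in the second subset.
set.seed(3)
d1 <- data.frame(y = rnorm(20), x1 = rnorm(20), x2 = rnorm(20), x3 = rnorm(20))
d2 <- d1
d2$x3 <- NA_real_

tab <- t(sapply(list(d1, d2), fit_padded,
                response = "y", predictors = c("x1", "x2", "x3")))
dim(tab)   # 2 rows, 8 columns: 4 coefficients and 4 standard errors
```

With 10 predictors the same helper would return 22 values per regression, one row of the table described in the question. Returning a list of fits instead and extracting later would work just as well.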

> 
> Is there a way to include the NA standard errors, so
> that I have the same number of standard errors and
> coefficients for each regression, and can then store
> the coefficients and standard errors in my table of 22
> columns?
> 