[Rd] New vcov(*, complete=TRUE) etc -- coef(<lm>) vs coef(<aov>)
maechler at stat.math.ethz.ch
Tue Nov 7 22:48:07 CET 2017
>>>>> Martin Maechler <maechler at stat.math.ethz.ch>
>>>>> on Thu, 2 Nov 2017 21:59:00 +0100 writes:
>>>>> Fox, John <jfox at mcmaster.ca>
>>>>> on Thu, 14 Sep 2017 13:46:44 +0000 writes:
>> Dear Martin, I made three points which likely got lost
>> because of the way I presented them:
>> (1) Singularity is an unusual situation and should be
>> made more prominent. It typically reflects a problem with
>> the data or the specification of the model. That's not to
>> say that it *never* makes sense to allow singular fits
>> (as in the situations you mentions).
>> I'd favour setting singular.ok=FALSE as the default, but
>> in the absence of that a warning or at least a note. A
>> compromise would be to have a singular.ok option() that
>> would be FALSE out of the box.
>> Any changes would have to be made very carefully so as
>> not to create chaos.
> I for one, am too reluctant to want to change the default
>> That goes for the points below as well.
>> (2) coef() and vcov() behave inconsistently, which can be
>> problematic because one often uses them together in code.
> indeed; and I had agreed on that. As of today, in R-devel
> only they now behave compatibly. NEWS entry
> • The “default” ("lm" etc) methods of vcov() have
> gained new optional argument complete = TRUE which makes
> the vcov() methods more consistent with the coef() methods
> in the case of singular designs. The former behavior is
> now achieved by vcov(*, complete=FALSE).
>> (3) As you noticed in your second message, lm() has a
>> singular.ok argument and glm() doesn't.
> and that has been amended even earlier (a bit more than a
> month ago) in R-devel svn rev 73380 with NEWS entry
> • glm() and glm.fit get the same singular.ok=TRUE
> argument that lm() has had forever. As a consequence, in
> glm(*, method = <your_own>), user specified methods need
> to accept a singular.ok argument as well.
>> I'll take a look at the code for glm() with an eye
>> towards creating a patch, but I'm a bit reluctant to mess
>> with the code for something as important as glm().
> and as a matter of fact you did send me +- the R code part
> of that change.
> My current plan is to also add the 'complete = TRUE'
> option to the "basic" coef() methods, such that you also
> have consistent coef(*, complete=FALSE) and vcov(*,
> complete=FALSE) behaviors.
and indeed I had added the above a bit later.
However, to my surprise, I have now found that we have a
coef.aov() method -- completely undocumented which behaves *differently*:
where as the default coef() method which is called for lm(..)
results gives *all* coefficients, and gives NA for "aliased" ones,
the aov method *drops* the NA coefficients and has done so
"forever" (I've checked R version 1.1.1 of April 14, 2000).
vcov() on the other hand has not had a special "aov" method, but
treats aov() and lm() results the same... which means that in
R-devel the vcov() method for an aov() object uses
'complete=TRUE' and gives NA rows and columns for the aliased coefficients,
whereas coef.aov() removes all the NAs and gives only the
"non-aliased" coefficients. Consequently, in R-devel,
vcov(<aov>) and coef(<aov>) are *now* incoherent, whereas these
two *were* coherent before the change.
I propose to
1. continue the strategy to keep coef() back-compatible and
2. to *document* the "surprising" behavior of coef.aov()
3. introduce a vcov.aov() with complete=FALSE default
behavior which is compatile to the coef.aov() one [where I'd
also introduce the no-change 'complete=FALSE' argument].
Hmm... again, this has been more work and more implications than
originally optimistically assumed..
Opinions, caveats, other feedback -- are very welcome!
More information about the R-devel