[R] Regression with factor having 1 level
peter dalgaard
pdalgd at gmail.com
Sat Mar 12 00:57:23 CET 2016
> On 11 Mar 2016, at 23:48 , David Winsemius <dwinsemius at comcast.net> wrote:
>
>>
>> On Mar 11, 2016, at 2:07 PM, peter dalgaard <pdalgd at gmail.com> wrote:
>>
>>
>>> On 11 Mar 2016, at 17:56 , David Winsemius <dwinsemius at comcast.net> wrote:
>>>
>>>>
>>>> On Mar 11, 2016, at 12:48 AM, peter dalgaard <pdalgd at gmail.com> wrote:
>>>>
>>>>
>>>>> On 11 Mar 2016, at 08:25 , David Winsemius <dwinsemius at comcast.net> wrote:
>>>>>>
>>>> ...
>>>>>>> dfrm <- data.frame(y=rnorm(10), x1=rnorm(10) ,x2=as.factor(TRUE), x3=rnorm(10))
>>>>>>> lm(y~x1+x2+x3, dfrm, na.action=na.exclude)
>>>>>> Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) :
>>>>>> contrasts can be applied only to factors with 2 or more levels
>>>>>
>>>>> Yes, and the error appears to come from `model.matrix`:
>>>>>
>>>>>> model.matrix(y~x1+factor(x2)+x3, dfrm)
>>>>> Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) :
>>>>> contrasts can be applied only to factors with 2 or more levels
>>>>>
>>>>
>>>> Actually not. The above is because you use an explicit factor(x2). The actual smoking gun is this line in lm()
>>>>
>>>> mf$drop.unused.levels <- TRUE
>>>
>>> It's possible that modifying model.matrix to allow single level factors would then bump up against that check, but at the moment the traceback() from an error generated with data that has a single level factor and no call to factor in the formula still implicates code in model.matrix:
>>
>> You're missing the point: model.matrix has a beef with 1-level factors, not with 2-level factors of which one level happens to be absent, which is what this thread was originally about. It is lm that via model.frame with drop.unused.levels=TRUE converts the latter factors to the former.
>>
>
> I guess I did miss the point. Apologies for being obtuse. I thought that a one level factor would have been "aliased out" when model.matrix "realized" that it was collinear with the intercept. (Further apologies for my projection of cognitive capacities on a machine.) Are you saying it remains desirable that an error be thrown rather than reporting an NA for coefficients and issuing a warning?
>
For the moment I was just analyzing where this came from. Intuitively I'd be leaning in the opposite direction -- dropping factor levels automatically is usually a bad thing.
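
A minimal sketch of the distinction, for anyone who wants to reproduce it. The data frame `d` and the level names "a" and "b" are made up for illustration and are not from the original report; the exact error text may also differ between R versions, so the error is only captured, not matched.

```r
# Hypothetical data: x2 is a 2-level factor, but only level "a" is observed
set.seed(1)
d <- data.frame(y  = rnorm(10),
                x1 = rnorm(10),
                x2 = factor(rep("a", 10), levels = c("a", "b")),
                x3 = rnorm(10))

# model.matrix() on its own is content: x2 still has two levels,
# so contrasts can be set and "x2b" simply becomes a column of zeros
mm <- model.matrix(y ~ x1 + x2 + x3, d)
colnames(mm)

# model.frame() with drop.unused.levels = TRUE (which lm() sets internally)
# turns x2 into a 1-level factor
mf <- model.frame(y ~ x1 + x2 + x3, d, drop.unused.levels = TRUE)
nlevels(mf$x2)

# lm() then hits the contrasts error; capture the message instead of stopping
err <- tryCatch(lm(y ~ x1 + x2 + x3, d),
                error = function(e) conditionMessage(e))
err
```

So the failure only shows up once the unused level has been dropped, which is what lm() does before it ever calls model.matrix().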
--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com