[R] How to force regression coeffs for some values in a categorical variable

David Winsemius dwinsemius at comcast.net
Sun Nov 29 23:24:27 CET 2009


I worry whether you understand what is happening when you lump all the  
"unwanted levels" into a reference level. Be sure to watch the  
intercept as you compare models. It will be some sort of adjusted mean  
for whatever cases are in the reference level of that factor and the  
reference levels of any other factor. It will change as you add or  
remove levels from that reference group. Just because you get no coefficient  
does not mean those data points are not affecting the predictions you  
will make from the model. The prediction for cases in those reference  
levels will NOT be 0. Nor will the predicted differences between that  
group and others be zero.
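
Here is a small sketch of what I mean, on made-up data. All of the names 
(d, g, g2, const) and the simulated numbers are assumptions for 
illustration only, not anything from your model. The last part uses 
offset(), which is one possible way to hold a coefficient at a known 
value instead of estimating it; treat it as something to check against 
your own problem rather than a recommendation.

```r
## Toy data: a 4-level factor g with true group means 0, 2, 4, 6.
set.seed(1)
d <- data.frame(g = factor(rep(c("a", "b", "c", "d"), each = 25)),
                x = rnorm(100))
d$y <- unname(c(a = 0, b = 2, c = 4, d = 6)[as.character(d$g)]) +
       0.5 * d$x + rnorm(100, sd = 0.2)

## Lump the "unwanted" levels b and c into the reference level a.
## Assigning a duplicated level name merges the levels.
d$g2 <- d$g
levels(d$g2)[levels(d$g2) %in% c("b", "c")] <- "a"

fit_full <- lm(y ~ x + g,  data = d)   # reference level holds only a
fit_lump <- lm(y ~ x + g2, data = d)   # reference level holds a, b and c

## The intercept moves when the membership of the reference level changes.
coef(fit_full)["(Intercept)"]   # close to the mean of level a alone
coef(fit_lump)["(Intercept)"]   # pulled toward the b and c cases as well

## Cases in the lumped reference level get no coefficient,
## but their prediction is the intercept (at x = 0), not 0.
p_ref <- predict(fit_lump,
                 newdata = data.frame(x = 0,
                                      g2 = factor("a", levels = levels(d$g2))))
p_ref

## Holding one level's effect at a known value const instead of
## estimating it: put that term in offset(). Levels a, b and c then
## share the intercept and level d is shifted by exactly const.
const <- 6
fit_fix <- lm(y ~ x + offset(const * (g == "d")), data = d)
coef(fit_fix)   # only the intercept and x are estimated
```

Comparing the two intercepts shows that the lumped cases still drive the 
fit even though they have no coefficient of their own.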

-- 
David.

On Nov 29, 2009, at 4:09 PM, sr danda wrote:

> My model has several independent and categorical variables. I would  
> not like to subset them as other variables in the data are useful. I  
> just wanted to set some coefficients for some levels in a single  
> category.
> A prototype of it can be something like y + constant *  
> (cat.variable1-Level1) ~ x1 + x2 + cat.variable1(if level != level1)  
> + cat.variable2 +....
>
> Currently, I am modifying data by creating new variables for each  
> level and recoding the original values.
>
> I am wondering if there are any other approaches.
>
> Thanks,
> Danda
>
> On Sun, Nov 29, 2009 at 11:48 AM, David Winsemius <dwinsemius at comcast.net> wrote:
>
> On Nov 29, 2009, at 11:23 AM, sr danda wrote:
>
> Hi,
>
> I am a new R user. I am using it to develop regression models with  
> categorical
> variables.
> Is there a way to force some regression coefficients to be zero for  
> some of
> the values in a categorical variable (with 12 factor levels)?
>
> I am recoding the values to the default value (1st in the order of  
> dummies).
> But I am not sure if this is the correct approach if I want to force
> coefficients to be specific values.
>
> It's a bit unclear from your description what you are trying to do  
> (and it might help to hear the justification for doing it). If you  
> do not want the cases with particular factor levels used in the  
> prediction, then subset them out. If you want a group of factor  
> levels grouped and then used as the reference level, then perhaps:
>
> ?relevel
>
> That will of course result in the intercept term becoming the  
> adjusted mean for those levels, but I'm sure you already knew that.
>
>
>
> Thanks for your help.
>
> Regards,
> Danda
>
> -- 
>
> David Winsemius, MD
> Heritage Laboratories
> West Hartford, CT
>
>

David Winsemius, MD
Heritage Laboratories
West Hartford, CT



