[R] Help with factor levels and reference level

David Winsemius dwinsemius at comcast.net
Sat Jun 7 05:35:55 CEST 2014


On Jun 6, 2014, at 11:16 AM, Nwinters wrote:

> I have a variable coded in Stata as follows:
> gen sat_pm25cat_=.
> replace sat_pm25cat_= 1 if (sat_pm25>=4 & sat_pm25<=7.1 & sat_pm25!=.)
> replace sat_pm25cat_= 2 if (sat_pm25>=7.1 & sat_pm25<=10)
> replace sat_pm25cat_= 3 if (sat_pm25>=10.1 & sat_pm25<=11.3)
> replace sat_pm25cat_= 4 if (sat_pm25>=11.4 & sat_pm25<=12.1)
> replace sat_pm25cat_= 5 if (sat_pm25>=12.2 & sat_pm25<=17.1)

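Stata runs the replace statements in order, so where the definitions overlap the later one wins. (A value of exactly 7.1 would match both of the first two conditions, so on its face it is ambiguously defined.) I suspect you can duplicate the intended effect with:

sat_pm25cat_ <- findInterval(sat_pm25, c(4, 7.1, 10, 11.4, 12.2, 17.1), rightmost.closed = TRUE)

(rightmost.closed = TRUE keeps a value of exactly 17.1 in the fifth interval rather than pushing it into a sixth.)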
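To see how findInterval() treats values on or between the Stata cut points, a small check along these lines may help. The sample values in x are made up for illustration and are not from the original data. Note that the Stata code puts 10 in category 2 and leaves values strictly between 10 and 10.1 unassigned, whereas findInterval() puts both in bin 3, so the match is approximate.

```r
# Hypothetical test values chosen to sit on or near the boundaries
x <- c(3.5, 4, 7.1, 10, 10.05, 11.35, 12.15, 17.1, 18)
breaks <- c(4, 7.1, 10, 11.4, 12.2, 17.1)

# Default: intervals are closed on the left, open on the right
default_bins <- findInterval(x, breaks)

# rightmost.closed = TRUE also includes the last break in the last interval
closed_bins <- findInterval(x, breaks, rightmost.closed = TRUE)

# Side-by-side view; 0 means "below the first break", 6 means "above the last"
print(data.frame(x, default_bins, closed_bins))
```

Values below 4 come back as 0 and values above 17.1 as 6, so those need handling before the result is used as an index.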


> 
> gen satpm25catR= "A" if sat_pm25cat_==1
> replace satpm25catR= "B" if sat_pm25cat_==2
> replace satpm25catR= "C" if sat_pm25cat_==3
> replace satpm25catR= "D" if sat_pm25cat_==4
> replace satpm25catR= "E" if sat_pm25cat_==5


satpm25catR <- factor( LETTERS[1:5][ sat_pm25cat_ ] )
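One caveat with indexing LETTERS by the bin number: a bin of 0 (a value below the first break) is silently dropped by R's indexing, which shortens the result. A possible alternative, sketched here with made-up codes rather than the real data, is to pass the levels and labels to factor() directly so out-of-range codes become NA and the length is preserved.

```r
# Hypothetical bin codes, including 0 (below range) and 6 (above range)
codes <- c(0L, 1L, 2L, 3L, 4L, 5L, 6L)

# Indexing approach: the 0 is dropped, so the result is one element short
by_index <- LETTERS[1:5][codes]

# Label approach: codes outside 1:5 become NA, length stays the same
by_label <- factor(codes, levels = 1:5, labels = LETTERS[1:5])

print(length(by_index))
print(length(by_label))
print(by_label)
```

If every observation is known to fall inside the breaks, the two approaches give the same labels.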


> 
> My model for R is:
> 
> glm.PM25linB <- glm(leuk ~ satpm25catR + sex + ageR, data=leuk,
>     family=binomial, epsilon=1e-15, maxit=1000)
> 
> In the summary, satpm25catR is being reported as all levels:
> 
> <http://r.789695.n4.nabble.com/file/n4691823/Screen_Shot_2014-06-06_at_2.png> 
> 
> What I want is to make "A" the reference level. How do I do this?

It would be the reference level by default: factor() sorts the levels lexically, and the default treatment contrasts use the first level as the baseline.
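A quick way to confirm which level is the baseline, and to change it if needed, is sketched below on a toy factor (the values are invented for illustration). levels() shows the order, the columns of contrasts() show which levels get their own coefficient, and relevel() moves a chosen level to the front.

```r
# Toy factor, deliberately created out of alphabetical order
f <- factor(c("C", "A", "E", "B", "D", "A"))

# Levels are sorted, so "A" comes first and acts as the reference
print(levels(f))

# Under the default treatment contrasts, every level except the first gets a column
print(colnames(contrasts(f)))

# relevel() makes another level the reference, here "C"
f_refC <- relevel(f, ref = "C")
print(levels(f_refC))
```

In a model summary the reference level does not get its own row, so with "A" as baseline only the B to E coefficients would be listed.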

-- 
David Winsemius
Alameda, CA, USA


