[R] regression function for categorical predictor data

(Ted Harding) Ted.Harding at manchester.ac.uk
Thu Sep 9 00:33:37 CEST 2010


On 08-Sep-10 21:11:27, karena wrote:
> Hi, do you guys know what function in R handles the multiple regression
> on categorical predictor data. i.e, 'lm' is used to handle continuous
> predictor data.
> 
> thanks,
> karena

Karena,
lm() also handles categorical data, provided these are presented
as factors. For example:

set.seed(12345)
X <- 0.05*(-20:20)   # Continuous predictor
F <- as.factor(c(rep("A",21),rep("B",20)))
  ## 21 obs at level "A", 20 at level "B"
  ## (note: naming this F masks R's built-in shorthand F for FALSE)
Y <- 0.5*X + c(0.25*rnorm(21),0.25*rnorm(20)+2.0)
  ## Y increases linearly with X (coeff = 0.5)
  ## Y at Level "B" is 2.0 higher than at Level "A"
  ## "Error" term has SD = 0.25
plot(X,Y)

summary(lm(Y ~ X + F))
# Call: lm(formula = Y ~ X + F)
# Residuals:
#      Min       1Q   Median       3Q      Max 
# -0.56511 -0.15807 -0.00034  0.16484  0.44048 
# Coefficients:
#             Estimate Std. Error t value Pr(>|t|)    
# (Intercept)  0.09561    0.08869   1.078    0.288    
# X            0.63621    0.13671   4.654 3.89e-05 ***
# FB           1.93821    0.16181  11.978 1.80e-14 ***
# ---
# Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
# Residual standard error: 0.2589 on 38 degrees of freedom
# Multiple R-squared: 0.965,      Adjusted R-squared: 0.9631 
# F-statistic: 523.4 on 2 and 38 DF,  p-value: < 2.2e-16 

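The reported Estimate for FB gives the change in the level of Y
resulting from a change from "A" to "B" in F, with "A" (the first
level) taken as the baseline. Here it is 1.938, close to the true
difference of 2.0 built into the simulation.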
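
To see what lm() does with a factor, it can help to look at the
design matrix it builds. The following is only a sketch, assuming
R's default treatment contrasts are in effect; the names M, F2, G,
fit1 and fit2 are introduced here purely for illustration. It shows
the 0/1 indicator column behind the FB coefficient, what happens
when the baseline level is switched to "B", and the columns created
for a factor with three levels.

```r
## Rebuild the same data as in the example above
set.seed(12345)
X <- 0.05*(-20:20)
F <- as.factor(c(rep("A",21),rep("B",20)))
Y <- 0.5*X + c(0.25*rnorm(21),0.25*rnorm(20)+2.0)

## Design matrix used by lm(): the factor becomes a 0/1 column "FB"
## (0 for level "A", 1 for level "B") under treatment contrasts
M <- model.matrix(~ X + F)
colnames(M)
table(M[, "FB"], F)

## Changing the baseline level to "B" flips the sign of the coefficient
F2   <- relevel(F, ref = "B")
fit1 <- lm(Y ~ X + F)
fit2 <- lm(Y ~ X + F2)
coef(fit1)["FB"]    # change in Y going from "A" to "B"
coef(fit2)["F2A"]   # change in Y going from "B" to "A"

## A factor with k levels gives k-1 indicator columns
G <- factor(rep(c("A","B","C"), length.out = 41))
colnames(model.matrix(~ G))
```

With more than two levels, each coefficient is again a difference
from the baseline level, so the same reading of the summary applies.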

Hoping this helps,
Ted.

--------------------------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at manchester.ac.uk>
Fax-to-email: +44 (0)870 094 0861
Date: 08-Sep-10                                       Time: 23:33:34
------------------------------ XFMail ------------------------------


