[R] how to get "lsmeans"?

Chuck Cleland ccleland at optonline.net
Wed Mar 21 14:28:55 CET 2007


Liaw, Andy wrote:
> I verified the result from the following with output from JMP 6 on the
> same data (don't have SAS: don't need it):
> 
> set.seed(631)
> n <- 100
> dat <- data.frame(y=rnorm(n), A=factor(sample(1:2, n, replace=TRUE)),
>                   B=factor(sample(1:2, n, replace=TRUE)),
>                   C=factor(sample(1:2, n, replace=TRUE)),
>                   d=rnorm(n))
> fm <- lm(y ~ A + B + C + d, dat)
> ## Form a data frame of points to predict: all combinations of the
> ## three factors and the mean of the covariate.
> p <- data.frame(expand.grid(A=1:2, B=1:2, C=1:2))
> p[] <- lapply(p, factor)
> p <- cbind(p, d=mean(dat$d))
> p <- cbind(yhat=predict(fm, p), p)
> ## lsmeans for the three factors:
> with(p, tapply(yhat, A, mean))
> with(p, tapply(yhat, B, mean))
> with(p, tapply(yhat, C, mean))

  Using Andy's example data, these are the LSMEANS and intervals I get
from SAS:

A        y LSMEAN      95% Confidence Limits
1       -0.071847       -0.387507     0.243813
2       -0.029621       -0.342358     0.283117

B        y LSMEAN      95% Confidence Limits
1       -0.104859       -0.397935     0.188216
2        0.003391       -0.333476     0.340258

C        y LSMEAN      95% Confidence Limits
1       -0.084679       -0.392343     0.222986
2       -0.016789       -0.336374     0.302795

  One way of reproducing the LSMEANS and intervals from SAS using
predict() seems to be the following:

> dat.lm <- lm(y ~ A + as.numeric(B) + as.numeric(C) + d, data = dat)
> newdat <- expand.grid(A=factor(c(1,2)),B=1.5,C=1.5,d=mean(dat$d))
> cbind(newdat, predict(dat.lm, newdat, interval="confidence"))
  A   B   C          d         fit        lwr       upr
1 1 1.5 1.5 0.09838595 -0.07184709 -0.3875070 0.2438128
2 2 1.5 1.5 0.09838595 -0.02962086 -0.3423582 0.2831165

  However, another possibility seems to be:

> dat.lm <- lm(y ~ A + as.numeric(B) + as.numeric(C) + d, data = dat)
> newdat <-
expand.grid(A=factor(c(1,2)),B=mean(as.numeric(dat$B)),C=mean(as.numeric(dat$C)),d=mean(dat$d))
> cbind(newdat, predict(dat.lm, newdat, interval="confidence"))
  A    B    C          d         fit        lwr       upr
1 1 1.43 1.48 0.09838595 -0.08078243 -0.3964661 0.2349012
2 2 1.43 1.48 0.09838595 -0.03855619 -0.3479589 0.2708465

  The predictions directly above match what effect() in the effects
package by John Fox returns:

library(effects)

> effect("A", fm, xlevels=list(d = mean(dat$D)))

 A effect
A
          1           2
-0.08078243 -0.03855619

  But for some reason the predict() and effect() intervals are a little
different:

> effect("A", fm, xlevels=list(d = mean(dat$D)))$lower
          [,1]
101 -0.3924451
102 -0.3440179

> effect("A", fm, xlevels=list(d = mean(dat$D)))$upper
         [,1]
101 0.2308802
102 0.2669055

  I would be interested in any comments on these different approaches
and on the difference in intervals returned by predict() and effect().

hope this helps,

Chuck

> Andy 
> 
> From: Xingwang Ye
>> Dear all, 
>>       
>>     I search the mail list about this topic and learn that no 
>> simple way is available to get "lsmeans" in R as in SAS.
>>     Dr.John Fox and Dr.Frank E Harrell have given very useful 
>> information about "lsmeans" topic.    
>>     Dr. Frank E Harrell suggests not to think about lsmeans, 
>> just to think about what predicted values wanted
>>     and to use the predict function. However, after reading 
>> the R help file for a whole day, I am still unclear how to do it.
>>     Could some one give me a hand? 
>>  
>> for example:
>>   
>> A,B and C are binomial variables(factors); d is a continuous 
>> variable ; The response variable Y is  a continuous variable too.  
>>
>> To get lsmeans of Y according to A,B and C, respectively, in 
>> SAS, I tried proc glm data=a;  class A B C;  model Y=A B C d; 
>>  lsmeans A B C/cl; run;  
>>
>> In R, I tried this:  
>>  library(Design)
>>  ddist<-datadist(a)
>>  options(datadist="ddist")
>>  f<-ols(Y~A+B+C+D,data=a,x=TRUE,y=TRUE,se.fit=TRUE)  
>>
>> then how to get the "lsmeans" for A, B, and C, respectively 
>> with predict function?
>>
>>  
>>
>> Best wishes
>> yours, sincerely 
>> Xingwang Ye    
>> PhD candidate     
>> Research Group of Nutrition Related Cancers and Other Chronic 
>> Diseases      
>> Institute for Nutritional Sciences,  
>> Shanghai Institutes of Biological Sciences,     
>> Chinese Academy of Sciences     
>> P.O.Box 32     
>> 294 Taiyuan Road     
>> Shanghai 200031     
>> P.R.CHINA
>>
>>
> 
> 
> ------------------------------------------------------------------------------
> Notice:  This e-mail message, together with any attachments,...{{dropped}}
> 
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894



More information about the R-help mailing list