[R] Random intercept model with time-dependent covariates, results different from SAS
Prof Brian Ripley
ripley at stats.ox.ac.uk
Sun Jul 4 10:13:57 CEST 2004
Looking at the significance of a main effect (group) in the presence of an
interaction (time:group) is hard to interpret, and in your case is I think
not even interesting. (The `main effect' probably represents difference
in intercept for the time effect, that is the group difference at the last
time. But see the next para.) Note that the two systems are returning
different denominator dfs.
At this point you have not told us enough. My guess is that you have
complete balance with the same number of subjects in each group. In that
case the `group' effect is in the between-subjects stratum (as defined for
the use of Error in aov, which you could also do), and thus R's 11 df
would be right (rather than 44, without W and Z). Without balance Type
III tests get much harder to interpret and the `group' effect would appear
in two strata and there is no simple F test in the classical theory. So
further guessing, SAS may have failed to detect balance and so used the
wrong test.
The time-dependent covariates muddy the issue more, and I looked mainly at
the analyses without them. Again, a crucial fact is not here: do the
covariates depend on the subjects as well?
The good news is that the results _are_ similar. You do have different
time behaviour in the two groups. So stop worrying about tests of
uninteresting hypotheses and concentrate of summarizing that difference.
On Sun, 4 Jul 2004, Keith Wong wrote:
> Dear list-members
>
> I am new to R and a statistics beginner. I really like the ease with which I can
> extract and manipulate data in R, and would like to use it primarily. I've
> been learning by checking analyses that have already been run in SAS.
That presumes SAS gets them `right'. A body of experienced statisticians
thinks that `Type III' tests are somewhere between `often misleading' and
`evil'. It certainly presents tests that are uninteresting without you
asking for them. It reminds me of the days of yore when we taught using
Minitab and had to explain that the Durbin-Watson test was not relevant to
almost all regressions the students would see, yet appeared on every piece
of output.
The standard output from the lme fit (which you have not shown us) would
be more revealing. I don't think I have ever used anova() on a single lme
fit (I would look at the estimates and use anova on two fits, which is
full (RE)ML likelihood ratio not a Wald approximation).
May I suggest a better way to learn is to understand published examples
using lme, in Pinheiro & Bates (2000) or MASS4 for example.
> In an experiment with Y being a response variable, and group a 2-level
> between-subject factor, and time a 5-level within-subject factor. 2
> time-dependent covariates are also measured (continuous variables W and
> Z). The subject id variable (ID) is unique to each subject, and is not
> duplicated across groups.
>
> I tried to fit a random intercept model in R (after setting options(contrasts =
> c(factor = "contr.SAS", ordered = "contr.poly")) as recommended on this list),
> and making the time, group and id variables factors:
>
> > g2 = lme(Y ~ time + group + time:group + W + Z, random = ~ 1 | id, data =
> datamod)
>
> > anova(g2)
> numDF denDF F-value p-value
> (Intercept) 1 42 5.54545 0.0233
> time 4 42 16.41069 <.0001
> group 1 11 0.83186 0.3813
> W 1 42 0.07555 0.7848
> Z 1 42 45.23577 <.0001
> time:group 4 42 3.04313 0.0273
>
>
> I compared the results using SAS proc mixed:
> proc mixed data = datamod;
> class id time group;
> model Y = time group time*group W Z /s;
> random int / sub = id;
> run;
>
> And get the following anova table for the fixed effects:
>
> Type 3 Tests of Fixed Effects
>
> Num Den
> Effect DF DF F Value Pr > F
>
> time 4 42 2.55 0.0534
> group 1 42 0.54 0.4664
> time*group 4 42 3.04 0.0273
> W 1 42 8.80 0.0050
> Z 1 42 32.52 <.0001
>
> I am perplexed to see that the test for the main effect "time" is quite
> different. Both models seem to be specified equivalently to me, am I doing
> something wrong - particularly with the inclusion of the time-dependent
> covariates W and Z? I have looked at the data in both programmes and they are
> the same. There are no missing observations.
>
> I a simpler model without the time-dependent covariates, and in this case the
> results are similar:
> [R]
> > g1 = lme(Y ~ time + group + time:group, random = ~ 1 | id, data = datamod)
> > anova(g1)
> numDF denDF F-value p-value
> (Intercept) 1 44 3.387117 0.0725
> time 4 44 10.620547 <.0001
> group 1 11 0.508092 0.4908
> time:group 4 44 3.961726 0.0079
>
>
> [SAS]
> proc mixed data = datamod;
> class id time group;
> model Y = time group time*group /s;
> random int / sub = id;
> run;
>
> Type 3 Tests of Fixed Effects
>
> Num Den
> Effect DF DF F Value Pr > F
>
> time 4 44 7.75 <.0001
> group 1 44 0.51 0.4797
> time*group 4 44 3.96 0.0079
>
> Secondly, is there no R equivalent of the "LSMEANS" statement in SAS? Is there a
> work-around?
>
> I would very much appreciate assistance.
>
> Thanks. Keith
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://www.stat.math.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
>
>
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
More information about the R-help
mailing list