[R] repeated measure with quantitative independent variable

Fox, John jfox at mcmaster.ca
Mon Dec 14 20:37:39 CET 2015


Dear Cristiano,

> -----Original Message-----
> From: Cristiano Alessandro [mailto:cri.alessandro at gmail.com]
> Sent: Monday, December 14, 2015 2:11 PM
> To: Fox, John
> Cc: r-help at r-project.org
> Subject: Re: [R] repeated measure with quantitative independent variable
> 
> Dear John,
> 
> thanks for your reply! The reason why I did not want to factorize the
> within-subjects variable was to avoid increasing the Df of the model
> from 1 (continuous variable) to k-1 (where k is the number of levels of
> the factors). I am now confused, because you have factorized the
> variable (indeed using "factor"), but the Df of myfactor_nc seems to be
> 1. Could you explain that?

I defined only one (linear) contrast for the factor -- but I made a mistake in how I did it, see below.

> 
> Comparing the results obtained with the two methods I seem to get
> completely different results:
> 
> * aov()*
> 
> dv <- c(1,3,4,2,2,3,2,5,6,3,4,4,3,5,6);
> subject <-
> factor(c("s1","s1","s1","s2","s2","s2","s3","s3","s3","s4","s4","s4","s5
> ","s5","s5"));
> myfactor_nc <- c(1,2,3,1,2,3,1,2,3,1,2,3,1,2,3)
> mydata_nc <- data.frame(dv, subject, myfactor_nc)
> 
> am1_nc <- aov(dv ~ myfactor_nc + Error(subject/myfactor_nc),
> data=mydata_nc)
> summary(am1_nc)
> 
> Error: subject
>            Df Sum Sq Mean Sq F value Pr(>F)
> Residuals  4   12.4     3.1
> 
> Error: subject:myfactor_nc
>              Df Sum Sq Mean Sq F value Pr(>F)
> myfactor_nc  1   14.4    14.4      16 0.0161 *
> Residuals    4    3.6     0.9
> ---
> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
> 
> Error: Within
>            Df Sum Sq Mean Sq F value Pr(>F)
> Residuals  5  1.333  0.2667
> 
> 
> *Anova()*
> 
> dvm <- with(mydata_nc, cbind(dv[myfactor_nc==1],dv[myfactor_nc==2],
> dv[myfactor_nc==3]))
> 
> mlm1 <- lm(dvm ~ 1)
> myfactor_nc <- factor(1:3)
> contrasts(myfactor_nc) <- matrix(-1:1, ncol=1)
> idata <- data.frame(myfactor_nc)
> Anova(mlm1, idata=idata, idesign=~myfactor_nc)
> Note: model has only an intercept; equivalent type-III tests
> substituted.
> 
> Type III Repeated Measures MANOVA Tests: Pillai test statistic
>              Df test stat approx F num Df den Df   Pr(>F)
> (Intercept)  1   0.93790   60.409      1      4 0.001477 **
> myfactor_nc  1   0.83478    7.579      2      3 0.067156 .
> ---
> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
> 
> Why is that?

There are two differences: (1) The repeated-measures analysis reported by default by Anova() is a MANOVA; if you want the univariate analysis comparable to aov(), you have to ask for it. (2) As I mentioned, I made a mistake in defining the contrasts for the factor; what I did implicitly included the quadratic component, which wasn't my intention. 

Here's the correct version, which, as you can see, produces the same result as aov():

> contrasts(myfactor_nc, 1) <- matrix(-1:1, ncol=1)
> contrasts(myfactor_nc)
  [,1]
1   -1
2    0
3    1

> idata <- data.frame(myfactor_nc)

> summary(Anova(mlm1, idata=idata, idesign=~myfactor_nc), multivariate=FALSE)
Note: model has only an intercept; equivalent type-III tests substituted.

Univariate Type III Repeated-Measures ANOVA Assuming Sphericity

                SS num Df Error SS den Df      F   Pr(>F)   
(Intercept) 187.27      1     12.4      4 60.409 0.001477 **
myfactor_nc  14.40      1      3.6      4 16.000 0.016130 * 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

> summary(am1_nc) # your result using aov()

Error: subject
          Df Sum Sq Mean Sq F value Pr(>F)
Residuals  4   12.4     3.1               

Error: subject:myfactor_nc
            Df Sum Sq Mean Sq F value Pr(>F)  
myfactor_nc  1   14.4    14.4      16 0.0161 *
Residuals    4    3.6     0.9                 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Error: Within
          Df Sum Sq Mean Sq F value Pr(>F)
Residuals  5  1.333  0.2667     

My apologies for the confusion about the contrasts.

John
          
> 
> Thanks a lot
> Cristiano
> 
> 
> On 12/14/2015 05:25 PM, Fox, John wrote:
> > Dear Cristiano,
> >
> > If I understand correctly what you want to do, you should be able to
> use Anova() in the car package (your second question) by treating your
> numeric repeated-measures predictor as a factor and defining a single
> linear contrast for it.
> >
> > Continuing with your toy example:
> >
> >> myfactor_nc <- factor(1:3)
> >> contrasts(myfactor_nc) <- matrix(-1:1, ncol=1)
> >> idata <- data.frame(myfactor_nc)
> >> Anova(mlm1, idata=idata, idesign=~myfactor_nc)
> > Note: model has only an intercept; equivalent type-III tests
> substituted.
> >
> > Type III Repeated Measures MANOVA Tests: Pillai test statistic
> >              Df test stat approx F num Df den Df   Pr(>F)
> > (Intercept)  1   0.93790   60.409      1      4 0.001477 **
> > myfactor_nc  1   0.83478    7.579      2      3 0.067156 .
> > ---
> > Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
> >
> > With just 3 distinct levels, however, you could just make myfactor_nc
> an ordered factor, not defining the contrasts explicitly, and then you'd
> get both linear and quadratic contrasts.
> >
> > I hope this helps,
> >   John
> >
> > -----------------------------------------------
> > John Fox, Professor
> > McMaster University
> > Hamilton, Ontario, Canada
> > http://socserv.socsci.mcmaster.ca/jfox/
> >
> >
> >
> >> -----Original Message-----
> >> From: R-help [mailto:r-help-bounces at r-project.org] On Behalf Of
> >> Cristiano Alessandro
> >> Sent: Monday, December 14, 2015 8:43 AM
> >> To: r-help at r-project.org
> >> Subject: [R] repeated measure with quantitative independent variable
> >>
> >> Hi all,
> >>
> >> I am new to R, and I am trying to set up a repeated measure analysis
> >> with a quantitative (as opposed to factorized/categorical)
> >> within-subjects variable. For a variety of reasons I am not using
> >> linear-mixed models, rather I am trying to fit a General Linear Model
> (I
> >> am aware of assumptions and limitations) to assess whether the value
> of
> >> the within-subjects variable affects statistically significantly the
> >> response variable. I have two questions. To make myself clear I
> propose
> >> the following exemplary dataset (where myfactor_nc is the
> quantitative
> >> within-subjects variable; i.e. each subject performs the experiment
> >> three times -- nc_factor=1,2,3 -- and produces the response in
> variable
> >> dv).
> >>
> >> dv <- c(1,3,4,2,2,3,2,5,6,3,4,4,3,5,6);
> >> subject <-
> >>
> factor(c("s1","s1","s1","s2","s2","s2","s3","s3","s3","s4","s4","s4","s5
> >> ","s5","s5"));
> >> myfactor_nc <- c(1,2,3,1,2,3,1,2,3,1,2,3,1,2,3)
> >> mydata_nc <- data.frame(dv, subject, myfactor_nc)
> >>
> >> *Question 1 (using function aov)*
> >>
> >> Easily done...
> >>
> >> am1_nc <- aov(dv ~ myfactor_nc + Error(subject/myfactor_nc),
> >> data=mydata_nc)
> >> summary(am1_nc)
> >>
> >> Unlike the case when myfactor_nc is categorical, this produces three
> >> error strata: Error: subject, Error: subject:myfactor_nc, Error:
> Within.
> >> I cannot understand the meaning of the latter. How is that computed?
> >>
> >> *Question 2 (using function lm)*
> >>
> >> Now I would like to do the same with the functions lm() and Anova()
> >> (from the car package). What I have done so far (please correct me if
> I
> >> am mistaking) is the following:
> >>
> >> # Unstack the dataset
> >> dvm <- with(mydata_nc, cbind(dv[myfactor_nc==1],dv[myfactor_nc==2],
> >> dv[myfactor_nc==3]))
> >>
> >> #Fit the linear model
> >> mlm1 <- lm(dvm ~ 1)
> >>
> >> (is that model above correct for my design?)
> >>
> >> Now I should use the Anova function, but it seems that it only
> accepts
> >> factors, and not quantitative within-subject variable.
> >>
> >> Any help is highly appreciated!
> >>
> >> Thanks
> >> Cristiano
> >>
> >> ______________________________________________
> >> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide http://www.R-project.org/posting-
> >> guide.html
> >> and provide commented, minimal, self-contained, reproducible code.



More information about the R-help mailing list