[R] A question on modeling brain growth using GAM

Thu Apr 6 20:16:09 CEST 2017

Hi, Simon

Thank you for your explanation! I followed the instructions and
successfully get the predicted values with both fixed and random effects
incorporated: pred.new=predict.gam(gamm1$gam,newdata,type="response").

Also, what I meant to say was "plot(gamm1$gam, pages=1)" for left and right
figures. I didn't attach any figures.

Thank you very much for the help!
L

On Thu, Apr 6, 2017 at 10:44 AM, Simon Wood <simon.wood at bath.edu> wrote:

>
> gamObj=gam(brainVolume~ s(correctedAge) +  s(subjIndexF, bs="re") +
> s(subjIndexF, correctedAge, bs="re"), method="REML", data=mydata), where
> subjIndexF is a factor for each subject. I was thrown an error saying "more
> coefficients than data".
>
> --- I'm not sure exactly  how many scans and subjects you have. The above
> model will have 10 + 2*(number of subjects ) coeffs. If that is more than
> the number of scans then gam will not handle it. Depending on the numbers
> involved you could reduce the k parameter to s(correctedAge), to fix the
> problem. (e.g. with 31 subjects and 70 scans s(correctedAge=8) should work
> ).
>
> However, when I tried to model similar (please correct me if they are not
> similar) things using GAMM based on description in
> ?factor.smooth.interaction:
>  gamm1=gamm(BrainVolume~ s(correctedAge) + s(correctedAge, subjIndexF,
> bs="fs", k=5), data=mydata)
>
> --- It's not the same model. You now have a random smooth curve per
> subject. You can add random effects in gamm using the list form of the
> syntax for specifying random effects in lme. see ?gamm. Random intercepts
> and slopes can be added that way.
>
> The model ran.  When I plotted the data using plot(gamm1), I got two
> figures: the left one is the group mean and 95%CI, which I assume is the
> results by gamm1$gam model. The right one shows 30 lines (the number of
> subjects in my data) fluctuating around 0, which I assume is the random
> effects (gamm1$lme) modeled within each subject that can be added onto the
> group mean for individual curves. Is my understanding correct? If so, how
> can I extract these curves from gamm1$lme?
>
> --- I would extract the fitted curves using predict(gamm1$gam,...,
> type="terms") supplying the factor levels and correctedAges at which you
> want to evaluate the curves.
>
> Many thanks!
> L
>
>
>
> On Thu, Apr 6, 2017 at 8:22 AM, Simon Wood <simon.wood at bath.edu> wrote:
>
>> If 'subjIndexF' is a factor for subject, then s(subjIndexF, bs="re") will
>> produce a random effect for subject. i.e. each subject will be given its
>> own random intercept term, which is a way that repeated measures data like
>> this are often handled.
>>
>> The reason for the s(subjIndexF, bs="re") syntax is that smooths can be
>> viewed as Gaussian random effects, so simple Gaussian random effects can
>> also be viewed as (0-dimensional) smooths. In general s(x,z,w,bs="re") just
>> appends the columns of model.matrix(~x:z:w-1) to the gam model matrix, and
>> treats the associated coefficients as i.i.d. Gaussian random effects with a
>> common variance (to be estimated). In principle this works with any number
>> of arguments to s(...,bs="re").
>>
>> See ?random.effects (and its linked help files) in mgcv for more.
>>
>> There are mechanisms for allowing random smooth curves for each subject,
>> (e.g. ?factor.smooth.interaction), but I would only use these if simpler
>> approaches really aren't adequate here.
>>
>> best,
>> Simon
>>
>>
>>
>>
>> On 30/03/17 17:06, David Winsemius wrote:
>>
>>> On Mar 30, 2017, at 6:56 AM, Leon Lee <bhamlion78 at gmail.com> wrote:
>>>>
>>>> David
>>>>
>>>> Thank you for your reply. I apologize if I posted in the wrong forum,
>>>> as I really couldn't decide which forum is the best place for my question
>>>> and I saw similar questions asked before in this forum.
>>>>
>>>> I agree that a sample of ~30 subjects (70 scans in total), the model
>>>> can be too complicated. Based on that, I did the following:
>>>> (1) ignored the gender effect, as we have less females than males.
>>>> (2) corrected chronological age based on their gestational age, that
>>>> is, we subtracted an infant's chronological age by 2 weeks, if the infant's
>>>> gestational age is 38 weeks instead of 40weeks.
>>>>
>>>> When I ran the model with corrected age, gestational age and their
>>>> interactions modeled, I found the main effect of gestational age and the
>>>> interaction between the two are gone.
>>>>
>>>> So, my final model will look something like this:
>>>> gamObj=gam(brainVolume~ s(correctedAge) +  s(subjIndexF, bs="re"),
>>>> method="REML", data=mydata)
>>>>
>>>> Does this look more reasonable?
>>>>
>>> I'm still having difficulty understand how a "smoothing" function would
>>> be used to handle repeated measures without some sort of "group-within"
>>> indicator.
>>>
>>> I would have imagined (and this is because I have no experience with
>>> using this package for repeated measures) something along the lines of:
>>>
>>>   ...+s(correctedAg|subjIndexF)
>>>
>>> I see this statement in the docs:
>>>
>>>
>>> smooth.construct.re.smooth.spec {mgcv}
>>>
>>> "gam can deal with simple independent random effects, by exploiting the
>>> link between smooths and random effects to treat random effects as smooths.
>>> s(x,bs="re") implements this."
>>>
>>> But I don't see that as applying to the dependency between individuals
>>> measured repeatedly. I find no examples of repeated measures problems being
>>> solve by gam(). There is also a note on the same page:
>>>
>>> "Note that smooth ids are not supported for random effect terms. Unlike
>>> most smooth terms, side conditions are never applied to random effect terms
>>> in the event of nesting (since they are identifiable without side
>>> conditions)."
>>>
>>> When I do a search on "using gam mgcv formula mixed effects" I am
>>> referred to packages 'gamm' and 'gamm4' produced by the same author (Simon
>>> Wood) as pkg 'mgcv', or to package `nlme`.
>>>
>>>
>>> Yes, I am relatively new to the mixed model. We originally applied
>>>> functional data analysis (PACE) on the data, but want to see the results
>>>> using a different approach. Also, I couldn't find the Mixed Models list, do
>>>> you mind sending me a link?
>>>>
>>> This is a link to the main mailing lists page:
>>>
>>> https://www.r-project.org/mail.html
>>>
>>> Found with a search on Google with "R mailing lists"
>>>
>>>
>>> Thank you!
>>>> Longchuan
>>>>
>>>>
>>>> On Tue, Mar 28, 2017 at 4:28 PM, David Winsemius <
>>>> dwinsemius at comcast.net> wrote:
>>>>
>>>> On Mar 28, 2017, at 9:32 AM, Leon Lee <bhamlion78 at gmail.com> wrote:
>>>>>
>>>>> Hi, R experts
>>>>>
>>>>> I am new to R & GAM toolbox and would like to get inputs from you all
>>>>> on my
>>>>> models. The question I have is as follows:
>>>>> I have 30 subjects with each subject being scanned from one to three
>>>>> times
>>>>> in the first year of life. The brain volume from each scan was
>>>>> measured.
>>>>> The scan time was randomly distributed from birth to 1 year.
>>>>> Each subject has different gestational age ranging from 38 to 41 weeks
>>>>> Each subject has chronological age from birth to 1 year old
>>>>> Each subject has gender category.
>>>>> Now, I want to look at how predictors, such as subject's chronological
>>>>> age,
>>>>> gestational age and gender will explain the changes in brain volume. I
>>>>> also
>>>>> want to include interactions between gender and age, gestational and
>>>>> chronological age. Random effects are also included in the model to
>>>>> account
>>>>> for subject variability. My model looks like the follows:
>>>>>
>>>>> gam=gam(brainVolume~ s(age) + ti(age, gestationalAge) + gestationalAge
>>>>> +
>>>>> sex + s(age, by=sex) +  s(subjIndexF, bs="re"), method="REML",
>>>>> data=mydata)
>>>>>
>>>>> Are there any obvious mistakes in the model? Any suggestions will be
>>>>> greatly appreciated!
>>>>>
>>>> I'm not seeing mistakes in the syntax but I would question whether 30
>>>> subjects is sufficient to adequately support estimates in a a model of this
>>>> complexity. I would also think that the 's(age)' and 'sex' terms would get
>>>> aliased out in a model with "+ s(age, by=sex)". Most R regression functions
>>>> handle removal of over-parametrization automatically.
>>>>
>>>> You also have a variable number of measurements per subject. I am
>>>> unable to comment on the effort to account for the implicit and variably
>>>> measured correlation and auto-correlation of values within subjects using a
>>>> "smooth" on subjIndexF, since that is not an approach I was familiar with.
>>>> But I am getting concerned whether you are also new to statistical modeling
>>>> in addition to your use of R and GAM being "new to you"?
>>>>
>>>> (Perhaps Simon or one of the mixed-effects experts can correct the gaps
>>>> in my understanding of how to model repeated measures in the context of
>>>> small numbers of subjects and irregular emasurements.)
>>>>
>>>> Please read the Posting Guide and the pages of candidate mailing lists.
>>>> Rhelp is not really the place to go when you need statistical advice. I'm
>>>> not sure if this is really in the center of concerns that get discussed on
>>>> the Mixed Models list, but to my eyes it would be a better fit there.
>>>>
>>>> --
>>>> David.
>>>>
>>>>> L
>>>>>
>>>>>        [[alternative HTML version deleted]]
>>>>>
>>>>> ______________________________________________
>>>>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>> PLEASE do read the posting guide http://www.R-project.org/posti
>>>>> ng-guide.html
>>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>>
>>>> David Winsemius
>>>> Alameda, CA, USA
>>>>
>>>>
>>>> David Winsemius
>>> Alameda, CA, USA
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posti
>>> ng-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>
>>
>> --
>> Simon Wood, School of Mathematics, University of Bristol BS8 1TW UK
>> +44 (0)117 33 18273 <%2B44%20%280%29117%2033%2018273>
>> http://www.maths.bris.ac.uk/~sw15190
>>
>>
>
>
> --
> Simon Wood, School of Mathematics, University of Bristol BS8 1TW UK
> +44 (0)117 33 18273     http://www.maths.bris.ac.uk/~sw15190
>
>

	[[alternative HTML version deleted]]