[R] glmmLasso with interactions errors

Sat Jul 16 21:22:41 CEST 2016

> On Jul 16, 2016, at 11:26 AM, Walker Pedersen <wsp at uwm.edu> wrote:
> 
> Hi,
> 
> Thanks for the response.
> 
> The warnings and errors can be reproduced with the data and code I
> included in my first mailing list post. I will provide the full output
> at the end of this post.
> 

I get:

> glm2 <- glmmLasso(Activity~as.factor(Novelty) + as.factor(Valence) +
+ STAIt + as.factor(ROI)
+ + as.factor(Novelty):as.factor(Valence):as.factor(ROI),
+ list(Subject=~1), data = Nov7T, lambda=10)
Error in is.data.frame(data) : object 'Nov7T' not found

If I use KNov in place of the missing Nov7T object and build the three-way interaction with the `interaction` function, I get:

glm2 <- glmmLasso(Activity~as.factor(Novelty) + as.factor(Valence) +
STAIt + as.factor(ROI) + interaction(Novelty,Valence,ROI),
list(Subject=~1), data = KNov, lambda=10)
summary(glm2)
Call:
glmmLasso(fix = Activity ~ as.factor(Novelty) + as.factor(Valence) + 
    STAIt + as.factor(ROI) + interaction(Novelty, Valence, ROI), 
    rnd = list(Subject = ~1), data = KNov, lambda = 10)

Fixed Effects:

Coefficients:
                                           Estimate StdErr z.value p.value
(Intercept)                              1.4584e-01     NA      NA      NA
as.factor(Novelty)R                     -6.3017e-02     NA      NA      NA
as.factor(Valence)N                     -3.8093e-02     NA      NA      NA
STAIt                                   -1.7146e-03     NA      NA      NA
as.factor(ROI)B                         -1.3502e-02     NA      NA      NA
as.factor(ROI)H                          1.1962e-03     NA      NA      NA
interaction(Novelty, Valence, ROI)R.E.A  0.0000e+00     NA      NA      NA
interaction(Novelty, Valence, ROI)N.N.A -1.9828e-02     NA      NA      NA
interaction(Novelty, Valence, ROI)R.N.A  2.5937e-19     NA      NA      NA
interaction(Novelty, Valence, ROI)N.E.B  0.0000e+00     NA      NA      NA
interaction(Novelty, Valence, ROI)R.E.B  0.0000e+00     NA      NA      NA
interaction(Novelty, Valence, ROI)N.N.B  0.0000e+00     NA      NA      NA
interaction(Novelty, Valence, ROI)R.N.B  0.0000e+00     NA      NA      NA
interaction(Novelty, Valence, ROI)N.E.H  0.0000e+00     NA      NA      NA
interaction(Novelty, Valence, ROI)R.E.H -2.1495e-02     NA      NA      NA
interaction(Novelty, Valence, ROI)N.N.H  0.0000e+00     NA      NA      NA
interaction(Novelty, Valence, ROI)R.N.H  0.0000e+00     NA      NA      NA

Random Effects:

StdDev:
          Subject
Subject 0.0644229
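
For what it's worth, here is a minimal sketch of what `interaction` does to the design, using a simulated balanced layout rather than your file (the data frame `d` and the `rep` column are made up for illustration). It suggests why there are 11 interaction rows above: the crossed factor has one level per cell, and together with the main effects the unpenalized design would be rank deficient.

```r
# Simulated, balanced stand-in for the design (not the poster's data).
d <- expand.grid(Novelty = c("N", "R"),
                 Valence = c("E", "N"),
                 ROI     = c("A", "B", "H"),
                 rep     = 1:3)          # 3 replicates per cell, 36 rows

# interaction() crosses the factors into one factor with a level per cell.
cell <- interaction(d$Novelty, d$Valence, d$ROI)
nlevels(cell)        # 12 cells: 2 x 2 x 3
head(levels(cell))   # the first factor varies fastest

# Main effects plus the crossed factor: 1 + 1 + 1 + 2 + 11 columns.
X <- model.matrix(~ Novelty + Valence + ROI +
                    interaction(Novelty, Valence, ROI), data = d)
ncol(X)              # 16 columns
qr(X)$rank           # 12, so 4 columns are redundant without a penalty
```

So the crossed factor also absorbs the main effects and two-way terms; the penalty is what lets the fit go through, and the individual coefficients should be read with that in mind.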

It does appear that the author has tried to discourage using formula-mediated interactions, since with a continuous-by-factor interaction I get this error message:

Error in est.glmmLasso.RE(fix = fix, rnd = rnd, data = data, lambda = lambda,  : 
  Usage of '*' not allowed in formula! Please specify the corresponding variables separately.
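
One reading of "specify the corresponding variables separately" is to build the interaction columns yourself and add them to the data frame as ordinary variables. The following is only a sketch under that assumption, on simulated data; the column names (`NovR_STAIt` and so on) are hypothetical, and I have not checked how glmmLasso penalizes or groups columns supplied this way.

```r
# Simulated stand-in for the real data (all values are made up).
set.seed(1)
d <- expand.grid(Subject = factor(1:6),
                 Novelty = c("N", "R"),
                 Valence = c("E", "N"),
                 ROI     = c("A", "B", "H"))
d$STAIt    <- rnorm(6, mean = 40, sd = 8)[as.integer(d$Subject)]  # one trait score per subject
d$Activity <- rnorm(nrow(d))

# Factor-by-continuous interaction: STAIt for Novelty == "R", zero otherwise.
d$NovR_STAIt <- as.numeric(d$Novelty == "R") * d$STAIt

# Three-way factor interaction: take the treatment-coded columns
# that model.matrix() would generate and give them syntactic names.
X <- model.matrix(~ Novelty * Valence * ROI, data = d)
three_way <- X[, grep("^Novelty[^:]*:Valence[^:]*:ROI", colnames(X)), drop = FALSE]
colnames(three_way) <- gsub(":", "_", colnames(three_way))
d <- cbind(d, three_way)

# A fixed-effects formula with no ':' or '*' in it.
fix <- Activity ~ as.factor(Novelty) + as.factor(Valence) + STAIt + as.factor(ROI) +
  NoveltyR_ValenceN_ROIB + NoveltyR_ValenceN_ROIH + NovR_STAIt

# Untested assumption: this could then be passed on as
# glmmLasso(fix, rnd = list(Subject = ~1), data = d, lambda = 10)
```

The two-way columns can be pulled out of `X` the same way if you want the lower-order terms in the model as well.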

Neither of the first two examples on the glmmLasso help page returns standard errors either. Perhaps you should correspond with the package author. Are you familiar with the maintainer() function?
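
In case it is not familiar: `maintainer` is in the utils package and reads the Maintainer field from an installed package's DESCRIPTION file. A small sketch, guarded so that it does nothing harmful if glmmLasso is not installed on the machine where it is run:

```r
# maintainer() is in utils, which is attached in a default R session.
pkg <- "glmmLasso"

if (requireNamespace(pkg, quietly = TRUE)) {
  # Name and e-mail address taken from the package's DESCRIPTION file.
  print(maintainer(pkg))
  # A bug-report URL, if the package declares one (NULL otherwise).
  print(packageDescription(pkg)$BugReports)
} else {
  message("Package '", pkg, "' is not installed here.")
}
```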

David

> By sketchy, I mean having a higher likelihood of resulting in
> overfitting.  By more straightforward, I mean having a less steep
> learning curve for implementation.
> 
>       Thanks for your help!
> 
> 
>> KNov <- read.table("Novelty_abr.txt", header = TRUE)
>> KNov$Subject <- factor(KNov$Subject)
>> glm1 <- glmmLasso(Activity~as.factor(Novelty) + as.factor(Valence) + STAIt + as.factor(ROI)
> + + as.factor(Valence):as.factor(ROI), list(Subject=~1), data = KNov, lambda=10)
>> summary(glm1)
> Call:
> glmmLasso(fix = Activity ~ as.factor(Novelty) + as.factor(Valence) +
>    STAIt + as.factor(ROI) + as.factor(Valence):as.factor(ROI),
>    rnd = list(Subject = ~1), data = KNov, lambda = 10)
> 
> 
> Fixed Effects:
> 
> Coefficients:
>                                       Estimate StdErr z.value p.value
> (Intercept)                          0.14047593     NA      NA      NA
> as.factor(Novelty)R                 -0.06333466     NA      NA      NA
> as.factor(Valence)N                 -0.03537854     NA      NA      NA
> STAIt                               -0.00173351     NA      NA      NA
> as.factor(ROI)B                     -0.00438142     NA      NA      NA
> as.factor(ROI)H                      0.00016285     NA      NA      NA
> as.factor(Valence)N:as.factor(ROI)B -0.00739870     NA      NA      NA
> as.factor(Valence)N:as.factor(ROI)H  0.00000000     NA      NA      NA
> 
> Random Effects:
> 
> StdDev:
>           Subject
> Subject 0.05186835
>> glm2 <- glmmLasso(Activity~as.factor(Novelty) + as.factor(Valence) + STAIt + as.factor(ROI)
> + + as.factor(Novelty):as.factor(Valence):as.factor(ROI),
> list(Subject=~1), data = Nov7T, lambda=10)
> Warning messages:
> 1: In split.default((1:ncol(X))[-inotpen.which], ipen) :
>  data length is not a multiple of split variable
> 2: In lambda_vec * sqrt(block2) :
>  longer object length is not a multiple of shorter object length
> 3: In lambda_vec * sqrt(block2) :
>  longer object length is not a multiple of shorter object length
> 4: In lambda_vec * sqrt(block2) :
>  longer object length is not a multiple of shorter object length
> 5: In lambda_vec * sqrt(block2) :
>  longer object length is not a multiple of shorter object length
> 6: In lambda_vec * sqrt(block2) :
>  longer object length is not a multiple of shorter object length
> 7: In lambda_vec * sqrt(block2) :
>  longer object length is not a multiple of shorter object length
> 8: In lambda_vec * sqrt(block2) :
>  longer object length is not a multiple of shorter object length
> 9: In lambda_vec * sqrt(block2) :
>  longer object length is not a multiple of shorter object length
>> summary(glm2)
> Call:
> glmmLasso(fix = Activity ~ as.factor(Novelty) + as.factor(Valence) +
>    STAIt + as.factor(ROI) + as.factor(Novelty):as.factor(Valence):as.factor(ROI),
>    rnd = list(Subject = ~1), data = Nov7T, lambda = 10)
> 
> 
> Fixed Effects:
> 
> Coefficients:
>                                                              Estimate StdErr z.value p.value
> (Intercept)                                                -0.0562165     NA      NA      NA
> as.factor(Novelty)R                                        -0.0218362     NA      NA      NA
> as.factor(Valence)N                                        -0.0067723     NA      NA      NA
> STAIt                                                       0.0028832     NA      NA      NA
> as.factor(ROI)BNST                                         -0.0457882     NA      NA      NA
> as.factor(ROI)Hip                                          -0.0430477     NA      NA      NA
> as.factor(Novelty)N:as.factor(Valence)E:as.factor(ROI)Amy   0.0000000     NA      NA      NA
> as.factor(Novelty)R:as.factor(Valence)E:as.factor(ROI)Amy   0.0000000     NA      NA      NA
> as.factor(Novelty)N:as.factor(Valence)N:as.factor(ROI)Amy   0.0164788     NA      NA      NA
> as.factor(Novelty)R:as.factor(Valence)N:as.factor(ROI)Amy   0.0067723     NA      NA      NA
> as.factor(Novelty)N:as.factor(Valence)E:as.factor(ROI)BNST  0.0000000     NA      NA      NA
> as.factor(Novelty)R:as.factor(Valence)E:as.factor(ROI)BNST  0.0000000     NA      NA      NA
> as.factor(Novelty)N:as.factor(Valence)N:as.factor(ROI)BNST  0.0000000     NA      NA      NA
> as.factor(Novelty)R:as.factor(Valence)N:as.factor(ROI)BNST  0.0000000     NA      NA      NA
> as.factor(Novelty)N:as.factor(Valence)E:as.factor(ROI)Hip   0.0000000     NA      NA      NA
> as.factor(Novelty)R:as.factor(Valence)E:as.factor(ROI)Hip   0.0000000     NA      NA      NA
> as.factor(Novelty)N:as.factor(Valence)N:as.factor(ROI)Hip   0.0338616     NA      NA      NA
> as.factor(Novelty)R:as.factor(Valence)N:as.factor(ROI)Hip   0.0000000     NA      NA      NA
> 
> Random Effects:
> 
> StdDev:
>           Subject
> Subject 0.09132963
>> glm3 <- glmmLasso(Activity~as.factor(Novelty) + as.factor(Valence) + STAIt + as.factor(ROI)
> + + as.factor(Valence):as.factor(ROI) + as.factor(Novelty):STAIt,
> list(Subject=~1), data = Nov7T, lambda=10)
> Error in rep(control$index[i], length.fac) : invalid 'times' argument
>> summary(glm3)
> Error in summary(glm3) : object 'glm3' not found
> 
> On Sat, Jul 16, 2016 at 12:51 PM, David Winsemius
> <dwinsemius at comcast.net> wrote:
>> 
>>> On Jul 16, 2016, at 9:29 AM, Walker Pedersen <wsp at uwm.edu> wrote:
>>> 
>>> Thank you for the input Brian and Ben.
>>> 
>>> It is odd how it seems to handle a two way interaction fine (as long
>>> as the continuous variable is not in the mix), but not a 3-way.
>> 
>> You should post code and data to demonstrate what is "not being handled".
>>> 
>>> In any case would anyone be able to give me a rundown of how I would
>>> create a matrix/dummy variable for these interactions to input into
>>> glmmLASSO?
>> 
>> Your first question on this dataset, posted June 17 to CrossValidated.com, was closed because no reproducible example was offered. You then posted two further questions on StackOverflow and got guesses as to the solutions because you again posted no reproducible examples. One of those questions was given in this thread as a possible solution. In the other one you did post some output that gave clues as to the arrangement of your data and suggested that the categorical data was relatively sparse:
>> 
>> http://stackoverflow.com/questions/38132830/getting-p-values-for-all-included-parameters-using-glmmlasso
>> 
>> Now you are getting advice that is similarly just speculation due to lack of code, data, and output. You are unlikely to get further advice that addresses whatever problems you have vaguely described unless you post examples of code that is failing along with either a) the real data or b) R code that creates a simulation with covariate features resembling your data.
>>> 
>>> Alternatively, is there a method for paring down a model that is a bit
>>> less sketchy than simple backfitting, that you would expect to be more
>>> straightforward software-wise?
>> 
>> That appears incredibly vague. Exactly what is "sketchy"? And what would be "more straightforward"?
>> 
>> --
>> David.
>> 
>> 
>>> Thanks!
>>> 
>>> Walker
>>> 
>>> UW-MKE
>>> 
>>> On Thu, Jul 14, 2016 at 10:08 AM, Cade, Brian <cadeb at usgs.gov> wrote:
>>>> It has never been obvious to me that the lasso approach can handle
>>>> interactions among predictor variables well at all.  I'd be curious to see
>>>> what others think and what you learn.
>>>> 
>>>> Brian
>>>> 
>>>> Brian S. Cade, PhD
>>>> 
>>>> U. S. Geological Survey
>>>> Fort Collins Science Center
>>>> 2150 Centre Ave., Bldg. C
>>>> Fort Collins, CO  80526-8818
>>>> 
>>>> email:  cadeb at usgs.gov
>>>> tel:  970 226-9326
>>>> 
>>>> 
>>>> On Wed, Jul 13, 2016 at 2:20 PM, Walker Pedersen <wsp at uwm.edu> wrote:
>>>>> 
>>>>> Hi Everyone,
>>>>> 
>>>>> I am having trouble running glmmLasso.
>>>>> 
>>>>> An abbreviated version of my dataset is here:
>>>>> 
>>>>> https://drive.google.com/open?id=0B_LliPDGUoZbVVFQS2VOV3hGN3c
>>>>> 
>>>>> Activity is a measure of brain activity, Novelty and Valence are
>>>>> categorical variables coding the type of stimulus used to elicit the
>>>>> response, ROI is a categorical variable coding three regions of the
>>>>> brain that we have sampled this activity from, and STAIt is a
>>>>> continuous measure representing degree of a specific personality trait
>>>>> of the subjects. Subject is an ID number for the individuals the data
>>>>> was sampled from.
>>>>> 
>>>>> Before glmmLasso I am running:
>>>>> 
>>>>> KNov$Subject <- factor(KNov$Subject)
>>>>> 
>>>>> to ensure the subject ID is not treated as a continuous variable.
>>>>> 
>>>>> If I run:
>>>>> 
>>>>> glm1 <- glmmLasso(Activity~as.factor(Novelty) + as.factor(Valence) +
>>>>> STAIt + as.factor(ROI)
>>>>> + as.factor(Valence):as.factor(ROI), list(Subject=~1), data = KNov,
>>>>> lambda=10)
>>>>> summary(glm1)
>>>>> 
>>>>> I don't get any warning messages, but the output contains b estimates
>>>>> only, no SE or p-values.
>>>>> 
>>>>> If I try to include a 3-way interaction, such as:
>>>>> 
>>>>> glm2 <- glmmLasso(Activity~as.factor(Novelty) + as.factor(Valence) +
>>>>> STAIt + as.factor(ROI)
>>>>> + as.factor(Novelty):as.factor(Valence):as.factor(ROI),
>>>>> list(Subject=~1), data = Nov7T, lambda=10)
>>>>> summary(glm2)
>>>>> 
>>>>> I get the warnings:
>>>>> 
>>>>> Warning messages:
>>>>> 1: In split.default((1:ncol(X))[-inotpen.which], ipen) :
>>>>> data length is not a multiple of split variable
>>>>> 2: In lambda_vec * sqrt(block2) :
>>>>> longer object length is not a multiple of shorter object length
>>>>> 
>>>>> And again, I do get parameter estimates, and no SE or p-values.
>>>>> 
>>>>> If I include my continuous variable in any interaction, such as:
>>>>> 
>>>>> glm3 <- glmmLasso(Activity~as.factor(Novelty) + as.factor(Valence) +
>>>>> STAIt + as.factor(ROI)
>>>>> + as.factor(Valence):as.factor(ROI) + as.factor(Novelty):STAIt,
>>>>> list(Subject=~1), data = Nov7T, lambda=10)
>>>>> summary(glm3)
>>>>> 
>>>>> I get the error message:
>>>>> 
>>>>> Error in rep(control$index[i], length.fac) : invalid 'times' argument
>>>>> 
>>>>> and no output.
>>>>> 
>>>>> If anyone has any input as to (1) why I am not getting SE or p-values
>>>>> in my outputs, (2) the meaning of the warnings I get when I include a
>>>>> 3-way interaction, and, if they are something to worry about, how to fix
>>>>> them, and (3) how to fix the error message I get when I include my
>>>>> continuous variable in an interaction, I would be very appreciative.
>>>>> 
>>>>> Thanks!
>>>>> 
>>>>> Walker
>>>>> 
>>>>> ______________________________________________
>>>>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>> PLEASE do read the posting guide
>>>>> http://www.R-project.org/posting-guide.html
>>>>> and provide commented, minimal, self-contained, reproducible code.
>>>> 
>>>> 
>>> 
>> 
>> David Winsemius
>> Alameda, CA, USA
>> 
> 

David Winsemius
Alameda, CA, USA