[R] Help with a (g)lmer code

Sun Jun 14 12:30:39 CEST 2020

Hi Saudi,
Apologies for the delay. (also returning to the list)
In your initial code:

model1<- lmer (better ~ gender + age + education + WF + (1 | part),
 data=sub_data)

you have age as a fixed effect, and it has 36 levels. This is
probably causing the error you describe above, so I have changed it to
a random factor. Your response variable is "better", which has the
same levels as WF and is not numeric. This looks like a mistake. I
have written four models with the "hum" and "cul" variables as
response variables, which looks more sensible to me. The levels of the
"education" variable were not ordered correctly. The following code
runs okay, but there is a singular fit for EA_cul. The effects seem to
be of education, except in the EA_cul model. The following may get
you started:

sub_data<-read.csv("sub_data.csv",stringsAsFactors=FALSE)
# get the education factor into the correct order
sub_data$education<-factor(sub_data$education,
 levels=c("seconadry or below","university","postgrad"))
library(lme4)
modelSA_hum<-lmer(SA_hum~gender+education+WF+(1|age),data=sub_data)
modelSA_cul<-lmer(SA_cul~gender+education+WF+(1|age),data=sub_data)
modelEA_hum<-lmer(EA_hum~gender+education+WF+(1|age),data=sub_data)
modelEA_cul<-lmer(EA_cul~gender+education+WF+(1|age),data=sub_data)
summary(modelSA_hum)
summary(modelSA_cul)
summary(modelEA_hum)
summary(modelEA_cul)
# look at the distribution of responses
table(sub_data$SA_hum)
table(sub_data$SA_cul)
table(sub_data$EA_hum)
table(sub_data$EA_cul)
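
A minimal sketch, on made-up data, of two quick checks that relate to the
error messages quoted further down. The data frame "toy" and its values are
invented for illustration; only the column names (part, better, WF, age)
come from this thread, and the real sub_data may well look different.

```r
# Toy stand-in for sub_data (values invented for illustration only);
# column names follow the thread: part, better, WF, age.
toy <- data.frame(
  part   = paste0("p", 1:6),
  better = c("SA", "EA", "EA", "SA", "EA", "EA"),
  WF     = c("SA", "SA", "EA", "EA", "SA", "EA"),
  age    = c(21, 21, 34, 34, 45, 45),
  stringsAsFactors = FALSE
)

# 1. One row per participant: a random intercept for part would have as
#    many levels as there are observations, which matches the quoted
#    "number of levels of each grouping factor must be < number of
#    observations (problems: part)" message.
n_part <- length(unique(toy$part))
n_obs  <- nrow(toy)
part_is_unique <- n_part == n_obs

# 2. A response read in as character is neither numeric nor a factor,
#    which is consistent with the quoted "response must be numeric or
#    factor" message. Converting it to a factor gives a two-level outcome.
response_was_character <- is.character(toy$better)
toy$better <- factor(toy$better, levels = c("SA", "EA"))

# 3. A grouping variable such as age only helps if its values repeat.
rows_per_age <- table(toy$age)

print(part_is_unique)
print(levels(toy$better))
print(rows_per_age)
```

Running the same three checks on the real sub_data (with its own level
names for "better") should show whether either situation applies there.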

Jim

On Sun, Jun 14, 2020 at 9:42 AM Saudi Sadiq <saudisadiq using gmail.com> wrote:
>
> Hi Jim,
> Hope you are safe and sound.
> So sorry to bother you again. I am still waiting for your reply after I have attached the dataset.
> I know you are very busy, but I will appreciate it a lot if you can guide me in how to make the (g)lmer model work, or guide me to something different.
> All the best
>
> ---------- Forwarded message ---------
> From: Saudi Sadiq <saudisadiq using gmail.com>
> Date: Fri, 12 Jun 2020, 4:18 pm
> Subject: Re: [R] Help with a (g)lmer code
> To: Jim Lemon <drjimlemon using gmail.com>
> Cc: r-help mailing list <r-help using r-project.org>
>
>
> Hi Jim,
>
> So many thanks for your reply. I actually made a mistake in presenting the problem; I should have clarified that the 1-10 linear scale questions went as: 10 most humorous/closest to Egyptian culture and 1 the least. Also, I should have attached some examples so the participant issue could be clear. Here is attached the dataset (if there is no problem or I am not going against the rules of the R-help group).
>
> Actually, I wanted better to be the only dependent factor and asking participants 'which subtitle is better?' could be enough, but I wanted to have detailed information of why a subtitle is better by asking participants specific questions (regarding which subtitle is more humorous and closer to Egyptian culture). Most of the time, the total of the hum + cul = better, but sometimes it is not (e.g. the sum for subtitle EA could be bigger than for SA, but the participant prefers SA in the better column).
>
> The WF (watched first) variable records the order in which participants watched the two subtitles; some participants watched the SA subtitle first and others watched the EA first.
>
> Does this make sense?
>
> All the best
>
>
> On Thu, 11 Jun 2020 at 05:24, Jim Lemon <drjimlemon using gmail.com> wrote:
>>
>> Hi Saudi,
>> I can only make a guess, but that is that a variable having a unique
>> value for each participant has been read in as a factor. I assume that
>> "better" is some combination of "hum" and "cul" and exactly what is
>> WF?
>>
>> Jim
>>
>> On Thu, Jun 11, 2020 at 5:27 AM Saudi Sadiq <saudisadiq using gmail.com> wrote:
>> >
>> > Dear Sir/Madam,
>> >
>> > Hope everyone is safe and sound. I appreciate your help a lot.
>> >
>> > I am evaluating two Arabic subtitles of a humorous English scene and asked
>> > 263 participants (part) to evaluate the two subtitles (named Standard
>> > Arabic, SA, and Egyptian Arabic, EA) via a questionnaire that asked them to
>> > rank the two subtitles in terms of how much each subtitle is
>> >
>> > 2) more humorous (hum),
>> >
>> > 5) closer to Egyptian culture (cul)
>> >
>> >
>> >
>> > The questionnaire contained two 1-10 linear scale questions regarding the 2
>> > points clarified, with 1 meaning the most humorous and closest to Egyptian
>> > culture, and 1 meaning the least humorous and furthest from Egyptian
>> > culture. Also, the questionnaire had a general multiple-choice question
>> > regarding which subtitle is better in general (better). General information
>> > about the participants was also collected concerning gender (categorical
>> > factor), age (numeric factor) and education (categorical factor).
>> >
>> > Two versions of the questionnaire were relied on: one showing the ‘SA
>> > subtitle first’ and another showing the ‘EA subtitle first’. Nearly half
>> > the participants answered the first and nearly half answered the latter.
>> >
>> > I am focusing on which social factor/s lead/s the participants to evaluate
>> > one of the two subtitles as generally better and which subtitle is more
>> > humorous and closer to Egyptian culture. Each of these points alone can be
>> > the dependent factor, but the results altogether can be linked.
>> >
>> > I thought that mixed effects analyses would clarify the picture and answer
>> > the research questions (which factor/s lead/s participants to favour a
>> > subtitle over another?) and so tried the lme4 package in R and ran many
>> > models, but none of the code I have used works.
>> >
>> > I ran the following codes, which yielded Error messages, like:
>> >
>> > model1<- lmer (better ~ gender + age + education + WF + (1 | part),
>> > data=sub_data)
>> >
>> > Error: number of levels of each grouping factor must be < number of
>> > observations (problems: part)
>> >
>> >
>> >
>> > Model2 <- glmer (better ~ gender + age + education + WF + (1 | part), data
>> > = sub_data, family='binomial')
>> >
>> > Error in mkRespMod(fr, family = family) :
>> >
>> >   response must be numeric or factor
>> >
>> >
>> >
>> > Model3 <- glmer (better ~ age + gender + education + WF + (1 | part), data
>> > = sub_data, family='binomial', control=glmerControl(optimizer=c("bobyqa")))
>> >
>> > Error in mkRespMod(fr, family = family) :
>> >
>> >   response must be numeric or factor
>> >
>> >
>> >
>> > Why does the model crash? Does the problem lie in the random factor part (which
>> > is a code for participants)? Or is it something related to the mixed
>> > effects analysis?
>> >
>> > Best
>> > Saudi Sadiq
>> >
>
>
>
> --
> Saudi Sadiq,
>
> Lecturer, Minia University, Egypt
>
> Academia, ResearchGate, Google Scholar, Publons
>
> Certified Translator by (Egyta)
>
> Associate Fellow of the Higher Education Academy, UK