[R] Account for a factor variability in a logistic GLMM in lme4
Martin Maechler
m@ech|er @end|ng |rom @t@t@m@th@ethz@ch
Tue Sep 4 11:28:12 CEST 2018
>>>>> Jim Lemon
>>>>> on Tue, 4 Sep 2018 08:36:22 +1000 writes:
> Hi Pedro,
> I have encountered similar situations in a number of areas. Great care
> is taken to record significant events of low probability, but not the
> non-occurrence of those events. Sometimes this is due to a problem
> with the definition of non-occurrence. To use your example, how close
> does an animal have to approach the crossing to be counted as not
> crossing? Perhaps it was just a failure to record the species of
> animals that didn't cross. In that case you have a problem, because
> the probability of crossing within species cannot be estimated from
> the data you describe.
> Jim
Indeed!
For those among us too young to remember:
The 1986 Space Shuttle Challenger catastrophe was co-caused by
that mistake: only the '1's were considered, and not the '0's, in
the data (visualised and shown to the decision-making experts).
See, e.g.,
https://priceonomics.com/the-space-shuttle-challenger-explosion-and-the-o/
(couldn't easily find a more academic / reliable source which
*does* include the graphics)
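
A small simulated illustration of the point (not the actual shuttle
data; the temperatures, the coefficients 12 and -0.2, and the sample
size are all made up for the sketch). With the 0s included, a logistic
regression can estimate how the event probability changes with the
covariate. With only the 1s kept, the response is constant and there is
nothing left to estimate:

```r
set.seed(1)
n    <- 200
temp <- runif(n, 50, 85)            # hypothetical temperatures
p    <- plogis(12 - 0.2 * temp)     # assumed true event probability
fail <- rbinom(n, 1, p)             # 1 = event, 0 = no event

## All records, 0s and 1s: the slope for 'temp' can be estimated.
fit_all <- glm(fail ~ temp, family = binomial)
coef(fit_all)

## Only the events: every remaining response is 1, so the data carry
## no information about P(event | temp).
ones <- data.frame(temp = temp[fail == 1], fail = fail[fail == 1])
length(unique(ones$fail))
```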
Martin Maechler
ETH Zurich
> On Tue, Sep 4, 2018 at 12:43 AM Pedro Vaz <zasvaz using gmail.com> wrote:
>>
>> We did a field study in which we tried to understand which factors
>> significantly explain the probability of a group of animals (5 species in
>> total) crossing through 30 wildlife road-crossing structures. The response
>> variable is binomial (yes=crossed; no = did not cross) and was recorded by
>> animal species. We did about 30 visits to each crossing structure (our
>> random factor) in which we recorded the binomial response by each animal
>> species and the values of a few predictors.
>>
>> So, I have this (simplified for better understanding) mixed effects model:
>> library(lme4)
>>
>> Mymodel <- glmer(cross.01 ~ stream.01 + width.m + grass.per + (1|structure.id),
>> data = Mydata, family = binomial)
>>
>> stream is a factor with 2 levels; width.m is continuous; grass.per is a
>> percentage
>>
>> This is the model in which I assessed crossings by all species combined
>> (i.e., cross.01 = 1 when an animal of any species crossed, cross.01 = 0
>> when no animal crossed). However, we did one model per species and those
>> species-specific models highlight that different species exhibit different
>> relationships between crossings and explanatory variables.
>>
>> My problem: This means that my model above suffers from an additional
>> source of variation related to the species level without accounting for it.
>> However I cannot recalibrate the above model adding the species level as
>> random factor because, in my binomial response, the zero means no species
>> crossed (all zeros would have "NA" or, say, "none" for species) and so that
>> additional source of variation is only present when the response was 1.
>> Just to confirm this, I did add species as a random factor:
>>
>> (1 | structure.id) + (1 | species)
>>
>> As expected, the message is "Error: Response is constant"
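
A likely reason for that error, sketched on toy data (the species
names and values below are hypothetical, not from the study): if
species is NA whenever nothing crossed, then the default NA handling
(na.action = na.omit) drops exactly the rows with a zero response when
the model frame is built, and only the ones are left. I am assuming
here that glmer() treats the grouping variable like any other model
variable in this respect.

```r
## Toy data: species is only known when something crossed.
toy <- data.frame(
  cross.01     = c(1, 0, 1, 1, 0, 1, 0, 1),
  structure.id = factor(c(1, 1, 2, 2, 3, 3, 4, 4)),
  species      = factor(c("fox", NA, "boar", "fox", NA, "deer", NA, "boar"))
)

## Rows with NA in any model variable are dropped by default.
mf <- model.frame(cross.01 ~ structure.id + species, data = toy)

table(toy$cross.01)   # zeros and ones in the raw data
table(mf$cross.01)    # only ones survive the NA removal
```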
>>
>> How can I account for the species variability in my model in lme4?
>>
>> A few more details:
>> - I had 5 mammal species crossing through the 30 road-crossing structures.
>> On 134 occasions (i.e., 134 of my records on individual
>> crossing-structures), no animal crossed (so, @Dimitris Rizopoulos, no, I
>> didn't have the species of the animals which did not cross. A "no cross"
>> was a "zero" for that visit to the crossing-structure). On 498 occasions,
>> at least one animal of a given species crossed the structure (these were my
>> "ones" in my logistic response)
>> - A side comment: This is to respond to a reviewer in a paper of mine,
>> i.e., I did and presented species-specific and "all combined species"
>> models in the draft reviewed but now the reviewer is asking me to control
>> for the species variability in the "combined species model". He asked me to
>> include a random factor but I realized that is not possible since all my
>> zeros would have "none" for the species that crossed. So, is it possible to
>> control for the species variability in my model in lme4 in another way? I
>> know that in nlme, which allows fitting variance structures, it's not that
>> difficult...
>> - Every time an animal crossed, the binary response was "one" and I
>> recorded the animal species as well. Thus, I have variability between
>> species in the "ones" but not in my "zeros" of my logistic model.
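
One possible way to give the zeros a species, sketched under a strong
assumption: that for every visit each of the study species could in
principle have been recorded as crossing, so that "species not
recorded" can be read as "that species did not cross". The data then
get one row per visit and species. All object names, species names and
values below are hypothetical.

```r
## Hypothetical visit-level data and the crossings observed.
visits <- data.frame(
  visit.id     = 1:6,
  structure.id = factor(c(1, 1, 2, 2, 3, 3)),
  width.m      = c(2.0, 2.0, 3.5, 3.5, 1.2, 1.2)
)
seen <- data.frame(
  visit.id = c(1, 1, 3, 4, 6),
  species  = c("fox", "boar", "fox", "deer", "boar")
)
species.all <- c("fox", "boar", "deer")

## One row per visit x species, with the visit-level predictors attached.
long <- expand.grid(visit.id = visits$visit.id, species = species.all,
                    stringsAsFactors = FALSE)
long <- merge(long, visits, by = "visit.id")

## Response: 1 if that species crossed on that visit, 0 otherwise.
long$cross.01 <- as.integer(paste(long$visit.id, long$species) %in%
                            paste(seen$visit.id, seen$species))
long$species  <- factor(long$species)

## Every species now has both 0s and 1s.
table(long$species, long$cross.01)

## With real data of this shape, the model could then be written as
## (not run here, the toy data are far too small):
## glmer(cross.01 ~ stream.01 + width.m + grass.per +
##         (1 | structure.id) + (1 | species),
##       data = long, family = binomial)
```

Note that this changes what the response means: it becomes "species s
crossed on this visit" rather than "any animal crossed". Also, with
only 5 species, a species variance component will be estimated from
very few levels, so a fixed species effect may be worth comparing.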
>>
>> ______________________________________________
>> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.