[R] NaN Log-lik value in EM algorithm (fitting Gamma mixture model)

Fri Sep 16 20:23:08 CEST 2016

You should report the issue to the author/maintainer of the mixtools
package.  gammamixEM can get into this situation when the data are not an
obvious mixture, so it has a hard time coming up with good starting values
for the parameter estimates.  E.g.,

> out <- mixtools::gammamixEM(rep(c(0.0001, 0.8126, .8536, .8888,
1.0180),c(1,45,150,45,1)), lambda = c(1, 1, 1)/3)
Note: Choosing new starting values.
Note: Choosing new starting values.
Error in while (diff > epsilon && iter < maxit) { :
  missing value where TRUE/FALSE needed
In addition: Warning messages:
1: In dgamma(x, shape = alpha[j], scale = beta[j]) : NaNs produced
2: In dgamma(x, shape = alpha[j], scale = beta[j]) : NaNs produced
3: In dgamma(x, shape = alpha[j], scale = beta[j]) : NaNs produced
4: In dgamma(x, shape = alpha[j], scale = beta[j]) : NaNs produced
> out <- mixtools::gammamixEM(rep(c(0.0001, 0.8126, .8536, .8888,
1.0180),c(1,45,150,45,1)), lambda = c(1, 1, 1)/3)
Note: Choosing new starting values.
Note: Choosing new starting values.
Note: Choosing new starting values.
Note: Choosing new starting values.
Note: Choosing new starting values.
Note: Choosing new starting values.
Note: Choosing new starting values.
Error in while (diff > epsilon && iter < maxit) { :
  missing value where TRUE/FALSE needed
In addition: There were 14 warnings (use warnings() to see them)
> out <- mixtools::gammamixEM(rep(c(0.0001, 0.8126, .8536, .8888,
1.0180),c(1,45,150,45,1)), lambda = c(1, 1, 1)/3, alpha=c(.4, .9))
Error in while (diff > epsilon && iter < maxit) { :
  missing value where TRUE/FALSE needed
In addition: Warning messages:
1: In dgamma(x, shape = alpha[j], scale = beta[j]) : NaNs produced
2: In dgamma(x, shape = alpha[j], scale = beta[j]) : NaNs produced
3: In dgamma(x, shape = alpha[j], scale = beta[j]) : NaNs produced
> out <- mixtools::gammamixEM(rep(c(0.0001, 0.8126, .8536, .8888,
1.0180),c(1,45,150,45,1)), lambda = c(1, 1, 1)/3, alpha=c(.4, .9),
beta=c(1,1))
Error in while (diff > epsilon && iter < maxit) { :
  missing value where TRUE/FALSE needed
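
The chain from the dgamma() warnings to the while() error in the transcript
above can be reproduced in isolation. This is only a sketch of one plausible
route, not a trace of the mixtools source: the negative shape value and the
1e-8 tolerance are assumptions chosen for illustration.

```r
# A negative shape makes dgamma() return NaN with a "NaNs produced" warning.
dens <- suppressWarnings(dgamma(0.85, shape = -1, scale = 1))
is.nan(dens)                      # TRUE

# Comparing NaN with a number gives NA, and NA && TRUE stays NA.
cond <- (dens > 1e-8) && TRUE     # 1e-8 is an assumed tolerance
is.na(cond)                       # TRUE

# while() cannot branch on NA; capture the resulting error message.
msg <- tryCatch({
  while (cond) break
  "no error"
}, error = function(e) conditionMessage(e))
msg
```

If the parameter updates inside the EM loop ever produce an invalid shape or
scale, every later density, log-likelihood and convergence test would inherit
the NaN in this way.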

In the meantime, use tryCatch() or try() so your loop over all genes can do
all the other genes and return some sort of special value where the
estimation procedure fails.
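
A minimal sketch of that tryCatch() approach follows. The names safe_fit,
toy_fit and genes are hypothetical; toy_fit is only a stand-in so the example
runs without mixtools, and in a real script you would presumably pass
mixtools::gammamixEM (plus its arguments) as fit_fun instead.

```r
# Hypothetical helper: run a fitting function and return NULL instead of
# stopping the whole loop when the fit throws an error.
safe_fit <- function(x, fit_fun, ...) {
  tryCatch(
    fit_fun(x, ...),
    error = function(e) {
      message("fit failed: ", conditionMessage(e))
      NULL  # special value marking a failed fit
    }
  )
}

# Stand-in fitter (an assumption, not mixtools code). It fails the same way
# the EM loop does: a NaN reaches a while() condition.
toy_fit <- function(x) {
  diff <- mean(suppressWarnings(log(x)))  # NaN if x has negative values
  while (diff > 1) diff <- diff / 2
  list(value = diff)
}

# Made-up data: geneB contains a negative value and will fail.
genes <- list(geneA = c(0.8, 0.9, 1.0), geneB = c(-1, 0.9, 1.0))

results <- lapply(genes, safe_fit, fit_fun = toy_fit)
failed  <- names(results)[vapply(results, is.null, logical(1))]
failed
```

Returning NULL keeps the result list the same length as the input, so the
failed genes can be identified by name afterwards and examined separately.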

Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Fri, Sep 16, 2016 at 10:28 AM, Aanchal Sharma <aanchalsharma833 at gmail.com> wrote:

> Data has no negative values. Values range from 0.001 to 1.01.
> Following is the summary, in case that helps:
>
>  Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
>  0.0010  0.8126  0.8536  0.8464  0.8888  1.0180
>
> SD: 0.07489977
>
> Any clue?
>
>
>
> On Thu, Sep 15, 2016 at 10:32 PM, William Dunlap <wdunlap at tibco.com>
> wrote:
>
>> Does the data contain non-positive values?
>>
>> > out <- mixtools::gammamixEM(as.numeric(0:100), lambda = c(1, 1, 1)/3,
>> verb = TRUE)
>> iteration = 1  log-lik diff = NaN  log-lik = NaN
>> Error in while (diff > epsilon && iter < maxit) { :
>>   missing value where TRUE/FALSE needed
>>
>>
>> Bill Dunlap
>> TIBCO Software
>> wdunlap tibco.com
>>
>> On Thu, Sep 15, 2016 at 3:04 PM, Aanchal Sharma <
>> aanchalsharma833 at gmail.com> wrote:
>>
>>> I am using a function gammamixEM where it does it by default. I do not
>>> have the option to change it.
>>> Conceptually, what can make the algorithm unable to calculate the
>>> likelihood value at all (and hence log-lik = NaN)? Is there something
>>> wrong with the data? Under what conditions does it happen?
>>>
>>> On Wed, Sep 14, 2016 at 8:04 PM, Duncan Murdoch <
>>> murdoch.duncan at gmail.com>
>>> wrote:
>>>
>>> > On 14/09/2016 4:46 PM, Aanchal Sharma wrote:
>>> >
>>> >> Hi,
>>> >>
>>> >> I am trying to fit a Gamma mixture model to my data (residual values
>>> >> obtained after fitting a generalized linear model) using gammamixEM.
>>> >> It is part of a script which does it for multiple datasets in a loop.
>>> >> The code runs fine for some datasets, but it terminates for others
>>> >> with the following error:
>>> >>
>>> >> " iteration = 1  log-lik diff = NaN  log-lik = NaN
>>> >> Error in while (diff > epsilon && iter < maxit) { :
>>> >>   missing value where TRUE/FALSE needed"
>>> >>
>>> >> It seems that EM is not able to calculate the log-lik value (NaN) at
>>> >> the first iteration itself. Any idea why that can happen?
>>> >> It works fine for the other genes in the loop. I tried looking for
>>> >> differences in the inputs, but could not come up with anything striking.
>>> >>
>>> >>
>>> > There are lots of ways to get NaN in numerical calculations. A common
>>> > one if you are using log() to calculate log likelihoods is that
>>> > rounding error gives you a negative likelihood, and then log(lik)
>>> > comes out to NaN.
>>> >
>>> > You just need to look really closely at each step of your calculations.
>>> > Avoid using log(); use the functions that build it in (e.g. instead of
>>> > log(dnorm(x)), use dnorm(x, log = TRUE)).
>>> >
>>> > Duncan Murdoch
>>> >
>>> >
>>>
>>>
>>> --
>>> Anchal Sharma, PhD
>>> Postdoctoral Fellow
>>> 195, Little Albany street,
>>> Cancer Institute of New Jersey
>>> Rutgers University
>>> NJ-08901
>>>
>>
>>
>
>
> --
> Anchal Sharma, PhD
> Postdoctoral Fellow
> 195, Little Albany street,
> Cancer Institute of New Jersey
> Rutgers University
> NJ-08901
>
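
The advice quoted above to use the built-in log = TRUE argument rather than
log() of a density can be illustrated with the gamma density itself. This is
a sketch with made-up values (x = 1000, shape = 2, scale = 1), and it shows a
related failure, underflow to zero, rather than the negative-likelihood case
described in the quoted message.

```r
x <- 1000

# The density underflows to 0, so taking log() afterwards gives -Inf.
naive <- log(dgamma(x, shape = 2, scale = 1))

# Computing on the log scale directly stays finite: log(1000) - 1000.
stable <- dgamma(x, shape = 2, scale = 1, log = TRUE)

naive           # -Inf
stable          # about -993.09

# A difference of two -Inf log-likelihoods is NaN, which is one way a
# "log-lik diff = NaN" could arise in an iterative fit.
naive - naive   # NaN
```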
