[R] Which model to keep (negative BIC)

plummer at iarc.fr plummer at iarc.fr
Sun Apr 5 11:24:27 CEST 2009


Quoting "cladoo.26" <cladoo.26 at laposte.net>:
> Hi,
>
> My questions concern the function 'mclustBIC', which computes BIC for a range
> of numbers of clusters for several models on the given data, and the function
> 'mclustModel', which chooses the best model and the best number of clusters
> according to the results of the former.
>
> 1) When trying the following example (see ?mclustModel), I get negative BIC
> values computed by 'mclustBIC', and the best model according to the results of
> 'mclustModel' is the one with the highest BIC (i.e. the one closest to zero).
>
> irisBIC <- mclustBIC(iris[,-5])
> plot(irisBIC)
> mclustModel(iris[,-5], irisBIC)
>
> Because I could not find anything about this point, could someone confirm that
> when the BIC values are positive, we try to minimize the criterion (the model
> with the smallest BIC is the best one), but when the BIC values are negative we
> look for the highest BIC (the model with the BIC closest to zero is the best
> one)?

The mclust package seems to be using a definition of BIC that is the
negative of the usual one, i.e. the bic() function in the mclust package
returns

    2 * loglik - nparams * log(n)

where "loglik" is the log likelihood, "n" is the number of observations
and "nparams" is the number of parameters.

BIC is normally defined as

   -2 * loglik + nparams * log(n)

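and the optimal model is the one with the minimum BIC. With the mclust
definition, however, you want to maximize it. The sign of the values
does not matter: under either definition you compare models by their
ordering, not by their distance from zero.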
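
To see that the two conventions pick the same model, here is a small
sketch. It does not use mclust at all: the three lm() fits on the built-in
"cars" data are just my own stand-ins for competing models, and the
names "fits", "bic_usual" and "bic_flipped" are made up for this example.

```r
# Base R only; mclust is not needed for this illustration.
# Three candidate models (hypothetical stand-ins for competing fits).
fits <- list(
  linear    = lm(dist ~ speed, data = cars),
  quadratic = lm(dist ~ poly(speed, 2), data = cars),
  cubic     = lm(dist ~ poly(speed, 3), data = cars)
)

# Usual definition: -2 * loglik + nparams * log(n); smaller is better.
bic_usual <- sapply(fits, function(f) {
  ll <- logLik(f)
  -2 * as.numeric(ll) + attr(ll, "df") * log(nobs(f))
})

# Sign-flipped convention: 2 * loglik - nparams * log(n); larger is better.
bic_flipped <- -bic_usual

# Both lines name the same model.
names(which.min(bic_usual))
names(which.max(bic_flipped))
```

Minimizing the first vector and maximizing the second always select the
same entry, whether the individual values are positive or negative.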


> 2) Does the $G component of the output of 'mclustModel' represent the best
> number of clusters for the chosen model?

According to the documentation it does, and you can verify from your
plot that the VEV model with 2 components has the maximum "BIC".

> Many thanks. This is my first post on R-help, but I have often consulted the
> forum over the past 4 years.
>
> Cladoo
>


