# [R] Error on mclust

Christian Hennig hennig at stat.math.ethz.ch
Wed Aug 6 11:40:52 CEST 2003

```
Hi,

This also goes to the developer and maintainer of mclust.

Here are some reproducible results:

> set.seed(300)
> x <- c(rnorm(50),rnorm(56,-0.5))
> Mclust(x)

> set.seed(3000)
> y <- c(rnorm(50),rnorm(56,-0.5))
> Mclust(y)

best model: unequal variance with 3 groups

averge/median classification uncertainty: 0.002 / 0

Now what's the difference between x and y?
The function EMclust works for both, but summary.EMclust does not:

> emx <- EMclust(x)
> emx

BIC:
          E         V
1 -304.8738 -304.8738
2 -314.2134        NA
3 -323.5445        NA
4 -332.8736        NA
5 -342.2043        NA
6 -351.5322        NA
7 -360.8612        NA
8 -370.1896        NA
9 -379.5153        NA

> help(summary.EMclust)
> summary(emx,x)
> emy <- EMclust(y)
> summary(emy,y)

classification table:

  1   2   3
100   4   2

uncertainty (quartiles):
        0%        25%        50%        75%       100%
0.00000000 0.00000000 0.00000000 0.00000000 0.06683614

best BIC values:
      V,3       V,2       E,1
-293.2526 -302.0302 -304.2835

best model: unequal variance

> emy

BIC:
          E         V
1 -304.2835 -304.2835
2 -313.6275 -302.0302
3 -322.9562 -293.2526
4 -332.2631        NA
5 -341.5896        NA
6 -350.8819        NA
7 -352.6121        NA
8 -361.9393        NA
9 -371.2658        NA

Here is the explanation:

For x, a single class (G=1) is chosen as optimal.
The expression rep(1,n) appears in summary.EMclust (which is called by Mclust,
but not by EMclust), and it is evaluated only if G=1 is estimated.
At that point n is not defined (as far as I can see), which is a bug.

It's possible to work around that bug:
In a concrete data analysis, if the error occurs, you may assume
(and check by applying EMclust) that G=1 is estimated. This means
that you can fit a single normal distribution, for which you do not
need Mclust. In all other cases you should get proper results.

Best,
Christian

On Tue, 5 Aug 2003, weidong zhang wrote:

> Hi All,
>
>
> I am trying to cluster one-dimensional data (see the attached file) using
> Mclust() but got an error message like:
> >Mclust(x)
>
> When I run a simulation, it sometimes works and sometimes doesn't.
>
> >Mclust(c(rnorm(50),rnorm(56,-0.5)))
>
> >Mclust(c(rnorm(56),rnorm(56,-0.5)))
>
> best model: unequal variance with 2 groups
>
> averge/median classification uncertainty: 0.001 / 0
>
> Can anybody help me with this? Thanks.
>
> Weidong
>

--
***********************************************************************
Christian Hennig
Seminar fuer Statistik, ETH-Zentrum (LEO), CH-8092 Zuerich (currently)
and Fachbereich Mathematik-SPST/ZMS, Universitaet Hamburg
hennig at stat.math.ethz.ch, http://stat.ethz.ch/~hennig/
hennig at math.uni-hamburg.de, http://www.math.uni-hamburg.de/home/hennig/
#######################################################################
I recommend www.boag.de

```
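
The workaround described in the message (if the Mclust call fails, assume G=1 and fit a single normal distribution) could be sketched roughly as below. This is only an illustration in base R: `fit_with_fallback`, `failing_fit`, and the fields of the returned list are hypothetical names, not part of mclust. The failing fit is a stand-in so the sketch runs without mclust installed. An error alone does not prove that G=1 was selected, so checking the BIC table from EMclust, as the message suggests, is still advisable.

```r
# Sketch of the workaround: try the mixture fit and, if it raises an error,
# fall back to a single normal distribution (the G = 1 case).
# `fit_with_fallback` and the names in its result are hypothetical.
fit_with_fallback <- function(x, fit_fun) {
  tryCatch(
    list(fallback = FALSE, fit = fit_fun(x)),
    error = function(e) {
      n <- length(x)
      list(
        fallback = TRUE,
        G = 1,
        mean = mean(x),
        # maximum-likelihood variance: divides by n, not n - 1
        variance = sum((x - mean(x))^2) / n,
        message = conditionMessage(e)
      )
    }
  )
}

# Same data as in the message above.
set.seed(300)
x <- c(rnorm(50), rnorm(56, -0.5))

# Stand-in for a failing model fit (assumption: it mimics Mclust erroring out).
failing_fit <- function(x) stop("simulated failure")

res <- fit_with_fallback(x, failing_fit)
res$fallback
res$mean
res$variance
```

With mclust loaded, the same wrapper could be called as `fit_with_fallback(x, Mclust)`, in which case the fallback branch is used only when the fit actually errors.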