[R] Fitting a Mixture of Noncentral Student t Distributions to a one-dimensional sample

Thu May 8 10:08:49 CEST 2014

Hi Johannes,
Below code gives good results for me; note that trying multiple
starting is often important in fitting mixture models, even in simple
cases like this.
Note also that the sigma and nu parameters in gamlssMX are fitted on a
log scale, hence the possible occurrence of negative results.
hth, Ingmar

    # 2:
    N <- 2000
    components <- sample(1:2,prob=c(0.6,0.4),size=N,replace=TRUE)
    mus <- c(3,-6)
    sds <- c(1,9)
    nus <- c(25,3)
    mixsim <- data.frame(rTF2(N,mu=mus[components],sigma=sds[components],nu=nus[components]))
    colnames(mixsim) <- "MCsim"
    plot(density(mixsim$MCsim) , xlim=c(-50,50))

set.seed(2)
    fit2 <- gamlssMX(MCsim~1,data=mixsim,family="TF",K=2)
    fit2

On Wed, Apr 30, 2014 at 10:30 PM, Johannes Moser <jzmoser at gmail.com> wrote:
> Dear R community,
>
> I`d like to extract the parameters of a two-component mixture distribution
> of noncentral student t distributions which was fitted to a one-dimensional
> sample.
>
> There are many packages for R that are capable of handling mixture
> distributions in one way or another. Some in the context of a Bayesian
> framework requiring kernels. Some in a regression framework. Some in a
> nonparametric framework. ...
>
> So far the "mixdist"-package seems to come closest to my wish. This package
> fits parametric mixtures to a sample of data. Unfortunately it doesn`t
> support the student t distribution.
>
> I have also tried to manually set up a likelihood function as described
> here:
> http://stackoverflow.com/questions/6485597/r-how-to-fit-a-large-dataset-with-a-combination-of-distributions
> But the result is far from perfect.
>
> The "gamlss.mx"-package might be helping, but originally it seems to be set
> up for another context, i.e. regression. I tried to regress my data on a
> constant and then extract the parameters for the estimated mixture error
> distribution. But the estimated parameters seem to be not directly
> accessable individually by some command (such as fit1$sigma). And there seem
> to be serious convergence problems even in pretty simple and nonambiguous
> cases (see example 2). The following syntax is my gamlss.mx-setup so far:
>
>
>     library(gamlss.dist)
>     library(gamlss.mx)
>     library(MASS)
>
>     # 1:
>     data(geyser)
>     plot(density(geyser$waiting) )
>     fit1 <- gamlssMX(waiting~1,data=geyser,family="TF",K=2)
>     fit1
>     # works fine
>
>     # 2:
>     N <- 100000
>     components <- sample(1:2,prob=c(0.6,0.4),size=N,replace=TRUE)
>     mus <- c(3,-6)
>     sds <- c(1,9)
>     nus <- c(25,3)
>     mixsim <-
> data.frame(rTF2(N,mu=mus[components],sigma=sds[components],nu=nus[components]))
>     colnames(mixsim) <- "MCsim"
>     plot(density(mixsim$MCsim) , xlim=c(-50,50))
>     fit2 <- gamlssMX(MCsim~1,data=mixsim,family="TF",K=2)
>     fit2
>     # no convergence
>
> With another dataset and when using the same two component densities for the
> mixture as above I ended up with negative estimates for sigma (which should
> be positive).
>
> I would be very grateful for any advice. I`ve read through many manuals and
> vignettes today but it seems that I am nearly in the same place where I was
> this morning.
> A small example for a setup that works sort of reliably would be fantastic!
>
> Thanks a lot in advance!!
> Johannes
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.