[R] Fitting a Mixture of Noncentral Student t Distributions to a one-dimensional sample

Johannes Moser jzmoser at gmail.com
Wed Apr 30 22:30:34 CEST 2014


Dear R community,

I`d like to extract the parameters of a two-component mixture 
distribution of noncentral student t distributions which was fitted to a 
one-dimensional sample.

There are many packages for R that are capable of handling mixture 
distributions in one way or another. Some in the context of a Bayesian 
framework requiring kernels. Some in a regression framework. Some in a 
nonparametric framework. ...

So far the "mixdist"-package seems to come closest to my wish. This 
package fits parametric mixtures to a sample of data. Unfortunately it 
doesn`t support the student t distribution.

I have also tried to manually set up a likelihood function as described 
here:
http://stackoverflow.com/questions/6485597/r-how-to-fit-a-large-dataset-with-a-combination-of-distributions
But the result is far from perfect.

The "gamlss.mx"-package might be helping, but originally it seems to be 
set up for another context, i.e. regression. I tried to regress my data 
on a constant and then extract the parameters for the estimated mixture 
error distribution. But the estimated parameters seem to be not directly 
accessable individually by some command (such as fit1$sigma). And there 
seem to be serious convergence problems even in pretty simple and 
nonambiguous cases (see example 2). The following syntax is my 
gamlss.mx-setup so far:


     library(gamlss.dist)
     library(gamlss.mx)
     library(MASS)

     # 1:
     data(geyser)
     plot(density(geyser$waiting) )
     fit1 <- gamlssMX(waiting~1,data=geyser,family="TF",K=2)
     fit1
     # works fine

     # 2:
     N <- 100000
     components <- sample(1:2,prob=c(0.6,0.4),size=N,replace=TRUE)
     mus <- c(3,-6)
     sds <- c(1,9)
     nus <- c(25,3)
     mixsim <- 
data.frame(rTF2(N,mu=mus[components],sigma=sds[components],nu=nus[components]))
     colnames(mixsim) <- "MCsim"
     plot(density(mixsim$MCsim) , xlim=c(-50,50))
     fit2 <- gamlssMX(MCsim~1,data=mixsim,family="TF",K=2)
     fit2
     # no convergence

With another dataset and when using the same two component densities for 
the mixture as above I ended up with negative estimates for sigma (which 
should be positive).

I would be very grateful for any advice. I`ve read through many manuals 
and vignettes today but it seems that I am nearly in the same place 
where I was this morning.
A small example for a setup that works sort of reliably would be fantastic!

Thanks a lot in advance!!
Johannes



More information about the R-help mailing list