# [R] Help finding the proper function

Tom.O tom.olsson at dnbnor.com
Thu Oct 23 14:37:55 CEST 2008

```Ok, I'll try to be clearer and start from the beginning. I have a set of
samples that I'm going to use to model a proxy for a common property. This
property is that the samples are in either a "quiet" or a "chaotic" state.
Both the quiet and the chaotic state are modelled as normally distributed, so
these samples are believed to come from a mixture of univariate normal
distributions. But some samples do not have this property: they are believed
to come only from the "quiet" state, and hence from a single univariate
normal distribution.

What I also know, or assume to know, is that when the samples drawn from a
mixture distribution change between states, they do so simultaneously or
nearly simultaneously. So each sample has a probability "p" of being in
either state. But since some samples are from a single univariate
distribution, and some of the samples that are from a mixture distribution
don’t show a clear change, they are no good for estimating the overall
probability of being in the "quiet" or "chaotic" state.
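
# A minimal sketch of the setup described above, using only base R. The
# sample size, the means, the seed, the value of p and the column names are
# all assumptions made for illustration.
set.seed(1)
n <- 200
p_quiet <- 0.7                            # assumed probability of the quiet state
state <- rbinom(n, 1, 1 - p_quiet)        # 0 = quiet, 1 = chaotic, shared by all columns

# Columns A and B switch state together (here the chaotic state shifts the mean).
A <- rnorm(n, mean = ifelse(state == 1, 3, 0))
B <- rnorm(n, mean = ifelse(state == 1, 4, 0))
# Columns C and D are always in the quiet state.
C <- rnorm(n)
D <- rnorm(n)
X <- cbind(A, B, C, D)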

What I'm looking for is the combination of samples that would give me the
best proxy for modelling the overall state, i.e. some sort of optimizer.
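
# One possible first screen, as a sketch rather than a full optimizer: for
# each column, compare a single normal against a two-component normal mixture
# and keep the columns where the mixture is clearly preferred. BIC is a
# swapped-in heuristic here, not one of the tests named in this thread.
# normalmixEM() is from mixtools; the simulated data, the seed and the helper
# name bic_gain are assumptions made for illustration.
library(mixtools)

set.seed(1)
n <- 200
state <- rbinom(n, 1, 0.3)                # 1 = chaotic, shared across columns
X <- cbind(A = rnorm(n, 3 * state), B = rnorm(n, 4 * state),
           C = rnorm(n), D = rnorm(n))

bic_gain <- function(x) {
  n <- length(x)
  s <- sqrt(mean((x - mean(x))^2))        # ML estimate of the standard deviation
  ll1 <- sum(dnorm(x, mean(x), s, log = TRUE))
  fit <- normalmixEM(x, k = 2)            # EM fit of a two-component mixture
  bic1 <- -2 * ll1 + 2 * log(n)           # single normal: 2 free parameters
  bic2 <- -2 * fit$loglik + 5 * log(n)    # mixture: 5 free parameters
  bic1 - bic2                             # > 0 favours the mixture
}

# Columns ranked by how strongly they favour the two-state description.
sort(sapply(as.data.frame(X), bic_gain), decreasing = TRUE)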

So hopefully this clarifies my problem.

Regards Tom

Rolf Turner-3 wrote:
>
>
> Your terminology is confused.  At least it confuses me. I think you are
> mixing up ``bivariate distributions'' and ``mixture'' (of two)
> distributions.
>
> What you get by rbinding x.1 and x.2 is a sample from a mixture of two
> *bivariate* (Gaussian) distributions, one with mean c(0,0), and one with
> mean c(3,4).
>
> Of course x.3 is a sample from a single *bivariate* Gaussian distribution
> with mean c(0,0).
>
> In both cases the covariance matrix is presumably the identity, since no
> covariance matrix is specified.
>
> So your final result X.1 has columns which are independent samples from
> univariate distributions, the first of which is a mixture of N(0,1) and
> N(3,1), the second of which is a mixture of N(0,1) and N(4,1), and the
> third and fourth of which are both N(0,1).
>
> Are you really interested in
>
> 	* bivariate distributions?
>
> 	* mixtures of (two) univariate distributions?
>
> 	* mixtures of (two) bivariate distributions?
>
> If the last option, why are you talking about the columns of X.1
> individually?
> If the middle option, why are you using rmvnorm() at all?
> If the first option, what exactly is your question?
>
> Your thinking seems to be very muddy.  You will need to clarify it
> considerably.  If you do so, you may be able to pose a meaningful question,
> and answer it yourself.
>
> 	cheers,
>
> 		Rolf Turner
>
>
> On 23/10/2008, at 11:59 AM, Tom.O wrote:
>
>>
>> This might not be the correct forum for this question, because there might
>> be some flaws in my logic, so the R function I'm looking for might not be
>> the correct one. But I know there’s a lot of smart people in this forum, so
>> correct me if I'm wrong. I have been googling and searching in this forum
>> for something useful, but so far I'm out of luck.
>>
>> This is the background to my problem. I have a set of samples that I know
>> are either from a normal distribution or a bivariate normal distribution,
>> and my goal is to find the combination of samples that would give the best
>> fit of a bivariate distribution.
>>
>> I'm currently using maximum likelihood estimation to fit the bivariate
>> normal model, but this is where my problems start. How do I find the best
>> combination in an efficient way? Is there a function that would do the
>> trick for me?
>>
>> One solution is to run an exhaustive search, but this would take a while
>> since the number of possible combinations is huge. So hopefully this is my
>> last option.
>>
>> My other problem is which test I should use to rank the models: Wald,
>> F-test or likelihood-ratio test (LR test)? My colleague thought that the
>> LR test would be the best to use, but he was not sure. And in that case,
>> which function is best to use? I have found some LR tests, but they use
>> fits from glm models etc.
>>
>> Here is an example of my problem.
>> library(mixtools)
>>
>> x.1 <- rmvnorm(40, c(0, 0))    # 40 draws centred at (0, 0)
>> x.2 <- rmvnorm(60, c(3, 4))    # 60 draws centred at (3, 4)
>> x.3 <- rmvnorm(100, c(0, 0))   # 100 draws centred at (0, 0)
>> X.1 <- cbind(rbind(x.1, x.2), x.3)
>> colnames(X.1) <- LETTERS[1:4]
>>
>> Samples A and B are bivariate and C and D are not, so theoretically the
>> best combination would be to use A and B in the model, since they change at
>> the same time. Other combinations of a bivariate and a non-bivariate
>> sample would also work, but should give a worse fit than A and B. And the
>> worst case would be to fit a bivariate distribution to C and D.
>>
>> So this is the case...
>>
>> Regards Tom
>>
>> --
>> View this message in context: http://www.nabble.com/Help-finding-the-proper-function-tp20121371p20121371.html
>> Sent from the R help mailing list archive at Nabble.com.
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>