[BioC] initializing DirichletMultinomial::dmn

Charles C. Berry ccberry at ucsd.edu
Fri Jul 11 17:46:00 CEST 2014


On Fri, 11 Jul 2014, Martin Morgan wrote:

> On 07/10/2014 02:45 PM, Charles Berry wrote:
>> 
>> I'd like to be able to specify the starting 'centers' for dmn().
>>

[snip]

> I'll look into this, thanks for the suggestion. Is there a more general issue 
> that makes the random centers choice a poor one? And presumably setting the 
> random number seed allows for replication (I think that's a 'this is the way 
> it should work' rather than a statement of fact...).
>

Thanks, Martin.

There is another issue. Distinct samples in the data may be identical. 
In my case, there are thousands of sparse multinomial samples (thousands 
of them with N==1) and loads of duplicate rows in 'count'. If the random 
centers are a sample of the rows, then that sample may contain duplicate 
rows, and some values of p_j may be zero. So sampling from the rows will 
fail.
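
To make the concern concrete, here is a minimal sketch in plain Python 
(not the package's code). It assumes, as above, that the starting centers 
are rows of 'count' normalised to proportions; sample_row_centers() and 
diagnose() are hypothetical helpers, and the toy data are made up.

```python
import random

def sample_row_centers(count, k, seed=0):
    """Pick k rows of `count` at random and normalise each to proportions."""
    rng = random.Random(seed)
    rows = rng.sample(count, k)  # distinct row positions, not distinct values
    return [[x / sum(r) for x in r] for r in rows]

def diagnose(centers):
    """Count redundant (duplicate) centers and centers with a zero entry."""
    n_dup = len(centers) - len({tuple(c) for c in centers})
    n_zero = sum(1 for c in centers if min(c) == 0.0)
    return n_dup, n_zero

# Toy data: 150 samples with N == 1 over 3 taxa, so only 3 distinct rows.
count = [[1, 0, 0]] * 50 + [[0, 1, 0]] * 50 + [[0, 0, 1]] * 50
centers = sample_row_centers(count, k=4)
n_dup, n_zero = diagnose(centers)
print(n_dup, n_zero)
```

With only three distinct rows, any four sampled centers must include a 
repeat, and every center has some p_j equal to zero.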

I don't know if problems will arise with centers that are randomly chosen 
from the space of the multinomial parameter pi, but if something is known 
about the structure, there might be a smart way to choose starting values 
based on the data. If one is particularly interested in knowing whether 
the multinomial parameter concentrates near certain edges or vertices of 
the simplex that pi lies in, then setting starting centers near them 
might be advisable, to be sure that that part of the space has been given 
a try.
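
One way to picture starting centers near the vertices, as a sketch only: 
put mass 1 - eps on a single component and spread eps evenly over the 
rest, so that every p_j stays strictly positive. near_vertex_centers() 
and eps are hypothetical names chosen for illustration, not anything 
dmn() provides.

```python
def near_vertex_centers(n_taxa, eps=0.01):
    """One center per vertex: 1 - eps on one taxon, eps shared by the others."""
    if n_taxa < 2:
        raise ValueError("need at least two taxa")
    if not 0.0 < eps < 1.0:
        raise ValueError("eps must lie strictly between 0 and 1")
    off = eps / (n_taxa - 1)  # small positive mass for the other components
    return [
        [1.0 - eps if j == i else off for j in range(n_taxa)]
        for i in range(n_taxa)
    ]

vertex_centers = near_vertex_centers(4, eps=0.02)
for row in vertex_centers:
    print(row)
```

Each row is a distinct point of the simplex close to one vertex, with no 
zero entries.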

So I was thinking that having the flexibility to set one's own initial 
values might be useful, as long as one does not make a pathological choice.
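
As a rough illustration of what guarding against a pathological choice 
could look like, the sketch below checks user-supplied centers for the 
problems mentioned above: zero entries, rows that do not sum to 1, and 
duplicates. check_centers() is a hypothetical helper, not part of the 
package, and other notions of "pathological" are certainly possible.

```python
import math

def check_centers(centers, tol=1e-8):
    """Return a list of problems found in `centers` (empty if none)."""
    problems = []
    for i, c in enumerate(centers):
        if any(p <= 0.0 for p in c):
            problems.append(f"center {i} has a non-positive entry")
        if not math.isclose(sum(c), 1.0, abs_tol=tol):
            problems.append(f"center {i} does not sum to 1")
    if len({tuple(c) for c in centers}) < len(centers):
        problems.append("duplicate centers")
    return problems

good = [[0.98, 0.01, 0.01], [0.01, 0.98, 0.01]]
bad = [[1.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
print(check_centers(good))
print(check_centers(bad))
```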

Best,

Chuck

> Martin
>
>> 
>> Best,
>> 
>> Chuck
>> 

Charles C. Berry                            Dept of Family/Preventive Medicine
cberry at ucsd edu                          UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, CA 92093-0901


