[BioC] Tuning segment() in DNAcopy package to get more segments on each chromosome
littleduck24 at gmail.com
Mon Nov 14 21:50:34 CET 2011
Thanks for your reply.
"window" is from the literature: Olshen AB, Venkatraman ES, Lucito R,
Wigler M. Circular binary segmentation for the analysis of array-based
DNA copy number data. Biostatistics. 2004 Oct;5(4):557-72. On page 560:
"the permutation approach is computationally intensive. ...Our solution
is to divide the data into K overlapping windows...." The reason I ask
is that I don't understand how "kmax" and "nmin" in segment() are used
in calculating segments, or where the window size is set in segment().
I would really appreciate it if someone could explain this to me.
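
For readers following the kmax question, here is a minimal base-R sketch of
one reading of the segment() help text quoted further down. It is not the
package's actual implementation. It computes the CBS arc statistic for every
arc and splits the arcs by whether the smaller piece has at most kmax
markers; under the hybrid method, that is the group the documentation says
is handled by permutation. The function name cbs_stats() and its return
fields are hypothetical, and the statistic is left unscaled by the noise
standard deviation.

```r
# Illustrative O(n^2) sketch of the CBS arc statistic (not DNAcopy's code).
# cbs_stats() is a hypothetical helper name.
cbs_stats <- function(x, kmax = 25) {
  n <- length(x)
  s <- c(0, cumsum(x))              # s[i + 1] holds the partial sum S_i
  best_small <- 0                   # arcs whose smaller piece is <= kmax
  best_large <- 0                   # all remaining arcs
  for (i in 0:(n - 1)) {
    for (j in (i + 1):n) {
      k <- j - i                    # number of markers inside the arc (i, j]
      if (k == n) next              # the arc must leave some markers outside
      inside  <- (s[j + 1] - s[i + 1]) / k
      outside <- (s[n + 1] - s[j + 1] + s[i + 1]) / (n - k)
      t_ij <- abs(inside - outside) / sqrt(1 / k + 1 / (n - k))
      if (min(k, n - k) <= kmax) {
        best_small <- max(best_small, t_ij)
      } else {
        best_large <- max(best_large, t_ij)
      }
    }
  }
  c(small = best_small, large = best_large)
}

# A clean step in the middle of 100 markers: the best arc is 50 markers
# wide, so with kmax = 25 it falls in the "large" group.
x <- c(rep(0, 50), rep(1, 50))
res <- cbs_stats(x, kmax = 25)
print(res)
```

On this reading, nmin only decides whether the split is used at all: for data
shorter than nmin, the help text implies the approximation is not used.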
On Mon, Nov 14, 2011 at 3:20 PM, Sean Davis <sdavis2 at mail.nih.gov> wrote:
> On Mon, Nov 14, 2011 at 3:13 PM, Qian Liu <littleduck24 at gmail.com> wrote:
>> Dear all,
>> I am trying to tune segment() in the DNAcopy package to generate more
>> segments. Currently I have few segments on each chromosome (min 3
>> segments, max 7).
>> I want to have at least 20 segments on each chromosome, with a minimum
>> of 5 markers in each segment and 1000 markers in the moving window.
>> My 1st question is how do the parameters kmax and nmin work in the CBS
>> algorithm? I read the literature, but I couldn't figure it out.
>> My 2nd question: in the segment function I use min.width=5 for a
>> minimum of 5 markers in each segment. How do I set the window size to 1000?
>> kmax is the maximum width of smaller segment for permutation in the
>> hybrid method.
>> nmin is the minimum length of data for which the approximation of
>> maximum statistic is used under the hybrid method. should be larger
>> than 4*kmax
>> segment(x, alpha = 0.01, nperm = 10000, p.method =
>> "hybrid", min.width=5, kmax=25, nmin=200,
>> undo.splits = "prune",undo.prune=0.05,
> Hi, Qian.
> Try making alpha larger (0.05 or 0.1). Leave undo.splits="none". If
> I recall, both of those will lead to more breaks. You should be
> evaluating the results by eye, though, to be sure that more breaks are
> actually justified by the data. I am not used to thinking along the
> lines of "I want to see XXX number of breaks per chromosome".
> I am not sure what you mean by "window", so I can't help you there.
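
The suggestion above (a larger alpha, undo.splits left at "none") can be tried
with a short sketch like the following. It assumes the Bioconductor package
DNAcopy is installed. The simulated log-ratios and the helper name
count_segments() are illustrative assumptions, not part of the original
thread, and the counts you get will depend on your own data.

```r
# Sketch: compare segment counts per chromosome at two alpha levels.
# count_segments() is a hypothetical helper; the data below are simulated.
count_segments <- function(logratio, chrom, maploc, alpha) {
  cna <- DNAcopy::CNA(logratio, chrom, maploc,
                      data.type = "logratio", sampleid = "sim")
  fit <- DNAcopy::segment(DNAcopy::smooth.CNA(cna), alpha = alpha,
                          min.width = 5, undo.splits = "none", verbose = 0)
  # fit$output has one row per segment; tabulate rows by chromosome.
  table(fit$output$chrom)
}

set.seed(1)
n <- 600
# Four blocks of 150 markers with different mean log-ratios plus noise.
logratio <- rnorm(n, sd = 0.3) + rep(c(0, 0.4, 0, -0.3), each = n / 4)
chrom <- rep(1, n)
maploc <- seq_len(n)

if (requireNamespace("DNAcopy", quietly = TRUE)) {
  print(count_segments(logratio, chrom, maploc, alpha = 0.01))
  print(count_segments(logratio, chrom, maploc, alpha = 0.10))
}
```

Plotting the fit (for example with plot() on the segment() result) is still
the way to judge whether the extra breaks are supported by the data.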