[BioC] Tuning segment() in DNAcopy package to get more segments on each chromosome

Qian Liu littleduck24 at gmail.com
Mon Nov 14 21:50:34 CET 2011


Dear Sean,
Thanks for your reply.
The term "window" comes from this paper:
Olshen AB, Venkatraman ES, Lucito R, Wigler M. Circular binary
segmentation for the analysis of array-based DNA copy number data.
Biostatistics. 2004 Oct;5(4):557-72.
On page 560: "the permutation approach is computationally intensive.
...Our solution is to divide the data into K overlapping windows...."
The reason I ask is that I don't understand how "kmax" and "nmin" in
segment() are used in calculating segments, or where the window size
is set in segment(). I would really appreciate it if someone could
explain this to me.

Thanks,
Qian


On Mon, Nov 14, 2011 at 3:20 PM, Sean Davis <sdavis2 at mail.nih.gov> wrote:
> On Mon, Nov 14, 2011 at 3:13 PM, Qian Liu <littleduck24 at gmail.com> wrote:
>> Dear all,
>> I am trying to tune segment() in the DNAcopy package to generate more
>> segments. Currently I get few segments on each chromosome (min 3,
>> max 7).
>> I want at least 20 segments on each chromosome, with a minimum of 5
>> markers in each segment and 1000 markers in the moving window.
>>
>> My 1st question: how do the parameters kmax and nmin work in the CBS
>> algorithm? I read the literature, but I couldn't figure it out.
>> My 2nd question: in the segment function I set min.width=5 to get at
>> least 5 markers in each segment. How do I set the window size to 1000?
>>
>> kmax is the maximum width of the smaller segment for permutation in the
>> hybrid method.
>> nmin is the minimum length of data for which the approximation of the
>> maximum statistic is used under the hybrid method. It should be larger
>> than 4*kmax.
>>
>> segment(x, alpha = 0.01, nperm = 10000, p.method = "hybrid",
>>         min.width = 5, kmax = 25, nmin = 200,
>>         undo.splits = "prune", undo.prune = 0.05, verbose = 1)
>
> Hi, Qian.
>
> Try making alpha larger (0.05 or 0.1).  Leave undo.splits="none".  If
> I recall, both of those will lead to more breaks.  You should be
> evaluating the results by eye, though, to be sure that more breaks are
> actually justified by the data.  I am not used to thinking along the
> lines of "I want to see XXX number of breaks per chromosome".
>
> I am not sure what you mean by "window", so I can't help you there.
>
> Sean
>
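
For reference, the two parameter settings discussed in the quoted thread
can be compared side by side. This is only a minimal sketch on simulated
data: the log-ratio vector, marker positions, and the sample name "sim1"
are made up for illustration, and only the documented arguments of CNA(),
smooth.CNA(), and segment() in DNAcopy are used. Whether the looser
settings actually yield more segments depends on the data.

```r
library(DNAcopy)

set.seed(1)

# Hypothetical data: one chromosome with 2000 markers and two shifted regions.
n <- 2000
logratio <- rnorm(n, mean = 0, sd = 0.2)
logratio[501:700]   <- logratio[501:700] + 0.5
logratio[1201:1260] <- logratio[1201:1260] - 0.4

cna <- CNA(genomdat = logratio, chrom = rep(1, n), maploc = seq_len(n),
           data.type = "logratio", sampleid = "sim1")
cna <- smooth.CNA(cna)  # smooth single-point outliers before segmenting

# Settings from the original question: alpha = 0.01 with pruning of splits.
strict <- segment(cna, alpha = 0.01, nperm = 10000, p.method = "hybrid",
                  min.width = 5, kmax = 25, nmin = 200,
                  undo.splits = "prune", undo.prune = 0.05, verbose = 0)

# Settings suggested in the reply: larger alpha, no undoing of splits.
loose <- segment(cna, alpha = 0.05, nperm = 10000, p.method = "hybrid",
                 min.width = 5, kmax = 25, nmin = 200,
                 undo.splits = "none", verbose = 0)

# One row per segment: ID, chrom, loc.start, loc.end, num.mark, seg.mean.
nrow(strict$output)
nrow(loose$output)
loose$output
```

Calling plot(loose) afterwards is one way to check by eye whether the
additional breaks are supported by the data.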


