[BioC] Circular binary segmentation for allele-specific CN data
cho at stat.berkeley.edu
Mon Aug 23 10:15:58 CEST 2010
Thank you for your constructive replies. I followed Kasper's
suggestion of using the package DNAcopy. My initial problem with
aroma.affymetrix was just that I could not figure out how to get my
data in the correct format to apply their function CbsModel. But
DNAcopy does the trick nicely with its function "segment."
Jan, thanks for the help with the fracB data. Weeks ago, I ran
TumorBoost to improve the signal-to-noise ratio in the fracB data.
Then, I used aroma.affymetrix to extract all of the informative SNPs
(AB, BB). After this, I subtracted 0.5 from these allele frequencies
and took the max of this difference and 0 (so I only retained the
bands above 0.5).
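For anyone following along, the transformation described above amounts to roughly the following in base R. This is only a sketch: the fracB values are made up for illustration, and the variable names are hypothetical. The folded variant at the end is an alternative that was not part of the steps described here.

```r
# Hypothetical fracB values for informative SNPs (made-up numbers,
# standing in for TumorBoost-normalized data)
fracB <- c(0.48, 0.52, 0.71, 0.69, 0.30, 0.75)

# Shift so that allelic balance (0.5) sits at 0, then clip negative
# values to 0. Only the band above 0.5 is retained.
shifted <- pmax(fracB - 0.5, 0)

# Alternative (an assumption, not what was run here): fold both bands
# onto one side so imbalance always moves the signal away from 0.
folded <- abs(fracB - 0.5)
```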
I ran CBS on this, even with the noise at 0 (after doing the
subtraction), since not everything that "should" be at 0.5 is actually
at 0.5. I'm comparing the regions it found with data from exome
capture, and it seems to have performed satisfactorily.
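A minimal sketch of the DNAcopy step, assuming the transformed values are in a numeric vector with matching chromosome and position vectors. The data below are simulated and are not the data discussed in this thread; the sample ID and positions are placeholders. Note that "logratio" is simply DNAcopy's label for continuous input, and passing a fracB-derived signal under that label is an assumption of this sketch.

```r
# Simulated stand-in for a transformed fracB track (NOT real data):
# 200 balanced SNPs near 0 followed by 100 SNPs in an imbalanced region.
set.seed(1)
y     <- pmax(c(rnorm(200, 0, 0.03), rnorm(100, 0.25, 0.03)), 0)
chrom <- rep(1, length(y))
pos   <- seq_along(y) * 1e4   # hypothetical positions in bp

fit <- NULL
if (requireNamespace("DNAcopy", quietly = TRUE)) {
  # Build the CNA object that segment() expects
  cna <- DNAcopy::CNA(genomdat = y, chrom = chrom, maploc = pos,
                      data.type = "logratio", sampleid = "sim")
  # Run circular binary segmentation with default settings
  fit <- DNAcopy::segment(cna, verbose = 0)
  # One row per segment, including loc.start, loc.end and seg.mean
  print(fit$output)
}
```

The segment boundaries in `fit$output` can then be compared against an independent call set, as was done here with the exome capture data.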
Thanks again! I appreciate all of the help I've received on this
On Aug 23, 2010, at 12:49 AM, <J.Oosting at lumc.nl> wrote:
> The problem with segmenting fracB data is that, within a segment, it
> does not behave as a constant value plus noise. This violates the
> assumption of the segmentation algorithm.
> In normal samples neighboring SNPs can have fracB values of 0 (for AA
> genotype), 0.5 (AB) or 1 (BB).
> In the ideal situation you can filter out the uninformative SNPs
> using a paired normal sample. Then you have to transform the fracB
> data of the informative heterozygous SNPs so that it changes in 1
> direction when genomic rearrangements occur, and after that you can
> apply CBS to the remaining data.
> Corver et al., Cancer Res 2008; 68(24), December 15, 2008, describes a
> method to transform fracB-like data so it can be segmented.
>> -----Original Message-----
>> From: bioconductor-bounces at stat.math.ethz.ch [mailto:bioconductor-
>> bounces at stat.math.ethz.ch] On Behalf Of Christine Ho
>> Sent: Thursday, August 19, 2010 21:55
>> To: bioconductor at stat.math.ethz.ch
>> Subject: [BioC] Circular binary segmentation for allele-specific CN data
>> Good afternoon!
>> I hope this question is not redundant - I've tried searching the
>> mailing list archives and doing a Google search.
>> I've just finished using aroma.affymetrix to produce allele-specific
>> copy number estimates. So, right now, I've got the allele frequency
>> data, i.e. what the vignettes call "fracB" data. I would like to run
>> circular binary segmentation on this to find breakpoints (so I can
>> identify regions of LOH), but it seems that all of the related
>> packages on Bioconductor just segment aCGH data.
>> So, I was wondering: are the segmentation algorithms in these
>> packages (e.g., snapCGH) able to handle any dataset, or are they
>> specific to aCGH data? If they are specific to aCGH data, would
>> anyone happen to
>> know where I can obtain code (or better yet, a package) for running
>> CBS on any data set?
>> Thank you for your time! I really appreciate it.
>> Christine Ho
>> Graduate student in Statistics at UC Berkeley
>> E-mail: cho at stat.berkeley.edu