[BioC] GC-content sensitive normalization of Affymetrix tiling arrays for ChIP-chip

Sean Davis sdavis2 at mail.nih.gov
Fri Jul 18 14:09:18 CEST 2008

On Fri, Jul 18, 2008 at 5:32 AM, Christian Feller
<feller.christian at gmail.com> wrote:
> Dear Wolfgang,
> thank you for your reply!
> My goal is to compare my own ChIP-chip data (Nimblegen tiling) with some
> other ChIP-chip data (created on Affymetrix tiling). I normalized my data
> with vsn and got some nice signal-to-noise ratios (visual inspection,
> replicates show same trend). When I normalize with other algorithms (loess,
> quantile, Tukey-biweight) I get a similar output (based on visual inspection
> and correlation among them).
> Now, I normalized the Affymetrix data with vsn and got some terrible
> signal-to-noise ratios. One possible explanation might be the shorter probe
> sequence of the Affy probes compared to the Nimblegen probes. Fluorescence
> signals of shorter probes are more sensitive to the underlying sequence (in
> particular GC-content). Because vsn does not account for GC-content, I
> decided to try adjusting for it (hence my idea of using GCRMA).

I assume that the Nimblegen data are two-color? If so, I would imagine
that accounts for the vast majority of the differences you observe.
If not, then for single-color Nimblegen arrays I would expect GC
correction to be useful as well. However, such a correction probably
does not need to account for the base positions, only for the GC
count.
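
As a rough illustration of a correction that uses only the GC count, here
is a minimal sketch in Python. It is an assumption about what such an
adjustment could look like, not code from vsn, GCRMA, MAT, or tilingArray:
probes are grouped by the number of G and C bases in their sequence, and
the median log intensity of each group is subtracted from its members.
The function names gc_count and gc_adjust are hypothetical.

```python
from collections import defaultdict
from statistics import median


def gc_count(seq):
    """Number of G and C bases in a probe sequence (case-insensitive)."""
    s = seq.upper()
    return s.count("G") + s.count("C")


def gc_adjust(seqs, log_intensities):
    """Center log intensities within groups of probes sharing a GC count.

    Hypothetical helper: base positions are ignored, only the GC count
    of each probe is used to choose its reference group.
    """
    groups = defaultdict(list)
    for seq, y in zip(seqs, log_intensities):
        groups[gc_count(seq)].append(y)
    # One center (median) per observed GC count.
    centers = {k: median(v) for k, v in groups.items()}
    return [y - centers[gc_count(seq)]
            for seq, y in zip(seqs, log_intensities)]


# Toy input: five made-up probes with made-up log intensities.
probes = ["AAAA", "AATT", "GGCC", "GCGC", "ATGC"]
values = [1.0, 3.0, 5.0, 7.0, 4.0]
adjusted = gc_adjust(probes, values)
```

On real arrays the groups at extreme GC counts can be very small, so one
might pool neighboring counts or fit a smooth curve instead of taking
per-group medians.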


> I will try to use the normalizeByReference function and report back when it
> works.
> Thanks again!
> Best wishes,
> Christian
> -----Original Message-----
> From: Wolfgang Huber [mailto:huber at ebi.ac.uk]
> Sent: Friday, July 11, 2008 12:55 AM
> To: Christian Feller
> Cc: 'Sean Davis'; bourgon at ebi.ac.uk; bioconductor at stat.math.ethz.ch
> Subject: Re: [BioC] GC-content sensitive normalization of Affymetrix tiling
> arrays for ChIP-chip
> Dear Christian,
> few points:
> - As far as I understand, the background correction method of GC-RMA
> does not make use of probe sets; it works on individual probes. Probe
> sets only come into play later, for the expression estimate. But getting
> it to work for your use case may be a hard problem (has anyone on the
> list managed?).
> - vsn2 does not do probe-sequence specific adjustments, so I am not
> sure why it was mentioned in this context.
> - the choice of language should be secondary to these criteria: quality
> of the underlying science and of the implementation.
> - you say "how can I take into accound (sic) the GC-effect of single
> probes", but would it make sense to take a step back and tell us why you
> want to do that and what you want to achieve? Perhaps your answer is
> somewhere else.
> - the normalizeByReference function in the tilingArray package offers a
> method to do probe(sequence)-specific background correction for
> Affymetrix tiling array data, and is described in a paper [1], but I
> have only used it on RNA expression data, not on ChIP, so porting it to
> that application would need some care.
> [1] http://bioinformatics.oxfordjournals.org/cgi/reprint/22/16/1963.pdf
>  Best wishes
>        Wolfgang
> Christian Feller wrote:
>> Hi Sean,
>> Thank you for your quick response! We successfully used MAT under Python
>> for a dataset with 3 control arrays (hybridized with input) and 3 IP arrays
>> (all biological replicates). In comparison with vsn2, probe standardization
>> via MAT significantly increased the signal-to-noise ratio. However, we
>> still have some doubts about the reliability of those results since the raw
>> data seem to be very noisy, and the correlation of the biological
>> replicates is not very strong.
>> Thanks again!
>> Best
>> Christian
>> -----Original Message-----
>> From: seandavi at gmail.com [mailto:seandavi at gmail.com] On Behalf Of Sean Davis
>> Sent: Wednesday, July 09, 2008 2:04 AM
>> To: Christian Feller
>> Cc: bioconductor at stat.math.ethz.ch; bourgon at ebi.ac.uk
>> Subject: Re: [BioC] GC-content sensitive normalization of Affymetrix
>> tiling arrays for ChIP-chip
>> On Tue, Jul 8, 2008 at 6:58 PM, Christian Feller
>> <feller.christian at gmail.com> wrote:
>>> Dear Richard Bourgon and list,
>>> I am a newbie in analyzing ChIP-chip Affymetrix tiling arrays (GeneChip
>>> Drosophila Tiling 1.0R Array).
>>> My question is how can I take into accound the GC-effect of single probes
>>> if I do not have expression sets (due to the nature of a tiling array)? We
>>> had the idea of taking a fixed window size, defining the probes within
>>> each window as a "probeset", and using GCRMA for background
>>> correction/normalization. In addition, can we use this configuration
>>> (normalization via GCRMA) for profiles with broad ChIP-enriched regions
>>> (as is the case for many histone modifications)?
>>> If there is any additional advice, especially for the pre-processing
>>> steps, I would be very happy!
>>> Until now, we do the normalization using vsn2.
>> Hi, Christian.  Do you have the input DNA from which you are going to
>> form a ratio, or are you attempting to do a single-channel analysis?
>> If the latter, then you might look at MAT from Shirley Liu's group.  I
>> don't think it is available for R, but the algorithm could probably be
>> coded in R relatively easily.  There are likely other solutions.
>> Sean
