[BioC] Affy vs. cDNA : Low and not expressed genes

Laurent Gautier laurent at cbs.dtu.dk
Wed Jun 4 06:56:40 MEST 2003


On Tue, Jun 03, 2003 at 06:35:54PM +0800, Adaikalavan Ramasamy wrote:
> Dear all,
> 
> Thank you for the very interesting discussion on the topic of
> "replicates and low expression levels" in the last few days. I am facing
> a related problem regarding normalization and would appreciate any
> advice.
> 
> A small time course experiment was done on blood macrophages and
> hybridized to affymetrix chip HGU-133A. There were 3 replicates and 3
> time points (0, 2, 48 hour).
> 
> The main problem is that at time point 0 hour, there are 95 % Absent
> calls. The percentage of Absent calls decreases to 70% at 2 hours and 20%
> at 2 days. Initially I assumed that there was some physical problem with
> the array. But later I was corrected by the biologists that it was
> expected as many genes are not expressed in blood macrophages. Thus most
> of the 95 % Absent were due to not expressed genes ... Apparently this
> is common in developmental biology.
> 
> My first question is how does one normalize this kind of data ? The
> assumption in two-colour cDNA data of "most of the genes are not
> differently expressed" does not hold here. Median normalization would
> not be meaningful in this scenario.

One strategy I have been using goes like this:
- normalize the replicates from each time point independently
(with the affy package, use 'split.AffyBatch' and 'normalize'). Use whichever
normalization method you prefer; I would be tempted to use a quantile-based
one.
- merge the 3 normalized sets of chips (with the affy package, use
'merge.AffyBatch') and look at the distribution of the intensities (with the
affy package, something like 'hist(myaffybatch)' should do the job). I would
advise using the parameter 'col' in the function call to color the densities
according to the time point they belong to. If you are lucky, the "leftmost
mode" of each density will be at about the same location and you can go on.
If you are a bit less lucky, you will have to use normalize with the method
"constant" to bring the "leftmost" modes to the same location (use the
optional parameter 'FUN' to supply a function that returns that mode for each
chip). This should do it.
(note: of course this will probably give you *a lot* of false positives.
A constrained non-linear transformation would perform better.
I explored that a bit... I hope to put things together and come up with
some code for BioC... sometime.)
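
To make the two steps above concrete, here is a rough sketch in plain Python
(standard library only). It is not the affy code: the function names, the
histogram-based mode finder, the bin width and the toy intensities are all
made up for illustration, and I am assuming that the "constant" method rescales
each chip by a multiplicative factor so that FUN(chip) matches the reference.

```python
import math


def quantile_normalize(chips):
    """Give every chip the same intensity distribution (ties broken by position)."""
    n = len(chips[0])
    sorted_chips = [sorted(chip) for chip in chips]
    # Reference distribution: average of the k-th smallest value over all chips.
    reference = [sum(s[k] for s in sorted_chips) / len(chips) for k in range(n)]
    result = []
    for chip in chips:
        order = sorted(range(n), key=lambda i: chip[i])
        normalized = [0.0] * n
        for rank, i in enumerate(order):
            normalized[i] = reference[rank]
        result.append(normalized)
    return result


def leftmost_mode(chip, bin_width=0.5, min_fraction=0.05):
    """Centre (on the raw scale) of the leftmost local peak of the log2 histogram.

    Hypothetical helper: a crude stand-in for reading the mode off a density plot.
    """
    counts = {}
    for x in chip:
        b = math.floor(math.log2(x) / bin_width)
        counts[b] = counts.get(b, 0) + 1
    threshold = min_fraction * len(chip)
    for b in range(min(counts), max(counts) + 1):
        c = counts.get(b, 0)
        if (c >= threshold and c >= counts.get(b - 1, 0)
                and c >= counts.get(b + 1, 0)):
            return 2 ** ((b + 0.5) * bin_width)
    # Fallback: no bin passed the threshold, so use the most populated bin.
    best = max(counts, key=counts.get)
    return 2 ** ((best + 0.5) * bin_width)


def align_leftmost_modes(chips, reference_index=0):
    """Rescale each chip so its leftmost mode matches the reference chip's."""
    modes = [leftmost_mode(chip) for chip in chips]
    target = modes[reference_index]
    return [[x * (target / m) for x in chip] for chip, m in zip(chips, modes)]


if __name__ == "__main__":
    # Made-up intensities: two replicates at each of two time points.
    groups = {
        "0h": [[70, 75, 72, 80, 1200, 74], [68, 77, 71, 83, 1100, 73]],
        "48h": [[150, 160, 1500, 1800, 155, 2100],
                [148, 158, 1400, 1900, 152, 2000]],
    }
    # Step 1: normalize the replicates within each time point independently.
    merged = []
    for label, chips in groups.items():
        merged.extend(quantile_normalize(chips))
    # Step 2: merge, then bring the leftmost modes to the same location.
    aligned = align_leftmost_modes(merged)
    print([round(leftmost_mode(chip), 2) for chip in aligned])
```

The point of aligning on the leftmost mode rather than on the median is that
the leftmost mode tracks the not-expressed / background probes, which are
present on every chip, while the median moves with the fraction of expressed
genes.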

> 
> We then explored the possibility of using housekeeping genes for
> normalization. But it seems that the 100 housekeeping genes for HGU-133A
> are standard and not specified for our experiment. This is because only
> 28 of these 100 genes are expressed throughout all time points.
> 

I remember having a really hard time with housekeeping genes and cDNA.
I cannot speak for Affymetrix arrays, but I would be very careful about using
them for normalization purposes.

> 
> The biologists have decided to redo the experiment and I think
> they are more likely to hear our advice BEFORE doing the experiments. My
> second question is this: Will 2-colour cDNA with UHR as reference
> overcome this problem ?
> 
> Now I would expect to see most of the un-expressed genes (and previously
> Absent in affy) to have very negative log ratio values. But I don't
> think the assumption of "most genes are not differentially expressed"
> will hold again. And how does one deal with this ... 

Give the suggestion above a go and tell us whether it makes sense in your case.

> 
> My last question is has anyone done a comparison of Affymetrix to cDNA
> results/efficiency/advantages. I am interested in quantifying the
> benefits of spending 5 times as much money on something that has
> typically 40% absent calls. Thank you very much in advance.
> 
> Regards, Adai.
> 
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://www.stat.math.ethz.ch/mailman/listinfo/bioconductor

-- 
--------------------------------------------------------------
currently at the National Yang-Ming University in Taipei, Taiwan
--------------------------------------------------------------
Laurent Gautier			CBS, Building 208, DTU
PhD. Student			DK-2800 Lyngby,Denmark	
tel: +45 45 25 24 89		http://www.cbs.dtu.dk/laurent


