[BioC] Design and normalization of focused arrays

Gordon Smyth smyth at wehi.edu.au
Thu Sep 15 02:18:16 CEST 2005


A reference on the titration series has been added to the limma User's 
Guide at

   http://bioinf.wehi.edu.au/limma/usersguide.pdf

I worry about using spike-in controls, such as Amersham's, for normalization 
because the spike-in mixtures have to be added separately to the 
experimental RNA samples. This means that the ratio of material in the 
test and reference spike-in mixtures cannot be completely controlled, and 
it will not in general be exactly the same as the ratio of RNA in the Cy3 
and Cy5 experimental samples. In all the examples I've seen, this has the 
effect of moving the spike-in calibration controls artificially up or down 
in the MA plot relative to the other spots. I find the titration series 
more reliable because it does not require adding anything to your 
experimental samples.

For normalizing with a titration series, I like to use weight=1 for the 
titration series and a very small weight, say weight=0.01, for the other 
spots, rather than a weight of exactly zero. This just avoids any 
unexpected numerical instability. An alternative is to use 
normalizeWithinArrays() with method="control", introduced in limma 2.0.6.
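
As a rough sketch of the weighting scheme described above, the weight 
matrix could be built along the following lines. The status labels, the 
number of spots and the number of arrays are made up for illustration; in 
a real analysis the status vector would normally come from your own spot 
annotation (for example RG$genes$Status, assuming you have set that up). 
The limma calls are shown as comments only, because they need an RGList 
called RG whose rows match the status vector.

```r
# Hypothetical spot annotation: 4 titration-series spots, 8 genes of interest.
status <- c(rep("Titration", 4), rep("Gene", 8))
n.arrays <- 2   # assumed number of arrays in the experiment

# weight = 1 for titration spots, weight = 0.01 for all other spots.
# The same spot weights are repeated in every column (one column per array).
w <- matrix(ifelse(status == "Titration", 1, 0.01),
            nrow = length(status), ncol = n.arrays)

# With limma loaded and an RGList 'RG' with one row per entry of 'status':
#   MA <- normalizeWithinArrays(RG, method = "loess", weights = w)
# Alternative that fits the curve through the control spots only:
#   MA <- normalizeWithinArrays(RG, method = "control",
#                               controlspots = (status == "Titration"))
```

Global loess is used in the first call only as an example; whether 
print-tip loess is sensible depends on how many spots you have per 
print-tip group.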

Gordon

At 08:00 PM 14/09/2005, bioconductor-request at stat.math.ethz.ch wrote:
>Date: Fri, 02 Sep 2005 15:02:01 +0200
>From: Klemens Vierlinger <klemens.vierlinger at arcs.ac.at>
>Subject: Re: [BioC] Design and normalization of focused arrays
>To: bioconductor at stat.math.ethz.ch
>Message-ID: <43184D49.3060408 at arcs.ac.at>
>Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>
>Hi Vered,
>
>Our approach seems to be quite similar to yours: we have small arrays 
>(100-600 probes) of 65mer oligos. This is what we do:
>
>We have Amersham's Scorecard controls on our arrays 
>(http://www5.amershambiosciences.com/aptrix/upp01077.nsf/Content/Products?OpenDocument&parentid=63004286&moduleid=165076). 
>This saves us the laborious process of cloning and ivt of bacterial genes. 
>However, the probes included in the set are cDNAs, so you will need to get 
>70mer probes synthesised. Amersham has licensed this out to a company 
>called tib-molbiol (www.tib-molbiol.de/). Unfortunately their pricing is 
>outrageous, but you can't buy them anywhere else (at least not as far as I 
>know; please let me know if you find them somewhere cheaper). They seem to 
>work well: in an experiment where we hybridised PBMCs against a tumor 
>cell line we found more of the expected genes differentially expressed 
>when normalising via those controls compared to normalising via 
>housekeeping genes. When we do a self-self hybridisation, the controls 
>behave much like the other spots on the array. This, to me, seems a good 
>indication that they do what they are supposed to do.
>
>However, via a QC experiment we found out that the control sequences are 
>rather AT-rich. All the other probes on the array are designed to be 
>~50% GC (which I guess is what most people do), and I don't particularly 
>like the idea that the control sequences differ from the rest of the 
>array. This doesn't necessarily have to mean anything, but I am thinking of 
>adding another control kit, Stratagene's SpotReport 
>(http://www.stratagene.com/products/showCategory.aspx?catId=17), as a 
>control for the controls (am I paranoid or what :)). At the end of the day 
>there is no way you can find out where your baseline really is. All you 
>can do is try to make the fewest false assumptions!
>
>For normalisation I use the limma functions and the weights as you 
>describe them.
>
>all the best
>Klemens
>
>
>
>
>Date: Thu, 1 Sep 2005 12:15:22 +0300
>From: "Vered Caspi" <veredcc at bgumail.bgu.ac.il>
>Subject: [BioC] Design and normalization of focused arrays
>To: <bioconductor at stat.math.ethz.ch>
>Message-ID: <001301c5aed5$ae54d770$32594884 at veredcc>
>Content-Type: text/plain
>
>Hello,
>
>I am currently designing a focused human long-oligo (70 mer) array of 150 
>genes, and I am wondering which control spots should be added to the array 
>to assist normalization, how many spots will be enough, and then how to do 
>the normalization.
>
>Here is some information I already gathered:
>
>In a paper by van de Peppel et al. from 2003 
>(http://www.nature.com/cgi-taf/DynaPage.taf?file=/embor/journal/v4/n4/full/embor798.html) 
>the authors used 9 bacterial control RNAs of different concentrations, and 
>printed at least 2 replicates on each array subgrid. Do you think 9 
>concentrations are enough?
>
>In the limma User's Guide, p.15, it is recommended for focused arrays "to 
>include on the arrays a series of non-differentially expressed control 
>spots, such as a titration series of whole-library-pool spots, and to use 
>the up-weighting method". I don't understand the meaning of "a 
>titration series of whole-library-pool spots", and would appreciate any 
>further details or references on its preparation so I can pass them on to 
>the experimentalists. Also, will it be OK, in limma, to give positive 
>weight to the control spots only, and weights of 0 to the 150 genes of study?
>
>I would appreciate any further ideas on which controls to include, how many 
>would be enough, and references, if available. Frankly, I am rather a 
>beginner with spotted arrays and with R, but so far I have used limma 
>successfully for several spotted array analyses. Therefore, your advice on 
>normalization of these arrays would also be highly appreciated.
>
>With best regards,
>                              Vered
>______________________________________________________________
>Vered Caspi, Ph.D.
>Bioinformatics Support Unit, Head
>National Institute for Biotechnology in the Negev
>Building 39, room 214
>Ben-Gurion University of the Negev
>Beer-Sheva 84105, Israel
>
>veredcc at bgumail.bgu.ac.il
>Tel: 08-6479034 054-7915969
>Fax: 08-6472983
>
>http://bioinfo.bgu.ac.il


