[BioC] Using limma to analyze GEO datasets/series from two-channel experiments

Gordon K Smyth smyth at wehi.EDU.AU
Mon Oct 5 00:31:19 CEST 2009


Dear Ana,

Your code should work already.  Analysing log-ratios is the normal way to 
handle two-colour arrays.  Please see the User's Guide which has plenty of 
examples.
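
As a minimal sketch of what that looks like in practice (an illustration, not code from the User's Guide): the log-ratio matrix is passed directly to lmFit together with the design from modelMatrix. The matrix M below is simulated and only stands in for exprs(gse[[1]]); the gene names and the seed are hypothetical. modelMatrix treats each column as log2(Cy5/Cy3), so this assumes that GEO's Ch2/Ch1 ratio corresponds to Cy5/Cy3, which should be checked against the GSM records, since it determines the sign of the estimated effect.

```r
library(limma)

# Targets as in the quoted message: one row per array, with Cy3 and Cy5 columns.
targets <- data.frame(
  SlideNumber = c("GSM65523", "GSM65567"),
  Cy3 = c("noHS", "noHS"),
  Cy5 = c("HS", "HS"),
  stringsAsFactors = FALSE
)

# Simulated log-ratios (assumption): placeholder for the real exprs(gse[[1]]) matrix.
set.seed(1)
M <- matrix(
  rnorm(200), nrow = 100, ncol = 2,
  dimnames = list(paste0("gene", 1:100), targets$SlideNumber)
)

# One coefficient, "HS", measured relative to the reference "noHS".
design <- modelMatrix(targets, ref = "noHS")

# Fit the linear model to the log-ratios and moderate the t-statistics.
fit <- lmFit(M, design)
fit <- eBayes(fit)

# Genes ranked by evidence for an HS vs noHS difference.
topTable(fit, coef = "HS")
```

With only two arrays in the same dye orientation, the HS coefficient is simply the average log-ratio per gene, and any dye bias cannot be separated from the treatment effect.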

Best wishes
Gordon

> Date: Fri, 2 Oct 2009 15:00:08 -0700
> From: Ana Rodrigues <arodrigues at salk.edu>
> Subject: [BioC] Using limma to analyze GEO datasets/series from
>  two-channel experiments
> To: bioconductor at stat.math.ethz.ch
> Content-Type: text/plain; charset=UTF-8
>
> Dear Gurus,
> I am attempting to analyze a bunch of microarray experiments from the
> GEO database.
>
> I have experience with Affymetrix chips, so it was reasonably simple
> to download the datasets/series of interest, retrieve the relevant
> columns from the GSM files (figure out whether they were normalized,
> logged, etc), and perform the comparisons I need using limma.
>
> Now I am struggling to do the same for other platforms, in particular,
> two-color platforms.
> The first few such experiments I have looked at look reasonably
> simple.  However, I haven't been able to figure out how to obtain a
> data structure that lmFit can use from the GSM files.
>
> I decided to try the GEOquery package to interface with GEO.
>
> gse <- getGEO("GSE2998")
> exprs <- exprs(gse[[1]])
>
> The exprs matrix now contains the VALUE column from each GSM file,
> which in this particular case is "The log2-transformed ratio of the
> Lowess-normalized fluorescence values (Ch2/Ch1) exported from
> GeneTraffic".
>
> For one of the comparisons that I am interested in, there are two
> chips of relevance.
> GSM65523, with treated Cy3 and untreated Cy5
> GSM65567, with treated Cy3 and untreated Cy5
>
> I thought that the best way to compare treated to untreated would be
> something like:
> targets <- matrix(c("GSM65523", "noHS", "HS",
>                    "GSM65567", "noHS", "HS"), ncol=3, byrow=TRUE,
>                  dimnames=list(NULL, c("SlideNumber", "Cy3", "Cy5")))
> design <- modelMatrix(targets, ref="noHS")
> lmFit(exprs, design)
>
> But, of course, exprs doesn't contain any channel info, just the log
> ratio between the channels.
> Should I be retrieving different columns from the GSM files? How can I
> build a data structure from that data that lmFit can use? Is there a
> better way to do simple comparisons of two-channel GEO datasets?
>
> Thank you so much for any help you can provide!
> Best,
> Ana


