[BioC] GEO and Limma

Ovokeraye Achinike-Oduaran ovokeraye at gmail.com
Tue Oct 4 20:04:08 CEST 2011


Thanks a bunch, Johannes and Sean. I'll try the GSE data again as soon
as I  get in front of a comp. About the first question on creating a
model matrix  for a 3x2 factorial dataset like gds3715, any ideas?

Thanks again.

Avoks

On 10/4/11, Sean Davis <sdavis2 at mail.nih.gov> wrote:
> On Tue, Oct 4, 2011 at 10:28 AM, Ovokeraye Achinike-Oduaran
> <ovokeraye at gmail.com> wrote:
>> Hi all,
>>
>> I've been struggling a bit with GEO and limma. I've come up with 2
>> questions that I'ld appreciate some input on.
>>
>> Question 1:
>> I have a design matrix with 6columns for an expression set with 3
>> columns(3 levels, 2 agents). How do I create a model matrix to fit my
>> design matrix for this dataset (eg. gds3715 below)? For a straight
>> forward single factor analysis eg. gds161, the code below seems to
>> work fine.
>>
>> gds161dat = getGEO('GDS161',destdir=".")
>> gds161eset = GDS2eSet(gds161dat, do.log2=TRUE)
>> m = pData(gds161eset)$metabolism
>> design_gds161 = createDesignMatrix(gds161eset)
>> design_gds161 = model.matrix(~m)
>> fit = lmFit(gds161eset, design_gds161)
>> fit2 = eBayes(fit)
>> results = topTable(fit2, adjust ="BH", number = nrow(gds161eset))
>> excel = write.table(results, file = file.choose(), sep = ",")
>>
>> gds3715dat = getGEO('GDS3715',destdir=".")
>> gds3715eset = GDS2eSet(gds3715dat, do.log2=TRUE)
>> DIR = paste(pData(gds3715eset)$disease.state,
>> pData(gds3715eset)$agent, sep =".")
>> m = data.frame("fac1" = pData(gds3715eset)$disease.state, "fac2" =
>> pData(gds3715eset)$agent)
>> design_gds3715 = createDesignMatrix(gds3715eset)
>> design_gds3715 = model.matrix(~m)
>> fit = lmFit(gds3715eset, design_gds3715)
>> fit2 = eBayes(fit)
>> results = topTable(fit2, adjust ="BH", number = nrow(gds3715eset))
>> excel = write.table(results, file = file.choose(), sep = ",")
>>
>> Question 2:
>> How can I run a similar (limma) analysis with GSE files?
>> I can hardly figure out how to get from the expression set to the
>> analysis...I've read the Using the GEOquery Package documentation
>> several times over. I'm just have a hard time getting it to work.
>>
>> gse25724dat = getGEO('GSE25724', GSEMatrix = TRUE)
>
> Hi, Avoks.  The return value from this call is a LIST of
> ExpressionSets, not a single ExpressionSet.  You probably want the
> first ExpressionSet from the list and can get that:
>
> gse25724eSet = gse25724dat[[1]]
>
> GEOquery works this way because many GSEs (so-called superseries) have
> multiple series included within them, so returning a list is necessary
> to be general.
>
> Sean
>
>
>> This is supposed to give me an expressionset, I can't seem to figure
>> out how to analyze it with limma.
>>
>> Please help.
>>
>> Thanks.
>>
>> Avoks
>>
>> sessionInfo()
>> R version 2.13.1 (2011-07-08)
>> Platform: i386-pc-mingw32/i386 (32-bit)
>>
>> locale:
>> [1] LC_COLLATE=English_xxx. LC_CTYPE=English_xxx
>> [3] LC_MONETARY=English_xxx LC_NUMERIC=C
>> [5] LC_TIME=English_xxx
>>
>> attached base packages:
>> [1] stats     graphics  grDevices utils     datasets  methods   base
>>
>> other attached packages:
>> [1] puma_2.4.0      mclust_3.4.10   affy_1.30.0     limma_3.8.3
>> [5] GEOquery_2.19.4 Biobase_2.12.2
>>
>> loaded via a namespace (and not attached):
>> [1] affyio_1.20.0         preprocessCore_1.14.0 RCurl_1.6-10.1
>> [4] tools_2.13.1          XML_3.4-2.2
>>
>> _______________________________________________
>> Bioconductor mailing list
>> Bioconductor at r-project.org
>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> Search the archives:
>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>
>



More information about the Bioconductor mailing list