[BioC] getGEO - getting the .CEL files from GEO

Vincent Carey stvjc at channing.harvard.edu
Wed Mar 17 17:03:38 CET 2010


do you really want to put sample-characteristics data in a CEL file?

the sample characteristics are available as follows:

> library(GEOquery)
> ff = getGEO("GSE4045")
> table(pData(ff[[1]])$description)

conventional colorectal tumor, mucinous, Dukes Stage c, MSS, no cancer in the family, male, Distal Location , Tumor Grade 2
                                                           1
conventional colorectal tumor, non-mucinous, Dukes Stage b, MSS, no cancer in the family, female, Distal Location , Tumor Grade 2
                                                           1
conventional colorectal tumor, non-mucinous, Dukes Stage c, MSI, no cancer in the family, female, Proximal Location , Tumor Grade 3
                                                           1
....

and you will have to parse that 'description' field to extract stage
and other relevant information.  for example

de = as.character(ff[[1]]$description)
gr = gsub(".*, Tumor Grade.(.)$", "\\1", de)

gives you a single character string for grade, except for sample 14 --
where my regexp doesn't do as much as it should.
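
a slightly more defensive variant of the grade extraction, as a sketch only: gsub/sub return non-matching strings unchanged, so this version marks them as NA instead.  the helper name extract_grade is hypothetical, and the example strings are shortened stand-ins for the descriptions above plus one made-up non-matching entry.

```r
# Hypothetical helper: pull the tumor grade out of a description string.
extract_grade = function(de) {
  pat = ".*Tumor Grade[[:space:]]*([0-9]+)[[:space:]]*$"
  gr = sub(pat, "\\1", de)
  gr[!grepl(pat, de)] = NA   # strings that do not match become NA
  gr
}

# shortened stand-ins for the GEO descriptions; the last entry is made up
de = c("male, Distal Location , Tumor Grade 2",
       "female, Proximal Location , Tumor Grade 3",
       "no grade recorded")
extract_grade(de)
```

any NA in the result flags a sample whose description needs to be inspected by hand.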

such activities would be used to populate an annotated data frame
which could then serve as the phenoData component of an AffyBatch
instance, which is a typical container for CEL-based intensity data,
to be propagated downstream through background correction and
normalization and so forth.  The experimentData element should also be
suitably populated, as early in the workflow as possible.  If we look
closely enough we can find that the ExpressionSet returned by getGEO
has quantifications generated by MAS 5.0.
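
the workflow described above might look roughly like the following sketch.  it assumes that the CEL files have already been unpacked into a local directory (the path "GSE4045_cel" is hypothetical), that each file name contains the GSM accession of its sample, and that GEOquery and affy are installed.  the grade column reuses the regexp from earlier in this message, so it has the same limitation for sample 14.

```r
library(GEOquery)   # getGEO
library(affy)       # ReadAffy, list.celfiles; also loads Biobase

ff = getGEO("GSE4045")
es = ff[[1]]

# sample-level annotation parsed from the GEO description field
de = as.character(pData(es)$description)
pd = data.frame(description = de,
                grade = sub(".*, Tumor Grade.(.)$", "\\1", de),
                row.names = sampleNames(es),
                stringsAsFactors = FALSE)
pd = pd[order(rownames(pd)), ]

# hypothetical directory holding the unpacked CEL files
celdir = "GSE4045_cel"
celfiles = sort(list.celfiles(celdir, full.names = TRUE))

# assumption: one CEL file per sample, named after its GSM accession
stopifnot(length(celfiles) == nrow(pd))
ok = mapply(function(id, f) grepl(id, basename(f), fixed = TRUE),
            rownames(pd), celfiles)
stopifnot(all(ok))

# AffyBatch with the parsed annotation as its phenoData component
ab = ReadAffy(filenames = celfiles,
              sampleNames = rownames(pd),
              phenoData = AnnotatedDataFrame(pd))
table(pData(ab)$grade)
```

the two stopifnot checks are there because pairing annotation rows with CEL files by position alone is easy to get wrong.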

On Wed, Mar 17, 2010 at 11:27 AM, 張 語恬 <greengarden_0925 at hotmail.com> wrote:
>
>
> Hi:
>
> I've downloaded the GSE CEL files from GEO, but I am having trouble adding the individual characteristics, such as tumor site, age, gender, and so on, to the CEL files.
>
> I've read the thread "[BioC] getGEO - getting the .CEL files from GEO", but I still don't understand.
>
> Could you use GSE4045 as an example to demonstrate
> how to use exprs() (I found the instruction in the mailing list) to replace the GSE4045.SOFT data with the raw CEL microarray data while keeping the characteristics?
>
> Thanks,
> greengarden