[BioC] processing Illumina HT12v4.0 expression data from GEO

Abhishek Pratap abhishek.vit at gmail.com
Wed Jul 23 17:58:44 CEST 2014


Hi Sean

I downloaded and opened the files under raw data and, strangely enough,
two of them seem to be annotation for the Illumina probes.

http://www.ncbi.nlm.nih.gov/geo/download/?acc=GSE58037&format=file&file=GSE58037%5Fmeningiomas%2Eraw%2Etxt%2Egz
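
In case it helps anyone following along, here is a minimal sketch of
fetching and peeking at the supplementary files from R. It assumes
GEOquery's getGEOSuppFiles(), which as far as I know returns a data frame
whose row names are the downloaded file paths. fetch_supp() and peek_gz()
are hypothetical helper names, not part of any package, and the demo file
contents are made up.

```r
# Hypothetical helper: download all supplementary files for a GEO series.
# Needs the GEOquery package and network access, so it is only defined here.
fetch_supp <- function(acc, dest = tempdir()) {
  files <- GEOquery::getGEOSuppFiles(acc, baseDir = dest)
  rownames(files)  # assumed to be the paths of the downloaded files
}

# Hypothetical helper: read the first n lines of a gzipped text file.
peek_gz <- function(path, n = 5) {
  con <- gzfile(path, open = "rt")
  on.exit(close(con))
  readLines(con, n = n)
}

# Small offline demonstration with a made-up two-column file.
demo_path <- tempfile(fileext = ".txt.gz")
con <- gzfile(demo_path, open = "wt")
writeLines(c("ID_REF\tSAMPLE_1", "ILMN_0000001\t123.4"), con)
close(con)
demo_head <- peek_gz(demo_path, n = 1)
```

With network access, something like paths <- fetch_supp("GSE58037")
followed by peek_gz(paths[1]) should show whether a given file holds
intensities or only probe annotation.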

-A

On Wed, Jul 23, 2014 at 8:52 AM, Sean Davis <sdavis2 at mail.nih.gov> wrote:
> Hi, Abhi.
>
> It looks like the GSE record contains raw data.  However, you may need to
> write to the authors to confirm what was done to create the .txt files that
> are present in the raw data archives.
>
> Sean
>
>
>
> On Wed, Jul 23, 2014 at 11:43 AM, Abhishek Pratap <abhishek.vit at gmail.com> wrote:
>>
>> Hi Sean
>>
>> Thanks for the details. I was actually wondering whether I can get the
>> raw data so I can do my own normalization. For example, in Affy-based
>> GEO studies I normally see the raw CEL files as well, which can be used
>> with fRMA to produce normalized data.
>>
>> In this specific study
>> http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE58037 I don't see
>> raw data similar to the CEL files in Affy studies. I am just wondering
>> whether it is missing in this particular case or whether beadarray-based
>> studies don't tend to have raw data in GEO.
>>
>> Cheers!
>> -Abhi
>>
>> On Wed, Jul 23, 2014 at 3:40 AM, Sean Davis <sdavis2 at mail.nih.gov> wrote:
>> >
>> >
>> >
>> > On Thu, Jul 17, 2014 at 8:17 PM, Abhishek Pratap <abhishek.vit at gmail.com> wrote:
>> >>
>> >> Hi Guys
>> >>
>> >> I would like to know the basic analysis workflow for downloading and
>> >> processing Illumina HT12 expression data from GEO. I have seen the
>> >> beadarray vignette but am not sure which normalization process to use.
>> >>
>> >> For example, with Affy datasets I normally download the raw data and
>> >> normalize it with the fRMA package to produce a final gene-level
>> >> expression matrix.
>> >>
>> >> Here is some code; the final goal is to produce a normalized
>> >> expression matrix at the gene level.
>> >>
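>> >> library(GEOquery)          # also attaches Biobase, which provides exprs()
>> >> gse <- getGEO("GSE58037")  # list of ExpressionSets, one per platform
>> >> gse <- gse[[1]]            # keep the first ExpressionSet
>> >> mat <- exprs(gse)          # probe-level expression matrix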
>> >>
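
One way to get from a probe-level matrix like "mat" to the gene level is
to average the probes that map to the same gene symbol. A minimal base-R
sketch on made-up data follows. collapse_to_genes() is a hypothetical
helper, and for real data the probe-to-gene mapping would have to come
from the platform annotation (for example fData(gse)), which has not been
checked for this series.

```r
# Hypothetical helper: average rows of a probe-level matrix by gene symbol.
# Probes with a missing or empty symbol are dropped.
collapse_to_genes <- function(mat, symbols) {
  stopifnot(nrow(mat) == length(symbols))
  keep <- !is.na(symbols) & symbols != ""
  mat <- mat[keep, , drop = FALSE]
  symbols <- symbols[keep]
  sums <- rowsum(mat, group = symbols)                  # per-gene column sums
  counts <- as.vector(table(symbols)[rownames(sums)])   # probes per gene
  sums / counts                                         # per-gene column means
}

# Made-up example: three probes, two of them for the same gene.
toy <- matrix(c(2, 4, 10,
                6, 8, 20),
              nrow = 3,
              dimnames = list(c("p1", "p2", "p3"), c("s1", "s2")))
gene_mat <- collapse_to_genes(toy, c("GENE_A", "GENE_A", "GENE_B"))
```

Averaging is only one possible choice; keeping the most variable probe
per gene is another, and limma's avereps() offers a packaged way to
average rows that share an ID.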
>> >
>> > Hi, Abhi.
>> >
>> > The "mat" variable above will give you expression measures as submitted by
>> > the authors.  NCBI GEO provides a description:
>> >
>> > "The data were normalised using normal-exponential convolution model-based
>> > background correction and quantile normalization. Merging of the data,
>> > background removal and normalization processes were performed using the
>> > limma R package. All of the batches were normalized at once after excluding
>> > probes with low quality."
>> >
>> > If you do not want to use those normalized values, then you will need to
>> > define for yourself what the best approach is.  I don't know of an accepted
>> > "best" approach for such arrays.
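
The processing quoted from GEO sounds like what limma's neqc() function
does: normexp background correction using negative control probes, then
quantile normalization and a log2 transform. Whether the authors called
neqc() itself is an assumption. Below is a minimal sketch on simulated
intensities, not on the GSE58037 files; the probe counts and
distributions are made up, and real data would need control probe
intensities (or detection p-values), which may not be in the GEO record.

```r
library(limma)

set.seed(1)
n_regular <- 200   # made-up number of regular probes
n_negative <- 50   # made-up number of negative control probes
n_samples <- 4

# Simulated raw intensities: background noise plus signal for regular probes.
background <- matrix(rnorm((n_regular + n_negative) * n_samples,
                           mean = 100, sd = 10),
                     ncol = n_samples)
signal <- rbind(matrix(rexp(n_regular * n_samples, rate = 1 / 500),
                       ncol = n_samples),
                matrix(0, nrow = n_negative, ncol = n_samples))
raw <- background + signal
colnames(raw) <- paste0("sample", seq_len(n_samples))

# Probe type labels; "regular" and "negative" are neqc()'s default names.
status <- rep(c("regular", "negative"), times = c(n_regular, n_negative))

# Normexp background correction using the negative controls, quantile
# normalization across arrays, log2 transform; control probes are dropped.
norm <- neqc(raw, status = status)
```

For real Illumina summary files, limma's read.ilmn() is the usual way to
build the input object for neqc(), provided a control probe profile is
available alongside the sample probe profile.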
>> >
>> > Sean
>> >
>> >
>> >>
>> >> Appreciate any pointers
>> >>
>> >> Thanks!
>> >> -Abhi
>> >>
>> >> _______________________________________________
>> >> Bioconductor mailing list
>> >> Bioconductor at r-project.org
>> >> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> >> Search the archives:
>> >> http://news.gmane.org/gmane.science.biology.informatics.conductor
>> >
>> >
>>
>
>


