[BioC] how to convert an AffyBatch into an ExpressionSet object?

James W. MacDonald jmacdon at med.umich.edu
Thu Aug 23 17:08:45 CEST 2007



Jenny Drnevich wrote:
>>> The reason I converted the probe-level data into probeset data was
>>> that I wanted to compare the data by looking at them directly in an
>>> Excel file, but I couldn't find out how to write the probe-level data
>>> of the AffyBatch object to an Excel file. So I tried the conversion.
>> What about the following ?
>> write.csv(exprs(abatch), file="myProbeLevelData.csv")
> 
> Be aware that you won't be able to use Excel to look at probe-level 
> data because Excel maxes out at ~65,000 rows (unless they've changed 
> something recently).

My understanding is that Excel 2007 has increased the number of rows, 
but I couldn't find anything with a google search to back that up.
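
As a rough check, here is a small base-R sketch. The two row limits are
assumptions on my part (the commonly cited 65,536 rows per worksheet for
Excel 2003 and earlier, and 1,048,576 for Excel 2007), and fits() is just
an illustrative helper, not a function from any package. The row counts
are taken from the output quoted further down in this thread.

```r
# Row limits per worksheet (assumed values; check them for your Excel version).
excel_row_limits <- c(excel_2003 = 65536, excel_2007 = 1048576)

# Probe-level rows on a 1002 x 1002 array, as in the AffyBatch summary
# quoted below (1004004 features).
n_probe_rows <- 1002 * 1002

# Probeset-level rows reported for the gcrma() output quoted below.
n_probeset_rows <- 14707

# Hypothetical helper: does a table with n_rows data rows plus one
# header row (as written by write.csv) fit under each limit?
fits <- function(n_rows, limits = excel_row_limits) (n_rows + 1) <= limits

fits(n_probe_rows)
fits(n_probeset_rows)
```

Under those assumed limits, the probe-level table fits only the larger
limit, while the probeset-level table fits both. With a real AffyBatch you
would use nrow(exprs(abatch)) instead of the hard-coded count.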

> 
>>> I also wanted to copy or convert the AffyBatch object directly into
>>> expression data, with some modification but without
>>> normalization.
>> Did you try this ?
>> help("expresso", package="affy")
>>
> 
> There are also functions you can use to just do background correction 
> or normalization without summarization, if you want to see what 
> effect they have. See ?bg.adjust.gcrma for gc-based background 
> correction, ?bg.correct for other bg correction methods, and 
> ?normalize.methods for normalization options.  expresso() is nice in 
> that it lets you mix-and-match between a variety of different bg 
> corrections, normalizations and summarization methods, and rma() and 
> gcrma() have arguments to turn off the background correction and/or 
> normalization.
> 
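
For what it's worth, a minimal sketch of the expresso() route with
background correction and normalization switched off, so that only the
summarization step is applied. It assumes the affy package is installed
and that 'abatch' is an AffyBatch; summarize_only is a hypothetical helper
name, and "pmonly" / "medianpolish" are just one possible choice of
PM-correction and summary methods.

```r
# Sketch only: requires the Bioconductor 'affy' package and an AffyBatch.
# 'summarize_only' is a hypothetical helper name, not part of affy.
summarize_only <- function(abatch) {
  if (!requireNamespace("affy", quietly = TRUE)) {
    stop("the 'affy' package is needed for this sketch")
  }
  affy::expresso(
    abatch,
    bg.correct = FALSE,           # skip background correction
    normalize = FALSE,            # skip normalization
    pmcorrect.method = "pmonly",  # use PM intensities only
    summary.method = "medianpolish"
  )
}
```

Calling summarize_only(abatch) should return a probeset-level
ExpressionSet, which can then be written out with
write.csv(exprs(...), file = ...) as suggested above.
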
> 
> Cheers,
> Jenny
> 
> 
>> Hoping this helps,
>>
>>
>>
>> Laurent
>>
>>
>>
>>> sincerely
>>> Minwook
>>>
>>> On 8/22/07, Jenny Drnevich <drnevich at uiuc.edu> wrote:
>>>> Hi Min Wook,
>>>>
>>>> The discrepancy is because you are comparing the probe-level data
>>>> from an AffyBatch object to the probeset-level data of an
>>>> ExpressionSet created by gcrma().  Why would you want to convert an
>>>> AffyBatch object directly into an ExpressionSet object without
>>>> summarizing the probe-level data into probeset data? The AffyBatch
>>>> object extends the ExpressionSet structure specifically for
>>>> probe-level data (individual PM and MM intensities), and typically
>>>> ExpressionSet objects are reserved for summarized probeset data. For
>>>> more details on the data classes, use ?"AffyBatch" and ?"ExpressionSet".
>>>>
>>>> Cheers,
>>>> Jenny
>>>>
>>>> At 11:42 AM 8/22/2007, Min Wook Kim wrote:
>>>>> Dear all,
>>>>> I tried to convert an AffyBatch object into an ExpressionSet object,
>>>>> but the two did not contain exactly the same information. Specifically,
>>>>> I compared the original data with the data modified by gcrma.
>>>>>
>>>>> The problem was that the assayData and featureData didn't match. Do I
>>>>> have to create new AssayData and featureData objects with the "new"
>>>>> command? Is there an easier way, e.g. a function that copies from
>>>>> one to the other?
>>>>>
>>>>> This is what I did:
>>>>>> abatch
>>>>> AffyBatch object
>>>>> size of arrays=1002x1002 features (8 kb)
>>>>> cdf=Mouse430_2 (45101 affyids)
>>>>> number of samples=4
>>>>> number of genes=45101
>>>>> annotation=mouse4302
>>>>> notes=
>>>>>> tmp <- new("ExpressionSet", phenoData = phenoData(abatch),
>>>>>     featureData = featureData(abatch),
>>>>>     experimentData = experimentData(abatch),
>>>>>     annotation = annotation(abatch), assayData = assayData(abatch))
>>>>>
>>>>>
>>>>> And what is the difference between the statement above, which sets
>>>>> assayData, and the following one, which sets exprs instead?
>>>>>
>>>>>> tmp <- new("ExpressionSet", phenoData = phenoData(abatch),
>>>>>     featureData = featureData(abatch),
>>>>>     experimentData = experimentData(abatch),
>>>>>     annotation = annotation(abatch), exprs = exprs(abatch))
>>>>>
>>>>> -------------------------------------------
>>>>> Finally, I want the following two objects to have the same
>>>>> structure, differing only in the values changed by gcrma. myRMA is
>>>>> the output of gcrma. Perhaps my attempt rests on a misunderstanding
>>>>> of these classes; if so, please tell me.
>>>>>
>>>>>
>>>>>> myRMA
>>>>> ExpressionSet (storageMode: lockedEnvironment)
>>>>> assayData: 14707 features, 4 samples
>>>>>   element names: exprs
>>>>> phenoData
>>>>>   sampleNames: HM1_24, HM1_25, Flt3_a, Flt3_b
>>>>>   varLabels and varMetadata:
>>>>>     sample: arbitrary numbering
>>>>>     pheno1: arbitrary numbering
>>>>> featureData
>>>>>   rowNames: 1415670_at, 1415671_at, ..., AFFX-TransRecMur/X57349_3_at
>>>>> (14707 total)
>>>>>   varLabels and varMetadata: none
>>>>> experimentData: use 'experimentData(object)'
>>>>> Annotation [1] "mouse4302"
>>>>>> tmp
>>>>> ExpressionSet (storageMode: lockedEnvironment)
>>>>> assayData: 1004004 features, 4 samples
>>>>>   element names: exprs
>>>>> phenoData
>>>>>   sampleNames: HM1_24, HM1_25, Flt3_a, Flt3_b
>>>>>   varLabels and varMetadata:
>>>>>     sample: arbitrary numbering
>>>>>     pheno1: arbitrary numbering
>>>>> featureData
>>>>>   featureNames: 1, 2, ..., 1004004 (1004004 total)
>>>>>   varLabels and varMetadata: none
>>>>> experimentData: use 'experimentData(object)'
>>>>> Annotation [1] "mouse4302"
>>>>> --------------------------------------------------------------------
>>>>>
>>>>> Additionally, I haven't been able to find a diagram describing the
>>>>> class hierarchy, especially for AffyBatch, ExpressionSet
>>>>> and eSet. If one exists, it would be very helpful.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>> sessionInfo()
>>>>> R version 2.5.1 (2007-06-27)
>>>>> powerpc64-unknown-linux-gnu
>>>>>
>>>>> locale:
>>>>> LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=en_US.UTF-8;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C
>>>>> attached base packages:
>>>>>   [1] "splines"   "tools"     "stats"     "graphics"  "grDevices" "datasets"
>>>>>   [7] "tcltk"     "utils"     "methods"   "base"
>>>>>
>>>>> other attached packages:
>>>>> mouse4302cdf          vsn       marray    tkWidgets      GOstats     Category
>>>>>     "1.16.0"      "2.2.0"     "1.14.0"     "1.14.0"      "2.2.6"      "2.2.3"
>>>>>       Matrix         RBGL        graph     multtest      annaffy         KEGG
>>>>> "0.999375-1"     "1.12.0"     "1.14.2"     "1.16.1"      "1.8.1"     "1.16.1"
>>>>>           GO        limma affyQCReport  geneplotter      lattice     annotate
>>>>>     "1.16.0"     "2.10.5"     "1.14.0"     "1.14.0"    "0.15-11"     "1.14.1"
>>>>> RColorBrewer      affyPLM        gcrma  matchprobes     affydata       xtable
>>>>>      "1.0-1"     "1.12.0"      "2.8.1"      "1.8.1"     "1.11.3"      "1.5-1"
>>>>>   simpleaffy   genefilter     survival         affy       affyio      Biobase
>>>>>    "2.10.31"     "1.14.1"       "2.32"     "1.14.2"      "1.4.1"     "1.14.1"
>>>>>       DynDoc  widgetTools
>>>>>     "1.14.0"     "1.12.0"
>>>>>
>>>>> _______________________________________________
>>>>> Bioconductor mailing list
>>>>> Bioconductor at stat.math.ethz.ch
>>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>>>> Search the archives:
>>>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>>> Jenny Drnevich, Ph.D.
>>>>
>>>> Functional Genomics Bioinformatics Specialist
>>>> W.M. Keck Center for Comparative and Functional Genomics
>>>> Roy J. Carver Biotechnology Center
>>>> University of Illinois, Urbana-Champaign
>>>>
>>>> 330 ERML
>>>> 1201 W. Gregory Dr.
>>>> Urbana, IL 61801
>>>> USA
>>>>
>>>> ph: 217-244-7355
>>>> fax: 217-265-5066
>>>> e-mail: drnevich at uiuc.edu
>>>>
>>>>
> 

-- 
James W. MacDonald, M.S.
Biostatistician
Affymetrix and cDNA Microarray Core
University of Michigan Cancer Center
1500 E. Medical Center Drive
7410 CCGC
Ann Arbor MI 48109
734-647-5623


