[BioC] vsn preprocess with oligo package

James W. MacDonald jmacdon at uw.edu
Fri Jan 11 17:07:42 CET 2013


Hi Jose,

On 1/11/2013 10:31 AM, José LÓPEZ wrote:
> Dear Jim,
>
> Thank you for the advise on the background correction step.
> I already tryed before the whole Benilton's code but it doesn't work at the following step.
>
>> pm(raw)<- pmVSN
> Error: object 'pmVSN' not found
>
> May I ask you what this step is doing? Does it replace the pm matrix in the raw ExpressionFeatureSet by the normalized one?

Exactly. But the 'pm <-'  function expects to be fed a matrix, and the 
pmsVSN isn't a matrix. Instead, it is a 'vsn' object, which is related 
to an ExpressionSet object. So you can extract the matrix of normalized 
data as usual, with exprs():

pm(raw) <- exprs(pmsVSN)

and there was another error in my code, as summarize() won't work on a 
GeneFeatureSet object. Instead, you want to use rma():

eset <- rma(raw, normalize = FALSE, background = FALSE)

Best,

Jim



>
> Thank you in advance for your time and your kind help,
>
> Jose
>
>> library(limma)
>> library(oligo)
>> Data=read.celfiles(list.celfiles())
> Loading required package: pd.mogene.1.0.st.v1
> Loading required package: RSQLite
> Loading required package: DBI
> Platform design info loaded.
> Reading in : ABRNA1.CEL
> Reading in : ABRNA2.CEL
> Reading in : ABRNA3.CEL
> Reading in : ABRNA4.CEL
> Reading in : ABRNA5.CEL
> Reading in : ABRNA6.CEL
>> pms=pm(Data)
>> raw=backgroundCorrect(Data,"rma")
> Background correcting... OK
>> pms=pm(raw)
>> pmsVSN=vsn::vsnMatrix(pms)
> vsn2: 899636 x 6 matrix (1 stratum). Please use 'meanSdPlot' to verify the fit.
>> pm(raw)<- pmVSN
> Error: object 'pmVSN' not found
>> pm(raw)<- pmsVSN
> Error in function (classes, fdef, mtable)  :
>    unable to find an inherited method for function ‘pm<-’ for signature ‘"GeneFeatureSet", "missing", "missing", "vsn"’
>> pmsVSN
> vsn object for n=899636 features and d=6 samples.
> sigsq=0.1
> hx: 899636 x 6 matrix.
>> head(pms)
>       ABRNA1.CEL ABRNA2.CEL ABRNA3.CEL ABRNA4.CEL ABRNA5.CEL ABRNA6.CEL
> 2106   7.853059   7.635413   5.221570   5.160448   5.978796   5.767533
> 2107   5.990083   4.764031   4.659925   5.160448   7.067603   5.903628
> 2108   4.867173   4.587342   5.023095   4.762256   5.978796   6.045105
> 2109  10.657556   8.827158   7.679404   6.124607  11.523816   6.045105
> 2110  12.538396  13.906774   6.147958   5.860049  38.764281   6.843026
> 2111  10.241785   6.356910   4.836138   5.160448   7.067603   7.405633
>> class(pms)
> [1] "matrix"
>> ls()
> [1] "Data"   "pms"    "pmsVSN" "raw"
>
>> sessionInfo()
> R version 2.15.2 (2012-10-26)
> Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
>
> locale:
> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
>
> other attached packages:
> [1] pd.mogene.1.0.st.v1_3.8.0 RSQLite_0.11.2            DBI_0.2-5                 oligo_1.22.0
> [5] Biobase_2.18.0            oligoClasses_1.20.0       BiocGenerics_0.4.0        limma_3.14.3
>
> loaded via a namespace (and not attached):
>   [1] affxparser_1.30.0     affy_1.36.0           affyio_1.26.0         BiocInstaller_1.8.3   Biostrings_2.26.2
>   [6] bit_1.1-9             codetools_0.2-8       ff_2.2-10             foreach_1.4.0         GenomicRanges_1.10.5
> [11] grid_2.15.2           IRanges_1.16.4        iterators_1.0.6       lattice_0.20-10       parallel_2.15.2
> [16] preprocessCore_1.20.0 splines_2.15.2        stats4_2.15.2         vsn_3.26.0            zlibbioc_1.4.0
>
>
> *****************************************************
> José P. LÓPEZ-ATAYALA
> Instituto de Neurociencias
> CSIC - UMH
> Avda. D. Santiago Ramón y Cajal, S/N
> E-03550, Sant Joan d'Alacant
> Alicante, Spain
> jose.lopez at umh.es
> http://in.umh.es/grupos-detalle.aspx?grupo=30
> (34) 965 919 531
>
> El ene 11, 2013, a las 3:59 p.m., James W. MacDonald escribió:
>
>> Hi Jose,
>>
>> Let's say you followed Benilton's code from https://stat.ethz.ch/pipermail/bioconductor/2010-June/033936.html
>>
>> library(oligo)
>> cels = list.celfiles()
>> raw = read.celfiles(cels)
>> raw = backgroundCorrect(raw, "rma") ## I added this - you might want BG correction
>> pms = pm(raw)
>> pmsVSN = vsn::vsnMatrix(pms)
>> pm(raw)<- pmVSN
>> rm(pms, pmsVSN)
>>
>> you now have an ExpressionFeatureSet with normalized data that you want to summarize. You can then
>>
>> eset<- summarize(raw, method = "medianpolish")
>>
>> See
>>
>> ?summarizationMethods
>>
>> for more information.
>>
>> Best,
>>
>> Jim
>>
>>
>> On 1/11/2013 6:17 AM, José LÓPEZ wrote:
>>> Dear Benilton,
>>>
>>> I am using oligo for Mouse Gene 1.0ST arrays. In addition to RMA, I would also like to pre-process with vsn. I have seen previous threads related to this question in the past, (https://stat.ethz.ch/pipermail/bioconductor/2010-January/031100.html, https://stat.ethz.ch/pipermail/bioconductor/2010-June/033936.html), but unfortunately, I am not bioinformatician and, although I read oligo and vsn manuals, it is not easy to me to follow up to summarize the vsn object.
>>> May you (or someone else) please, give me some additional clue to sumarize the vsn object using the oligo package.
>>>
>>> Thank you very much in advance for your time and your kind help,
>>>
>>>   Jose LOPEZ
>>>
>>> **************************
>>>
>>>> list.files()
>>> [1] "ABRNA1.CEL"                          "ABRNA2.CEL"                          "ABRNA3.CEL"
>>> [4] "ABRNA4.CEL"                          "ABRNA5.CEL"                          "ABRNA6.CEL"
>>> [7] "Limma_FilterBefore_H2BGFP_jla_vsn.R"
>>>> library(limma)
>>>> library(oligo)
>>> Loading required package: BiocGenerics
>>>
>>> Attaching package: ‘BiocGenerics’
>>>
>>> The following object(s) are masked from ‘package:stats’:
>>>
>>>      xtabs
>>>
>>> The following object(s) are masked from ‘package:base’:
>>>
>>>      anyDuplicated, cbind, colnames, duplicated, eval, Filter, Find, get, intersect, lapply, Map, mapply,
>>>      mget, order, paste, pmax, pmax.int, pmin, pmin.int, Position, rbind, Reduce, rep.int, rownames, sapply,
>>>      setdiff, table, tapply, union, unique
>>>
>>> Loading required package: oligoClasses
>>> Loading package bit 1.1-9
>>> package:bit (c) 2008-2012 Jens Oehlschlaegel (GPL-2)
>>> creators: bit bitwhich
>>> coercion: as.logical as.integer as.bit as.bitwhich which
>>> operator: !&   | xor != ==
>>> querying: print length any all min max range sum summary
>>> bit access: length<- [ [<- [[ [[<-
>>> for more help type ?bit
>>> Loading package ff2.2-10
>>> - getOption("fftempdir")=="/var/folders/U+/U+SFMmqcEbKkSysJQ3OYbk+++TQ/-Tmp-//RtmpKgQzWD"
>>>
>>> - getOption("ffextension")=="ff"
>>>
>>> - getOption("ffdrop")==TRUE
>>>
>>> - getOption("fffinonexit")==TRUE
>>>
>>> - getOption("ffpagesize")==65536
>>>
>>> - getOption("ffcaching")=="mmnoflush"  -- consider "ffeachflush" if your system stalls on large writes
>>>
>>> - getOption("ffbatchbytes")==16777216 -- consider a different value for tuning your system
>>>
>>> - getOption("ffmaxbytes")==536870912 -- consider a different value for tuning your system
>>>
>>> Welcome to oligoClasses version 1.20.0
>>> Loading required package: Biobase
>>> Welcome to Bioconductor
>>>
>>>      Vignettes contain introductory material; view with 'browseVignettes()'. To cite Bioconductor, see
>>>      'citation("Biobase")', and for packages 'citation("pkgname")'.
>>>
>>> ========================================================================================================================
>>> Welcome to oligo version 1.22.0
>>> ========================================================================================================================
>>>
>>> Attaching package: ‘oligo’
>>>
>>> The following object(s) are masked from ‘package:limma’:
>>>
>>>      backgroundCorrect
>>>
>>>> Data=read.celfiles(list.celfiles())
>>> Loading required package: pd.mogene.1.0.st.v1
>>> Loading required package: RSQLite
>>> Loading required package: DBI
>>> Platform design info loaded.
>>> Reading in : ABRNA1.CEL
>>> Reading in : ABRNA2.CEL
>>> Reading in : ABRNA3.CEL
>>> Reading in : ABRNA4.CEL
>>> Reading in : ABRNA5.CEL
>>> Reading in : ABRNA6.CEL
>>>> pms=pm(Data)
>>>> class(pms)
>>> [1] "matrix"
>>>> head(pms)
>>>       ABRNA1.CEL ABRNA2.CEL ABRNA3.CEL ABRNA4.CEL ABRNA5.CEL ABRNA6.CEL
>>> 2106         46         43         36         36         37         33
>>> 2107         38         32         33         36         43         34
>>> 2108         31         31         35         34         37         35
>>> 2109         54         46         45         40         58         35
>>> 2110         58         55         40         39         94         40
>>> 2111         53         39         34         36         43         43
>>>> pmsVSN=vsn::vsnMatrix(pms)
>>> vsn2: 899636 x 6 matrix (1 stratum). Please use 'meanSdPlot' to verify the fit.
>>>> class(pmsVSN)
>>> [1] "vsn"
>>> attr(,"package")
>>> [1] "vsn"
>>>> pmsVSN
>>> vsn object for n=899636 features and d=6 samples.
>>> sigsq=0.026
>>> hx: 899636 x 6 matrix.
>>>> eset=rma(pmsVSN, background=FALSE,normalize=FALSE)
>>> Error in function (classes, fdef, mtable)  :
>>>    unable to find an inherited method for function ‘rma’ for signature ‘"vsn"’
>>>
>>>> library(vsn)
>>>> meanSdPlot(pmsVSN)
>>> KernSmooth 2.23 loaded
>>> Copyright M. P. Wand 1997-2009
>>>
>>>> sessionInfo()
>>> R version 2.15.2 (2012-10-26)
>>> Platform: i386-apple-darwin9.8.0/i386 (32-bit)
>>>
>>> locale:
>>> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
>>>
>>> attached base packages:
>>> [1] stats     graphics  grDevices utils     datasets  methods   base
>>>
>>> other attached packages:
>>> [1] vsn_3.26.0                pd.mogene.1.0.st.v1_3.8.0 RSQLite_0.11.2            DBI_0.2-5
>>> [5] oligo_1.22.0              Biobase_2.18.0            oligoClasses_1.20.0       BiocGenerics_0.4.0
>>> [9] limma_3.14.3
>>>
>>> loaded via a namespace (and not attached):
>>>   [1] affxparser_1.30.0     affy_1.36.0           affyio_1.26.0         BiocInstaller_1.8.3   Biostrings_2.26.2
>>>   [6] bit_1.1-9             codetools_0.2-8       ff_2.2-10             foreach_1.4.0         GenomicRanges_1.10.5
>>> [11] grid_2.15.2           IRanges_1.16.4        iterators_1.0.6       KernSmooth_2.23-8     lattice_0.20-10
>>> [16] parallel_2.15.2       preprocessCore_1.20.0 splines_2.15.2        stats4_2.15.2         tools_2.15.2
>>> [21] zlibbioc_1.4.0
>>>
>>>
>>> 	[[alternative HTML version deleted]]
>>>
>>>
>>>
>>> _______________________________________________
>>> Bioconductor mailing list
>>> Bioconductor at r-project.org
>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
>> -- 
>> James W. MacDonald, M.S.
>> Biostatistician
>> University of Washington
>> Environmental and Occupational Health Sciences
>> 4225 Roosevelt Way NE, # 100
>> Seattle WA 98105-6099
>>

-- 
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099



More information about the Bioconductor mailing list