[BioC] justRMA?

James W. MacDonald jmacdon at uw.edu
Tue Dec 11 21:18:11 CET 2012


Hi Bhargavi,

By sessionInfo(), I meant for you to type

sessionInfo()

at an R prompt, which will say which version of packages you have.

Anyway, the problem lies with the primeviewhsrefseqcdf package you are 
using. Running length(ls(primeviewhsrefseqcdf)) tells you how many 
probesets that cdf package defines. You get 20367 on Linux, so the copy 
installed there is evidently not the same version as the one on your 
Windows machine, and that is why you get a different number of rows on 
Linux. If I do the same on Windows I get this:

 > length(ls(primeviewhsrefseqcdf))
[1] 34901

And then if I do that on Linux, I get

 > length(ls(primeviewhsrefseqcdf))
[1] 34901

The two obviously agree. Here is my sessionInfo(), which shows the 
versions I am using and therefore what you need to upgrade to.

 > sessionInfo()
R version 2.15.1 (2012-06-22)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=C                 LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] primeviewhsrefseqcdf_16.0.0 AnnotationDbi_1.20.3
[3] Biobase_2.18.0              BiocGenerics_0.4.0

loaded via a namespace (and not attached):
[1] DBI_0.2-5       IRanges_1.16.4  parallel_2.15.1 RSQLite_0.11.2
[5] stats4_2.15.1   tools_2.15.1
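
To compare what is installed on a given machine against the version 
listed above, something along these lines should work. This is only a 
sketch: the helper name needs_upgrade is made up for illustration (it is 
not part of affy or of any cdf package), and 16.0.0 is simply the 
version shown in the sessionInfo() output above.

```r
# Sketch only: 'needs_upgrade' is a hypothetical helper, not part of affy
# or of any cdf package.
needs_upgrade <- function(pkg, wanted) {
  # names of the packages installed in the libraries R searches
  installed <- rownames(installed.packages())
  if (!(pkg %in% installed)) {
    return(NA)  # not installed at all, so there is nothing to compare
  }
  have <- as.character(packageVersion(pkg))
  # compareVersion() returns -1 when 'have' is older than 'wanted'
  compareVersion(have, wanted) < 0
}

# 16.0.0 is the version listed in the sessionInfo() output above
needs_upgrade("primeviewhsrefseqcdf", "16.0.0")
```

If that returns TRUE (or NA), reinstall the cdf package on that machine 
from the same source you used on Windows, and then check 
length(ls(primeviewhsrefseqcdf)) again.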

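In case it is not clear why ls() counts probesets: a cdf package is 
essentially an R environment with one entry per probeset, so ls() 
returns the probeset names. The toy below uses a made-up environment 
(toycdf, with invented probeset names and indices) rather than a real 
cdf package, just to show the idea.

```r
# Toy stand-in for a cdf environment; the names and indices are invented.
toycdf <- new.env()
for (id in c("probeset_A", "probeset_B", "probeset_C")) {
  # each entry holds probe indices, here a small matrix with pm/mm columns
  assign(id, cbind(pm = 1:3, mm = 4:6), envir = toycdf)
}

ls(toycdf)          # the probeset names
length(ls(toycdf))  # the number of probesets in this environment
```

A cdf environment with fewer entries should therefore give 
correspondingly fewer rows in the justRMA() output.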

Best,

Jim





On 12/11/2012 3:09 PM, Bhargavi Duvvuri wrote:
> Hello James,
>
> Here is the session info after I run justRMA.
>
> R version 2.15.0 (2012-03-30)
> Copyright (C) 2012 The R Foundation for Statistical Computing
> ISBN 3-900051-07-0
> Platform: x86_64-unknown-linux-gnu (64-bit)
>
> R is free software and comes with ABSOLUTELY NO WARRANTY.
> You are welcome to redistribute it under certain conditions.
> Type 'license()' or 'licence()' for distribution details.
>
> R is a collaborative project with many contributors.
> Type 'contributors()' for more information and
> 'citation()' on how to cite R or R packages in publications.
>
> Type 'demo()' for some demos, 'help()' for on-line help, or
> 'help.start()' for an HTML browser interface to help.
> Type 'q()' to quit R.
>
> [Previously saved workspace restored]
>
> > library(affy)
> Loading required package: BiocGenerics
>
> Attaching package: 'BiocGenerics'
>
> The following object(s) are masked from 'package:stats':
>
>     xtabs
>
> The following object(s) are masked from 'package:base':
>
>     Filter, Find, Map, Position, Reduce, anyDuplicated, cbind,
>     colnames, duplicated, eval, get, intersect, lapply, mapply, mget,
>     order, paste, pmax, pmax.int, pmin, pmin.int, rbind, rep.int,
>     rownames, sapply, setdiff, table, tapply, union, unique
>
> Loading required package: Biobase
> Welcome to Bioconductor
>
>     Vignettes contain introductory material; view with
>     'browseVignettes()'. To cite Bioconductor, see
>     'citation("Biobase")', and for packages 'citation("pkgname")'.
>
> >    setwd('/home/run/testcel')
> >     expr.vals <- justRMA(cdfname = "PrimeViewHsREFSEQcdf")
> Loading required package: AnnotationDbi
>
> >     write.exprs(expr.vals, file= 'Customoutput.csv', sep=",", 
> row.names=F, col.names=T, quote=F)
> >
> > proc.time()
>    user  system elapsed
> 399.416  21.025 440.413
>
> With length(ls(primeviewhsrefseqcdf)) I get:
>
>  length(ls(primeviewhsrefseqcdf))
> [1] 20367
>
>
> Thank you
>
> On Tue, Dec 11, 2012 at 12:44 PM, James W. MacDonald
> <jmacdon at uw.edu> wrote:
>
>     Hi Bhargavi,
>
>     Please don't take things off-list.
>
>
>     On 12/11/2012 12:05 PM, Bhargavi Duvvuri wrote:
>
>         Hello James,
>
>         Thank you for your reply.
>
>         Here is the new issue I get with RMA:
>
>         When I run on Windows operating system I get 34901 probe
>         intensities with one or many CEL  files. Same output with
>         justRMA, as you said.
>
>         However, when I run same on Mac OS and Linux, I would get
>         23068 probe intensities with one or many CEL files. Same
>         output with justRMA.
>
>         Same version of R and affy on three operating systems.
>
>
>     What is your output from sessionInfo() on Linux? Make sure you
>     have loaded the primeviewhsrefseqcdf package first.
>
>     Also, what do you get from
>
>     length(ls(primeviewhsrefseqcdf))
>
>     on Linux?
>
>     Best,
>
>     Jim
>
>
>
>         Why would this happen? I can run batch of CEL files only on
>         Mac or Linux due to memory issues.
>
>         Could you please advise?
>
>         Thank you
>
>         Bhargavi
>
>
>
>         On Mon, Dec 10, 2012 at 9:26 AM, James W. MacDonald
>         <jmacdon at uw.edu> wrote:
>
>             Hi Bhargavi,
>
>
>             On 12/9/2012 1:38 AM, Bhargavi Duvvuri wrote:
>
>                 Hello James,
>
>                 I have tested as you mentioned below:
>
>         > nrow(exprs(justRMA(cdfname = "PrimeViewHsREFSEQcdf")))
>                 Loading required package: AnnotationDbi
>
>                 [1] 34901
>
>                 This number is exactly the same as in the CDF file and
>                 matches that of RMA.
>
>                 I get 34901 rows when I process a single CEL file.
>                 However, when I run justRMA with 350 CEL files, the
>                 number of rows in the output is 23068. Why would this
>                 happen?
>
>
>             It shouldn't. I don't have 350 of the same type of celfile to
>             test this with, so I would suggest trying a subset (like 100)
>             and see how that goes. You might be getting an error that you
>             didn't notice.
>
>             Best,
>
>             Jim
>
>
>
>                 Please advise.
>
>                 Thank you
>
>                 Bhargavi
>
>                 On Fri, Dec 7, 2012 at 4:42 PM, James W. MacDonald
>                 <jmacdon at uw.edu> wrote:
>
>                     Hi Bhargavi,
>
>
>                     On 12/7/2012 4:08 PM, Bhargavi Duvvuri wrote:
>
>                         Hello,
>
>                         I am using justRMA for processing 350 CEL files.
>                         When I run RMA I get 34901 probe intensities, which
>                         match the CDF file. However, when I run justRMA, I
>                         get only 23068 probe intensities. I am not sure why
>                         this would happen.
>
>
>                     It shouldn't, and we have done extensive testing to
>                     ensure that the results from justRMA() are identical
>                     to rma(). As an example:
>
>                     > nrow(exprs(rma(ReadAffy(cdfname = "mogene10stmmrefseqcdf"))))
>                     Background correcting
>                     Normalizing
>                     Calculating Expression
>                     [1] 28312
>                     > nrow(exprs(justRMA(cdfname = "mogene10stmmrefseqcdf")))
>                     [1] 28312
>
>                     So I get the same number of rows using both methods.
>                     Now check that I get the same exact values:
>
>                     > all.equal(exprs(rma(ReadAffy(cdfname = "mogene10stmmrefseqcdf"))),
>                     +           exprs(justRMA(cdfname = "mogene10stmmrefseqcdf")))
>                     Background correcting
>                     Normalizing
>                     Calculating Expression
>                     [1] TRUE
>
>                     So it looks OK to me.
>
>                     Best,
>
>                     Jim
>
>                     > sessionInfo()
>                     R version 2.15.1 (2012-06-22)
>                     Platform: x86_64-unknown-linux-gnu (64-bit)
>
>                     locale:
>                      [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
>                      [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
>                      [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
>                      [7] LC_PAPER=C                 LC_NAME=C
>                      [9] LC_ADDRESS=C               LC_TELEPHONE=C
>                     [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>
>                     attached base packages:
>                     [1] stats     graphics  grDevices utils     datasets  methods   base
>
>                     other attached packages:
>                     [1] mogene10stmmrefseqcdf_16.0.0 mogene11stmmrefseqcdf_16.0.0
>                     [3] mogene10stv1cdf_2.11.0       AnnotationDbi_1.20.3
>                     [5] affy_1.36.0                  Biobase_2.18.0
>                     [7] BiocGenerics_0.4.0
>
>                     loaded via a namespace (and not attached):
>                      [1] affyio_1.26.0         BiocInstaller_1.8.3   DBI_0.2-5
>                      [4] IRanges_1.16.4        parallel_2.15.1       preprocessCore_1.20.0
>                      [7] RSQLite_0.11.2        stats4_2.15.1         tools_2.15.1
>                     [10] zlibbioc_1.4.0
>
>
>
>                         Below is the code I am using:
>
>                         library(affy)
>                         setwd('/home/run/testcel')
>                         expr.vals <- justRMA(cdfname = "primeviewhsrefseqcdf")
>                         write.exprs(expr.vals, file = 'Customoutput.csv', sep = ",",
>                                     row.names = F, col.names = T, quote = F)
>
>                         Could you please advise me here on how to modify the
>                         script so that I get normalized probe intensities for
>                         all the probe sets?
>
>                         Thank you for your attention and time.
>
>                         Bhargavi
>
>
>
>                     -- 
>                     James W. MacDonald, M.S.
>                     Biostatistician
>                     University of Washington
>                     Environmental and Occupational Health Sciences
>                     4225 Roosevelt Way NE, # 100
>                     Seattle WA 98105-6099
>
>
>
>
>                 -- 
>                 Bhargavi Duvvuri M.Sc, Ph.D
>                 TAS/CAN Postdoctoral Research Fellow
>                 The Hospital for Sick Children Research Institute
>                 Division of Cell Biology
>                 MARS Centre - Toronto Medical Discovery Tower
>                 101 College street, Rm 12-401 Bay C
>                 Toronto, ON, Canada M5G 1L7
>                 Phone: 416-813-7780
>                 Fax: 416-813-8883
>                 email: bhargavi.duvvuri at sickkids.ca
>
>
>             -- 
>             James W. MacDonald, M.S.
>             Biostatistician
>             University of Washington
>             Environmental and Occupational Health Sciences
>             4225 Roosevelt Way NE, # 100
>             Seattle WA 98105-6099
>
>
>
>
>         -- 
>         Bhargavi Duvvuri M.Sc, Ph.D
>         TAS/CAN Postdoctoral Research Fellow
>         The Hospital for Sick Children Research Institute
>         Division of Cell Biology
>         MARS Centre - Toronto Medical Discovery Tower
>         101 College street, Rm 12-401 Bay C
>         Toronto, ON, Canada M5G 1L7
>         Phone: 416-813-7780
>         Fax: 416-813-8883
>         email: bhargavi.duvvuri at sickkids.ca
>
>
>     -- 
>     James W. MacDonald, M.S.
>     Biostatistician
>     University of Washington
>     Environmental and Occupational Health Sciences
>     4225 Roosevelt Way NE, # 100
>     Seattle WA 98105-6099
>
>
>
>
> -- 
> Bhargavi Duvvuri M.Sc, Ph.D
> TAS/CAN Postdoctoral Research Fellow
> The Hospital for Sick Children Research Institute
> Division of Cell Biology
> MARS Centre - Toronto Medical Discovery Tower
> 101 College street, Rm 12-401 Bay C
> Toronto, ON, Canada M5G 1L7
> Phone: 416-813-7780
> Fax: 416-813-8883
> email: bhargavi.duvvuri at sickkids.ca
>

-- 
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099


