[BioC] affyPara - rmaPara fails using cdfname

Tobias Verbeke tobias.verbeke at openanalytics.eu
Sat Jul 14 11:53:02 CEST 2012


L.S.

This issue only appears when using alternative CDFs
such as hgu133plus2hsentrezgcdf from BrainArray:

http://brainarray.mbni.med.umich.edu/brainarray/Database/CustomCDF/genomic_curated_CDF.asp

The patches below make affyPara work with such CDF
packages as well. In the current development version of
the package (1.17.0), the cdfname argument is not passed
along to the calls that initialize the AffyBatch objects
(.initAffyBatchSF), which results in the error reported
below.
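
To illustrate the idea behind the patches, here is a minimal sketch in
base R. It assumes that lapply can stand in for snow's clusterApply
(both hand any extra arguments given in ... on to the applied function).
The function init_worker and the file names are hypothetical and are not
affyPara code.

```r
# Hypothetical stand-in for a worker-side initializer such as
# .initAffyBatchSF; it only reports which CDF name it would use.
init_worker <- function(files, object.type, cdfname = NULL) {
  if (is.null(cdfname)) "default cdf" else cdfname
}

# Two chunks of made-up file names, one chunk per worker.
chunks <- list(c("a.CEL", "b.CEL"), c("c.CEL", "d.CEL"))

# Without forwarding, every call falls back to the default CDF.
without <- lapply(chunks, init_worker, "character")

# With forwarding, the named argument travels through `...` to each call.
with_cdf <- lapply(chunks, init_worker, "character",
                   cdfname = "hgu133plus2hsentrezg")

print(unlist(without))
print(unlist(with_cdf))
```

The point is only that a named argument has to be handed over
explicitly at the apply call; otherwise the applied function never
sees it.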

Best,
Tobias

diff --git a/affyPara/R/preproPara.R b/affyPara/R/preproPara.R
index d78eafb..63d2130 100644
--- a/affyPara/R/preproPara.R
+++ b/affyPara/R/preproPara.R
@@ -109,7 +109,7 @@
  	#################################
  	if (verbose) cat("Initialize AffyBatches at slaves ")
  		t0 <- proc.time();
-		check <- clusterApply(cluster, object.list, .initAffyBatchSF, object.type, ...)
+		check <- clusterApply(cluster, object.list, .initAffyBatchSF, object.type, cdfname = cdfname, ...)
  		t1 <- proc.time();
  	if (verbose) cat(round(t1[3]-t0[3],3),"sec DONE\n")
  	

diff --git a/affyPara/R/readAffybatchPara.R b/affyPara/R/readAffybatchPara.R
index 8da4ce0..c9edfcb 100644
--- a/affyPara/R/readAffybatchPara.R
+++ b/affyPara/R/readAffybatchPara.R
@@ -12,7 +12,7 @@
  read.affybatchPara <- function(object,
  		phenoData = new("AnnotatedDataFrame"),
  		description = NULL, notes = "",	
-		cluster, verbose=getOption("verbose"))
+		cluster, verbose=getOption("verbose"), cdfname)
  {
  	########
  	# Checks
@@ -66,7 +66,7 @@
  	##################################
  	if (verbose) cat("Create AffyBatches at slaves ")
  	t0 <- proc.time();
-	check <- clusterApply(cluster, object.list, .initAffyBatchSF, object.type)
+	check <- clusterApply(cluster, object.list, .initAffyBatchSF, object.type, cdfname = cdfname)
  	t1 <- proc.time();
  	if (verbose) cat(round(t1[3]-t0[3],3),"sec DONE\n")
  	

diff --git a/affyPara/man/readAffybatchPara.Rd b/affyPara/man/readAffybatchPara.Rd
index d84b733..f9036b8 100644
--- a/affyPara/man/readAffybatchPara.Rd
+++ b/affyPara/man/readAffybatchPara.Rd
@@ -22,6 +22,7 @@
     \item{cluster}{ A cluster object obtained from the function \link[snow:snow-startstop]{makeCluster} in the SNOW package.
    		For default \code{.affyParaInternalEnv$cl}  will be used. }
    \item{verbose}{ A logical value. If \code{TRUE} it writes out some messages. default: getOption("verbose") }
+  \item{cdfname}{cdfname to pass to the AffyBatch call}
   }
  \details{
  Parallelized creation of an AffyBatch object. Especially useful on multi-core machines
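
A side note on the error in the quoted message below: base R raises
this message whenever the first element of a dimnames list differs in
length from the number of rows of the matrix. One plausible reading,
offered as an assumption only and not checked against the affyPara
sources, is that the ids come from the custom cdf while the matrix was
built with the default one. A minimal sketch with made-up values:

```r
# Three rows of summarized values for two samples (made-up numbers).
eset_mat <- matrix(0, nrow = 3, ncol = 2)
samples.names <- c("s1", "s2")

# Four ids for three rows: the lengths disagree, so the assignment fails.
ids <- c("id1", "id2", "id3", "id4")

# Capture the error message instead of stopping the script.
msg <- tryCatch({
  dimnames(eset_mat) <- list(ids, samples.names)
  "no error"
}, error = function(e) conditionMessage(e))

print(msg)
```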

On 06/18/2012 03:51 PM, Wetzels, Yves [JRDBE Extern] wrote:
> Dear
>
> I am investigating whether affyPara can be used to analyze a large amount of microarray data.
> As a test case I have 20 CEL files. This works just fine when running
>
> 	...	
> 	expressionSetRma<- rmaPara(files)
> 	...
>
> If I want to use the "cdfname" parameter however
>
> 	...
> 	expressionSetRma<- rmaPara(files, cdfname="hgu133plus2hsentrezg", verbose=TRUE)
> 	...
>
> I receive the following error:
>
> 	Error in dimnames(eset_mat) <- list(ids, samples.names) :
> 	  length of 'dimnames' [1] not equal to array extent
> 	Calls: rmaPara -> preproPara -> .doSummarizationPara
> 	In addition: Warning messages:
> 	1: In is.na(xel) : is.na() applied to non-(list or vector) of type 'S4'
> 	2: In is.na(xel) : is.na() applied to non-(list or vector) of type 'S4'
> 	Execution halted
>
>
> I saw a thread, http://answerpot.com/showthread.php?1408276-affyPara, dated 21/10/2010, mentioning a bug in the .initAffyBatchSF function.
> Might this be the same bug?
>
> Many thanks for your help and/or ideas.
>
>
> Kind Regards
>
> Yves Wetzels
> Contractor on behalf of
> Janssen
> Turnhoutseweg 30
> B-2340-Beerse, Belgium
>
>
> Below you'll find the logfile/environment settings for both test runs.
>
> ******************************************************************
> Log for expressionSetRma<- rmaPara(files) => OK
>
> ******************************************************************
> ubuntu@ip-10-239-95-215:~/test$ cat runAffyPara.R.log.withoutcdf
>
> R version 2.15.0 (2012-03-30)
> Copyright (C) 2012 The R Foundation for Statistical Computing
> ISBN 3-900051-07-0
> Platform: x86_64-unknown-linux-gnu (64-bit)
>
> R is free software and comes with ABSOLUTELY NO WARRANTY.
> You are welcome to redistribute it under certain conditions.
> Type 'license()' or 'licence()' for distribution details.
>
>    Natural language support but running in an English locale
>
> R is a collaborative project with many contributors.
> Type 'contributors()' for more information and
> 'citation()' on how to cite R or R packages in publications.
>
> Type 'demo()' for some demos, 'help()' for on-line help, or
> 'help.start()' for an HTML browser interface to help.
> Type 'q()' to quit R.
>
>> library(affyPara)
> Loading required package: affy
> Loading required package: BiocGenerics
>
> Attaching package: ‘BiocGenerics’
>
> The following object(s) are masked from ‘package:stats’:
>
>      xtabs
>
> The following object(s) are masked from ‘package:base’:
>
>      anyDuplicated, cbind, colnames, duplicated, eval, Filter, Find,
>      get, intersect, lapply, Map, mapply, mget, order, paste, pmax,
>      pmax.int, pmin, pmin.int, Position, rbind, Reduce, rep.int,
>      rownames, sapply, setdiff, table, tapply, union, unique
>
> Loading required package: Biobase
> Welcome to Bioconductor
>
>      Vignettes contain introductory material; view with
>      'browseVignettes()'. To cite Bioconductor, see
>      'citation("Biobase")', and for packages 'citation("pkgname")'.
>
> Loading required package: snow
> Loading required package: vsn
> Loading required package: aplpack
> Loading required package: tcltk
> Loading Tcl/Tk interface ... done
>
> Attaching package: ‘affyPara’
>
> The following object(s) are masked from ‘package:snow’:
>
>      makeCluster, stopCluster
>
> Warning message:
> In fun(libname, pkgname) : no DISPLAY variable so Tk is not available
>> library(hgu133plus2hsentrezgcdf)
>> sessionInfo()
> R version 2.15.0 (2012-03-30)
> Platform: x86_64-unknown-linux-gnu (64-bit)
>
> locale:
>   [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
>   [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
>   [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
>   [7] LC_PAPER=C                 LC_NAME=C
>   [9] LC_ADDRESS=C               LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] tcltk     stats     graphics  grDevices utils     datasets  methods
> [8] base
>
> other attached packages:
> [1] hgu133plus2hsentrezgcdf_12.1.0 affyPara_1.16.0
> [3] aplpack_1.2.6                  vsn_3.24.0
> [5] snow_0.3-9                     affy_1.34.0
> [7] Biobase_2.16.0                 BiocGenerics_0.2.0
>
> loaded via a namespace (and not attached):
> [1] affyio_1.24.0         BiocInstaller_1.4.6   grid_2.15.0
> [4] lattice_0.20-6        limma_3.12.1          preprocessCore_1.18.0
> [7] tools_2.15.0          zlibbioc_1.2.0
>> path <- "/home/ubuntu/test/cel_files"
>> makeCluster(2)
>> files <- list.celfiles(path, full.names=TRUE)
>> files
>   [1] "/home/ubuntu/test/cel_files/GSM686287.CEL"
>   [2] "/home/ubuntu/test/cel_files/GSM686289.CEL"
>   [3] "/home/ubuntu/test/cel_files/GSM686290.CEL"
>   [4] "/home/ubuntu/test/cel_files/GSM686291.CEL"
>   [5] "/home/ubuntu/test/cel_files/GSM686298.CEL"
>   [6] "/home/ubuntu/test/cel_files/GSM686300.CEL"
>   [7] "/home/ubuntu/test/cel_files/GSM686301.CEL"
>   [8] "/home/ubuntu/test/cel_files/GSM686303.CEL"
>   [9] "/home/ubuntu/test/cel_files/GSM686304.CEL"
> [10] "/home/ubuntu/test/cel_files/GSM686305.CEL"
> [11] "/home/ubuntu/test/cel_files/GSM686310.CEL"
> [12] "/home/ubuntu/test/cel_files/GSM686311.CEL"
> [13] "/home/ubuntu/test/cel_files/GSM686314.CEL"
> [14] "/home/ubuntu/test/cel_files/GSM686316.CEL"
> [15] "/home/ubuntu/test/cel_files/GSM686319.CEL"
> [16] "/home/ubuntu/test/cel_files/GSM686320.CEL"
> [17] "/home/ubuntu/test/cel_files/GSM686322.CEL"
> [18] "/home/ubuntu/test/cel_files/GSM686323.CEL"
> [19] "/home/ubuntu/test/cel_files/GSM686324.CEL"
> [20] "/home/ubuntu/test/cel_files/GSM686325.CEL"
>> expressionSetRma<- rmaPara(files)
> Loading required package: AnnotationDbi
>
>
> Attaching package: ‘hgu133plus2cdf’
>
> The following object(s) are masked from ‘package:hgu133plus2hsentrezgcdf’:
>
>      i2xy, xy2i
>
> Warning messages:
> 1: In is.na(xel) : is.na() applied to non-(list or vector) of type 'S4'
> 2: In is.na(xel) : is.na() applied to non-(list or vector) of type 'S4'
>> stopCluster()
>> write.exprs(expressionSetRma,file="/home/ubuntu/test/expressionSetRma.Rda")
>>
>
>
>
> ******************************************************************
> Log for expressionSetRma<- rmaPara(files, cdfname="hgu133plus2hsentrezg", verbose=TRUE) => ERROR
>
> ******************************************************************
> ubuntu@ip-10-239-95-215:~/test$ more runAffyPara.R.log.withcdf
>
> R version 2.15.0 (2012-03-30)
> Copyright (C) 2012 The R Foundation for Statistical Computing
> ISBN 3-900051-07-0
> Platform: x86_64-unknown-linux-gnu (64-bit)
>
> R is free software and comes with ABSOLUTELY NO WARRANTY.
> You are welcome to redistribute it under certain conditions.
> Type 'license()' or 'licence()' for distribution details.
>
>    Natural language support but running in an English locale
>
> R is a collaborative project with many contributors.
> Type 'contributors()' for more information and
> 'citation()' on how to cite R or R packages in publications.
>
> Type 'demo()' for some demos, 'help()' for on-line help, or
> 'help.start()' for an HTML browser interface to help.
> Type 'q()' to quit R.
>
>> library(affyPara)
> Loading required package: affy
> Loading required package: BiocGenerics
>
> Attaching package: ‘BiocGenerics’
>
> The following object(s) are masked from ‘package:stats’:
>
>      xtabs
>
> The following object(s) are masked from ‘package:base’:
>
>      anyDuplicated, cbind, colnames, duplicated, eval, Filter, Find,
>      get, intersect, lapply, Map, mapply, mget, order, paste, pmax,
>      pmax.int, pmin, pmin.int, Position, rbind, Reduce, rep.int,
>      rownames, sapply, setdiff, table, tapply, union, unique
>
> Loading required package: Biobase
> Welcome to Bioconductor
>
>      Vignettes contain introductory material; view with
>      'browseVignettes()'. To cite Bioconductor, see
>      'citation("Biobase")', and for packages 'citation("pkgname")'.
>
> Loading required package: snow
> Loading required package: vsn
> Loading required package: aplpack
> Loading required package: tcltk
> Loading Tcl/Tk interface ... done
>
> Attaching package: ‘affyPara’
>
> The following object(s) are masked from ‘package:snow’:
>
>      makeCluster, stopCluster
>
> Warning message:
> In fun(libname, pkgname) : no DISPLAY variable so Tk is not available
>> library(hgu133plus2hsentrezgcdf)
>> sessionInfo()
> R version 2.15.0 (2012-03-30)
> Platform: x86_64-unknown-linux-gnu (64-bit)
>
> locale:
>   [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
>   [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
>   [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
>   [7] LC_PAPER=C                 LC_NAME=C
>   [9] LC_ADDRESS=C               LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] tcltk     stats     graphics  grDevices utils     datasets  methods
> [8] base
>
> other attached packages:
> [1] hgu133plus2hsentrezgcdf_12.1.0 affyPara_1.16.0
> [3] aplpack_1.2.6                  vsn_3.24.0
> [5] snow_0.3-9                     affy_1.34.0
> [7] Biobase_2.16.0                 BiocGenerics_0.2.0
>
> loaded via a namespace (and not attached):
> [1] affyio_1.24.0         BiocInstaller_1.4.6   grid_2.15.0
> [4] lattice_0.20-6        limma_3.12.1          preprocessCore_1.18.0
> [7] tools_2.15.0          zlibbioc_1.2.0
>> path <- "/home/ubuntu/test/cel_files"
>> makeCluster(2)
>> files <- list.celfiles(path, full.names=TRUE)
>> files
>   [1] "/home/ubuntu/test/cel_files/GSM686287.CEL"
>   [2] "/home/ubuntu/test/cel_files/GSM686289.CEL"
>   [3] "/home/ubuntu/test/cel_files/GSM686290.CEL"
>   [4] "/home/ubuntu/test/cel_files/GSM686291.CEL"
>   [5] "/home/ubuntu/test/cel_files/GSM686298.CEL"
>   [6] "/home/ubuntu/test/cel_files/GSM686300.CEL"
>   [7] "/home/ubuntu/test/cel_files/GSM686301.CEL"
>   [8] "/home/ubuntu/test/cel_files/GSM686303.CEL"
>   [9] "/home/ubuntu/test/cel_files/GSM686304.CEL"
> [10] "/home/ubuntu/test/cel_files/GSM686305.CEL"
> [11] "/home/ubuntu/test/cel_files/GSM686310.CEL"
> [12] "/home/ubuntu/test/cel_files/GSM686311.CEL"
> [13] "/home/ubuntu/test/cel_files/GSM686314.CEL"
> [14] "/home/ubuntu/test/cel_files/GSM686316.CEL"
> [15] "/home/ubuntu/test/cel_files/GSM686319.CEL"
> [16] "/home/ubuntu/test/cel_files/GSM686320.CEL"
> [17] "/home/ubuntu/test/cel_files/GSM686322.CEL"
> [18] "/home/ubuntu/test/cel_files/GSM686323.CEL"
> [19] "/home/ubuntu/test/cel_files/GSM686324.CEL"
> [20] "/home/ubuntu/test/cel_files/GSM686325.CEL"
>> expressionSetRma<- rmaPara(files, cdfname="hgu133plus2hsentrezg", verbose=TRUE)
> Error in dimnames(eset_mat) <- list(ids, samples.names) :
>    length of 'dimnames' [1] not equal to array extent
> Calls: rmaPara -> preproPara -> .doSummarizationPara
> In addition: Warning messages:
> 1: In is.na(xel) : is.na() applied to non-(list or vector) of type 'S4'
> 2: In is.na(xel) : is.na() applied to non-(list or vector) of type 'S4'
> Execution halted
>


