[BioC] makePdInfoPackage for Primeview arrays

Wed Jun 19 09:49:44 CEST 2013

Thanks everybody for your replies!
A short addendum: the reason I wanted to make this pd.info file was that
I wanted to try SCAN.UPC on these arrays (which needs that file). Apart from
that, I had already tried rma with the probe and cdf files from the
Bioconductor site, and this worked just fine. At least it seemed fine to me;
now that I know some probes map to multiple probesets, I wonder whether that
could do something funny to the analysis.
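
A minimal sketch of how shared probes could be counted, in base R. It assumes a table with one row per (probeset, probe) pair and treats the (x, y) position on the array as the identity of a physical probe. The probeset IDs, coordinates and column names below are invented for illustration; for a real array the table would have to be built from the CDF first (for example with affxparser, which appears in the session info below, although that step is not shown or tested here).

```r
# Toy probe table: one row per (probeset, probe) pair.
# All IDs and coordinates are hypothetical, for illustration only.
probes <- data.frame(
  probeset = c("ps_A", "ps_A", "ps_B", "ps_B", "ps_C"),
  x        = c(10L, 11L, 11L, 12L, 13L),
  y        = c( 5L,  5L,  5L,  5L,  5L),
  stringsAsFactors = FALSE
)

# A physical probe is identified by its (x, y) position on the array.
probes$cell <- paste(probes$x, probes$y, sep = ":")

# Number of distinct probesets each physical probe belongs to.
sets_per_cell <- tapply(probes$probeset, probes$cell,
                        function(p) length(unique(p)))

# Physical probes that are shared between two or more probesets.
shared <- names(sets_per_cell)[sets_per_cell > 1]
shared
```

If such a table has one row per (probeset, probe) pair and a database table is keyed on the probe alone, shared probes would show up as repeated keys. Whether that is what triggers the "PRIMARY KEY must be unique" error quoted below is an assumption, not something verified here.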

Best,
Max

-----Original Message-----
From: cstrato [mailto:cstrato at aon.at] 
Sent: Tuesday, June 18, 2013 9:18 PM
To: Max Kauer
Cc: Bioconductor at r-project.org
Subject: Re: [BioC] makePdInfoPackage for Primeview arrays

Dear Max,

In principle you could also use package xps, which can handle PrimeView
arrays. To create a ROOT 'scheme' file (see vignette xps.pdf), you simply
need to do:

### new R session: load library xps
library(xps)

### define directories:
# directory containing Affymetrix library files
libdir <- "/Volumes/GigaDrive/Affy/libraryfiles"
# directory containing Affymetrix annotation files
anndir <- "/Volumes/GigaDrive/Affy/Annotation"
# directory to store ROOT scheme files
scmdir <- "/Volumes/GigaDrive/CRAN/Workspaces/Schemes"

### create scheme file:
scheme.primeview <- import.expr.scheme("primeview",
                          filedir    = file.path(scmdir, "na33"),
                          schemefile = file.path(libdir, "PrimeView.CDF"),
                          probefile  = file.path(libdir, "PrimeView.probe.tab"),
                          annotfile  = file.path(anndir, "Version12Nov",
                                                 "PrimeView.na33.annot.csv"))

For more information and examples see also the example scripts in
xps/examples/script4schemes.R and xps/examples/script4xps.R

Best regards,
Christian
_._._._._._._._._._._._._._._._._._
C.h.r.i.s.t.i.a.n   S.t.r.a.t.o.w.a
V.i.e.n.n.a           A.u.s.t.r.i.a
e.m.a.i.l:        cstrato at aon.at
_._._._._._._._._._._._._._._._._._

On 6/18/13 3:25 PM, Max Kauer wrote:
> Hi,
>
> I am trying to make a pd.info package for the Affy Primeview array, 
> but I get an error.
>
> Thanks for any help!
>
> Cheers,
>
> Max
>
>
>
>
>
> This is my code:
>
>
>
> library(pdInfoBuilder)
>
> cdf <- list.files( pathAnnotPr, pattern = ".cdf", full.names = TRUE )
>
> cel <- list.files( pathC, pattern = ".CEL", full.names = TRUE )[1] # take first array
>
> tab <- list.files(pathAnnotPr, pattern = "_tab", full.names = TRUE)
>
>
>
> seed <- new("AffyExpressionPDInfoPkgSeed",
>
>        cdfFile = cdf, celFile = cel,
>
>        tabSeqFile = tab, author = "xx",
>
>        email = "xx",
>
>        biocViews = "AnnotationData",
>
>        genomebuild = "hg19",
>
>        organism = "Human", species = "Homo Sapiens",
>
>        url = "xx"
>
> )
>
> makePdInfoPackage( seed, destDir = "." )
>
>
>
>
>
>
>
> Which produces this output/error (although a pd.primeview directory is
> created):
>
>
>
> ================================================================================
>
> Building annotation package for Affymetrix Expression array
>
> CDF...............:  PrimeView.cdf
>
> CEL...............:  MJ_05042013_TAS_10_PrimeView.CEL
>
> Sequence TAB-Delim:  PrimeView.probe_tab
>
> ================================================================================
>
> Parsing file: PrimeView.cdf... OK
>
> Parsing file: MJ_05042013_TAS_10_PrimeView.CEL... OK
>
> Parsing file: PrimeView.probe_tab... OK
>
> Getting information for featureSet table... OK
>
> Getting information for pm/mm feature tables...
>
> OK
>
> Combining probe information with sequence information... OK
>
> Getting PM probes and sequences... OK
>
> Done parsing.
>
> Creating package in ./pd.primeview
>
> Inserting 49395 rows into table featureSet... OK
>
> Inserting 609663 rows into table pmfeature... Error in sqliteExecStatement(con, statement, bind.data) :
>
>    RS-DBI driver: (RS_SQLite_exec: could not execute: PRIMARY KEY must be unique)
>
> In addition: Warning messages:
>
> 1: In parseCdfCelProbe(object@cdfFile, object@celFile, object@tabSeqFile,  :
>
>    Probe sequences were not found for all PM probes. These probes will be removed from the pmSequence object.
>
> 2: In parseCdfCelProbe(object@cdfFile, object@celFile, object@tabSeqFile,  :
>
>    Probe sequences were not found for all MM probes. These probes will be removed from the mmSequence object.
>
>
>
>
>
>
>
>> sessionInfo()
>
> R version 3.0.0 (2013-04-03)
>
> Platform: x86_64-unknown-linux-gnu (64-bit)
>
>
>
> locale:
>
> [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
>
>   [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
>
>   [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
>
>   [7] LC_PAPER=C                 LC_NAME=C
>
>   [9] LC_ADDRESS=C               LC_TELEPHONE=C
>
> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>
>
>
> attached base packages:
>
> [1] parallel  stats     graphics  grDevices utils     datasets  methods
>
> [8] base
>
>
>
> other attached packages:
>
> [1] pdInfoBuilder_1.24.0 oligo_1.24.0         oligoClasses_1.22.0
>
> [4] affxparser_1.32.1    RSQLite_0.11.4       DBI_0.2-7
>
> [7] Biobase_2.20.0       BiocGenerics_0.6.0
>
>
>
> loaded via a namespace (and not attached):
>
> [1] affyio_1.28.0         BiocInstaller_1.10.2  Biostrings_2.28.0
>
>   [4] bit_1.1-10            codetools_0.2-8       ff_2.2-11
>
>   [7] foreach_1.4.1         GenomicRanges_1.12.4  IRanges_1.18.1
>
> [10] iterators_1.0.6       preprocessCore_1.22.0 splines_3.0.0
>
> [13] stats4_3.0.0          zlibbioc_1.6.0
>
>>
>
>
>
> Max Kauer
>
> CHILDREN'S CANCER RESEARCH INSTITUTE
>
>
>
>