[BioC] makePdInfoPackage for Primeview arrays

cstrato cstrato at aon.at
Tue Jun 18 21:18:09 CEST 2013


Dear Max,

In principle you could also use the xps package, which can handle PrimeView 
arrays. To create a ROOT scheme file (see the vignette xps.pdf), you only 
need to do the following:

### new R session: load library xps
library(xps)

### define directories:
# directory containing Affymetrix library files
libdir <- "/Volumes/GigaDrive/Affy/libraryfiles"
# directory containing Affymetrix annotation files
anndir <- "/Volumes/GigaDrive/Affy/Annotation"
# directory to store ROOT scheme files
scmdir <- "/Volumes/GigaDrive/CRAN/Workspaces/Schemes"

### create scheme file:
scheme.primeview <- import.expr.scheme("primeview",
                          filedir    = file.path(scmdir, "na33"),
                          schemefile = file.path(libdir, "PrimeView.CDF"),
                          probefile  = file.path(libdir, "PrimeView.probe.tab"),
                          annotfile  = file.path(anndir, "Version12Nov",
                                                 "PrimeView.na33.annot.csv"))
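
Note that the file names above are the ones used on my machine. In your output the files are called PrimeView.cdf and PrimeView.probe_tab, and file names on Linux are case-sensitive, so it may be worth checking the paths before calling import.expr.scheme(). The following is only a minimal sketch in base R: missing_scheme_inputs() is a hypothetical helper (it is not part of xps), and its default file names are simply the ones from the example above.

```r
# Hypothetical helper (not part of xps): return the scheme input files
# that do not exist on disk, as a named character vector.
# The default file names are assumptions taken from the example above.
missing_scheme_inputs <- function(libdir, anndir,
                                  cdf   = "PrimeView.CDF",
                                  probe = "PrimeView.probe.tab",
                                  annot = file.path("Version12Nov",
                                                    "PrimeView.na33.annot.csv")) {
  paths <- c(schemefile = file.path(libdir, cdf),
             probefile  = file.path(libdir, probe),
             annotfile  = file.path(anndir, annot))
  # keep only the entries whose file is missing
  paths[!file.exists(paths)]
}
```

If the result is not empty, pass the names you actually have, for example missing_scheme_inputs(libdir, anndir, cdf = "PrimeView.cdf", probe = "PrimeView.probe_tab"), and use the same names in the import.expr.scheme() call.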

For more information, see the example scripts 
xps/examples/script4schemes.R and xps/examples/script4xps.R.

Best regards,
Christian
_._._._._._._._._._._._._._._._._._
C.h.r.i.s.t.i.a.n   S.t.r.a.t.o.w.a
V.i.e.n.n.a           A.u.s.t.r.i.a
e.m.a.i.l:        cstrato at aon.at
_._._._._._._._._._._._._._._._._._



On 6/18/13 3:25 PM, Max Kauer wrote:
> Hi,
>
> I am trying to make a pd.info package for the Affy Primeview array, but I
> get an error.
>
> Thanks for any help!
>
> Cheers,
>
> Max
>
> This is my code:
>
> library(pdInfoBuilder)
>
> cdf <- list.files(pathAnnotPr, pattern = ".cdf", full.names = TRUE)
> cel <- list.files(pathC, pattern = ".CEL", full.names = TRUE)[1]  # take first array
> tab <- list.files(pathAnnotPr, pattern = "_tab", full.names = TRUE)
>
> seed <- new("AffyExpressionPDInfoPkgSeed",
>             cdfFile = cdf, celFile = cel,
>             tabSeqFile = tab, author = "xx",
>             email = "xx",
>             biocViews = "AnnotationData",
>             genomebuild = "hg19",
>             organism = "Human", species = "Homo Sapiens",
>             url = "xx")
>
> makePdInfoPackage(seed, destDir = ".")
>
> Which produces this output/error (although a pd.primeview directory is
> created):
>
> ================================================================================
> Building annotation package for Affymetrix Expression array
> CDF...............:  PrimeView.cdf
> CEL...............:  MJ_05042013_TAS_10_PrimeView.CEL
> Sequence TAB-Delim:  PrimeView.probe_tab
> ================================================================================
> Parsing file: PrimeView.cdf... OK
> Parsing file: MJ_05042013_TAS_10_PrimeView.CEL... OK
> Parsing file: PrimeView.probe_tab... OK
> Getting information for featureSet table... OK
> Getting information for pm/mm feature tables...
> OK
> Combining probe information with sequence information... OK
> Getting PM probes and sequences... OK
> Done parsing.
> Creating package in ./pd.primeview
> Inserting 49395 rows into table featureSet... OK
> Inserting 609663 rows into table pmfeature... Error in sqliteExecStatement(con, statement, bind.data) :
>   RS-DBI driver: (RS_SQLite_exec: could not execute: PRIMARY KEY must be unique)
> In addition: Warning messages:
> 1: In parseCdfCelProbe(object@cdfFile, object@celFile, object@tabSeqFile,  :
>   Probe sequences were not found for all PM probes. These probes will be removed from the pmSequence object.
> 2: In parseCdfCelProbe(object@cdfFile, object@celFile, object@tabSeqFile,  :
>   Probe sequences were not found for all MM probes. These probes will be removed from the mmSequence object.
>
>> sessionInfo()
> R version 3.0.0 (2013-04-03)
> Platform: x86_64-unknown-linux-gnu (64-bit)
>
> locale:
>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
>  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
>  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
>  [7] LC_PAPER=C                 LC_NAME=C
>  [9] LC_ADDRESS=C               LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] parallel  stats     graphics  grDevices utils     datasets  methods
> [8] base
>
> other attached packages:
> [1] pdInfoBuilder_1.24.0 oligo_1.24.0         oligoClasses_1.22.0
> [4] affxparser_1.32.1    RSQLite_0.11.4       DBI_0.2-7
> [7] Biobase_2.20.0       BiocGenerics_0.6.0
>
> loaded via a namespace (and not attached):
>  [1] affyio_1.28.0         BiocInstaller_1.10.2  Biostrings_2.28.0
>  [4] bit_1.1-10            codetools_0.2-8       ff_2.2-11
>  [7] foreach_1.4.1         GenomicRanges_1.12.4  IRanges_1.18.1
> [10] iterators_1.0.6       preprocessCore_1.22.0 splines_3.0.0
> [13] stats4_3.0.0          zlibbioc_1.6.0
>>
>
> Max Kauer
> CHILDREN'S CANCER RESEARCH INSTITUTE


