[BioC] makePDPackage problem? -- Affy promoter arrays
Mark Robinson
mrobinson at wehi.EDU.AU
Mon Aug 4 06:48:03 CEST 2008
Thanks Benilton.
The next problem is then:
> bpmapFile <- "Hs_PromPR_v02-3_NCBIv36.affy.bpmap"
> cifFile <- "Hs_PromPR_v02.cif"
> obj <- new("AffyTilingPDInfoPkgSeed",
+ version="0.0.1",
+ author="Mark Robinson", email="mrobinson at ...",
+ biocViews="AnnotationData",
+ genomebuild="NCBI Build 36",
+ bpmapFile=bpmapFile,
+ cifFile=cifFile)
> makePdInfoPackage(obj, destDir=".")
Creating package in ./pd.hs.prompr.v02.3.ncbiv36.affy
Error in sqliteExecStatement(conn, statement, bind.data, ...) :
RS-DBI driver: (RS_SQLite_exec: could not execute: PRIMARY KEY must be unique)
Timing stopped at: 4.013 0.036 4.095
> traceback()
14: .Call("RS_SQLite_exec", conId, statement, bind.data, PACKAGE = .SQLitePkgName)
13: sqliteExecStatement(conn, statement, bind.data, ...)
12: is(object, Cl)
11: is(object, Cl)
10: .valueClassTest(standardGeneric("dbSendPreparedQuery"), "DBIResult",
        "dbSendPreparedQuery")
9: dbSendPreparedQuery(db, sql, batchMat[!isPm, ])
8: loadUnits.affyTiling(db, qcunits, nx = nx, isQc = TRUE)
7: loadUnitsByBatch.affyTiling(db, bpmapFile, batch_size = batch_size,
       nx = nx)
6: eval(expr, envir, enclos)
5: eval(expr, envir = loc.frame)
4: ST(loadUnitsByBatch.affyTiling(db, bpmapFile, batch_size = batch_size,
       nx = nx))
3: buildPdInfoDb.affyTiling(object@bpmapFile, object@cifFile, dbFilePath,
       seqMatFile, batch_size = batch_size, verbose = !quiet)
2: makePdInfoPackage(obj, destDir = ".")
1: makePdInfoPackage(obj, destDir = ".")
> sessionInfo()
R version 2.7.0 (2008-04-22)
x86_64-unknown-linux-gnu
locale:
LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C
attached base packages:
[1] splines tools stats graphics grDevices utils datasets
[8] methods base
other attached packages:
[1] pdInfoBuilder_1.4.0 oligo_1.4.0 oligoClasses_1.2.0
[4] AnnotationDbi_1.2.0 preprocessCore_1.2.0 affxparser_1.12.2
[7] RSQLite_0.6-9 DBI_0.2-4 Biobase_2.0.1
It seems others have had this problem, but I couldn't find a response:
http://article.gmane.org/gmane.science.biology.informatics.conductor/18440/
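
One possible way to narrow down a "PRIMARY KEY must be unique" error is to look for probe positions that map to the same feature ID. The sketch below is not pdInfoBuilder's code: the `probes` table is made-up data, `fid` and `ncol_chip` are illustrative names, and the row-major formula `y * ncol + x + 1` is an assumption about how IDs are derived from zero-based coordinates.

```r
# Hypothetical probe table with zero-based x/y positions (made-up data,
# standing in for whatever a BPMAP reader would return).
probes <- data.frame(
  x = c(0, 1, 2, 1),
  y = c(0, 0, 0, 0)
)

# Number of columns on the chip; 2166 is the value reported in the CEL
# header quoted further down in this thread.
ncol_chip <- 2166

# One-based feature ID under a row-major layout (an assumption, see above).
probes$fid <- probes$y * ncol_chip + probes$x + 1

# Feature IDs that occur more than once would violate a PRIMARY KEY on fid.
dup_ids  <- unique(probes$fid[duplicated(probes$fid)])
dup_rows <- probes[probes$fid %in% dup_ids, ]

print(dup_rows)
```

If real probe data gave a non-empty `dup_rows`, the duplicated positions would be the first place to look.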
Cheers,
Mark
On 04/08/2008, at 1:10 PM, Benilton Carvalho wrote:
> Dear Mark,
>
> Your finding is due to the fact that you didn't provide a CIF file.
>
> In any case, I strongly recommend that you use the pdInfoBuilder
> package instead, as the following message suggests:
>
> http://article.gmane.org/gmane.science.biology.informatics.conductor/19140
>
> best regards,
>
> b
>
>
> On Aug 3, 2008, at 9:02 PM, Mark Robinson wrote:
>
>> Hi all.
>>
>> This may not actually be a problem; it may be something silly
>> that I'm doing. But something strikes me as odd. Below is
>> my explanation.
>>
>>
>> I am working with the Affymetrix promoter tiling arrays. My
>> starting point is a BPMAP file, which you can get from the Affy
>> library bundle at:
>>
>> http://www.affymetrix.com/products/arrays/specific/human_promoter.affx
>>
>> ... or I have also been using the re-worked BPMAP file you can get
>> from the people who developed MAT:
>>
>> http://chip.dfci.harvard.edu/~wli/MAT/Download.htm
>>
>> So, I use 'makePDpackage' to create an R package, along the lines of:
>>
>> (I've just renamed the downloaded BPMAP file to have 'affy' or
>> 'harvard' in the name of the file, so that I can remember how to
>> tell them apart)
>>
>> > makePDpackage("Hs_PromPR_v02-3_NCBIv36.affy.bpmap",type="tiling",manufacturer="affymetrix",genome="hg18")
>> affymetrix tiling
>> The package will be called pd.hs.prompr.v02.3.ncbiv36.affy
>> Array identified as having 914 rows and 914 columns.
>> Creating package in /export/share/disk501/lab0605/mrobinson/projects/microarray/pd.hs.prompr.v02.3.ncbiv36.affy
>>
>> > makePDpackage("Hs_PromPR_v01-3_NCBIv36.NR.harvard.bpmap",type="tiling",manufacturer="affymetrix",genome="hg18")
>> affymetrix tiling
>> The package will be called pd.hs.prompr.v01.3.ncbiv36.nr.harvard
>> Array identified as having 914 rows and 914 columns.
>> Creating package in /export/share/disk501/lab0605/mrobinson/projects/microarray/pd.hs.prompr.v01.3.ncbiv36.nr.harvard
>>
>> ... then do R CMD INSTALL ... from the command prompt. One thing
>> that strikes me as odd is the fact that it recognizes it as having
>> 914 rows and columns. See below.
>>
>> So, I read in the data for a single file, look at the raw data
>> for a particular X and Y location on the chip, and compare this to
>> what I get from 'readCel' in the affxparser package.
>>
>>
>> > rd<-read.celfiles("CEL/test1.CEL",pkgname="pd.hs.prompr.v01.3.ncbiv36.nr.harvard")
>> Platform design info loaded.
>> The intensity matrix will require 35.79 MB of RAM.
>> > pd<-getPD(rd)
>> > length(pd$X)
>> [1] 4286817
>> > dim(rd)
>> Features Samples
>> 4286817 1
>> > w<-which(pd$X==1344 & pd$Y==854)
>> > w
>> [1] 1267129
>> > exprs(rd)[w,]
>> [1] 123
>>
>>
>> > library(affxparser)
>> > x<-readCel("CEL/test1.CEL",readXY=TRUE)
>> > x$header[c("rows","cols")]
>> $rows
>> [1] 2166
>>
>> $cols
>> [1] 2166
>> > w<-which(x$x==1344 & x$y==854)
>> > w
>> [1] 1851109
>> > x$intensities[w]
>> [1] 8074
>>
>> So, this chip does have 2166 rows and columns, which could be
>> introducing problems in the indexing. I haven't dug any deeper on
>> this.
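
The effect of the grid size on indexing can be checked with a little arithmetic. The sketch below assumes zero-based (x, y) coordinates mapped to a one-based cell index in row-major order; the helper name `xy_to_index` is hypothetical and not part of affxparser or oligo. With 2166 columns it reproduces the index 1851109 returned by `which()` on the `readCel` output above, whereas x = 1344 does not even fit on a grid that is 914 columns wide.

```r
# Convert zero-based (x, y) chip coordinates to a one-based cell index,
# assuming row-major order. Hypothetical helper, for illustration only.
xy_to_index <- function(x, y, ncol) {
  y * ncol + x + 1
}

x <- 1344
y <- 854

# Grid size reported in the CEL header (2166 x 2166).
idx_2166 <- xy_to_index(x, y, ncol = 2166)

# Grid size reported when the package was built (914 x 914).
idx_914 <- xy_to_index(x, y, ncol = 914)

print(idx_2166)   # 1851109, the same index found in the readCel output
print(idx_914)    # a different index, so the two layouts disagree
print(x < 914)    # FALSE: this x coordinate is outside a 914-column grid
```

Under these assumptions, the index 1267129 found in the platform design object appears to be a row position in that object rather than a row-major cell index for either grid size.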
>>
>> Does anyone know what is happening? Is this a problem in making
>> the package through 'makePDpackage', or do I misunderstand the
>> correspondence between the elements of a 'TilingFeatureSet' and the
>> corresponding 'platformDesign' object?
>>
>> Thanks!
>> Mark
>>
>> > sessionInfo()
>> R version 2.7.0 (2008-04-22)
>> x86_64-unknown-linux-gnu
>>
>> locale:
>> LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C
>>
>> attached base packages:
>> [1] splines tools stats graphics grDevices utils datasets
>> [8] methods base
>>
>> other attached packages:
>> [1] makePlatformDesign_1.4.0
>> [2] affyio_1.8.0
>> [3] pd.hs.prompr.v01.3.ncbiv36.nr.harvard_1.4.0
>> [4] oligo_1.4.0
>> [5] oligoClasses_1.2.0
>> [6] AnnotationDbi_1.2.0
>> [7] preprocessCore_1.2.0
>> [8] RSQLite_0.6-9
>> [9] DBI_0.2-4
>> [10] Biobase_2.0.1
>> [11] affxparser_1.12.2
>>
>>
>> ------------------------------
>> Mark Robinson
>> Epigenetics Laboratory, Garvan
>> Bioinformatics Division, WEHI
>> e: m.robinson at garvan.org.au
>> e: mrobinson at wehi.edu.au
>> p: +61 (0)3 9345 2628
>> f: +61 (0)3 9347 0852
>>
>> _______________________________________________
>> Bioconductor mailing list
>> Bioconductor at stat.math.ethz.ch
>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
>
------------------------------
Mark Robinson
Epigenetics Laboratory, Garvan
Bioinformatics Division, WEHI
e: m.robinson at garvan.org.au
e: mrobinson at wehi.edu.au
p: +61 (0)3 9345 2628
f: +61 (0)3 9347 0852