[BioC] makePDPackage problem? -- Affy promoter arrays

Mark Robinson mrobinson at wehi.EDU.AU
Mon Aug 4 06:48:03 CEST 2008


Thanks Benilton.

Next problem is then:

 > bpmapFile <- "Hs_PromPR_v02-3_NCBIv36.affy.bpmap"
 > cifFile <- "Hs_PromPR_v02.cif"
 > obj <- new("AffyTilingPDInfoPkgSeed",
+           version="0.0.1",
+           author="Mark Robinson", email="mrobinson at ...",
+           biocViews="AnnotationData",
+           genomebuild="NCBI Build 36",
+           bpmapFile=bpmapFile,
+           cifFile=cifFile)
 > makePdInfoPackage(obj, destDir=".")
Creating package in ./pd.hs.prompr.v02.3.ncbiv36.affy
Error in sqliteExecStatement(conn, statement, bind.data, ...) :
   RS-DBI driver: (RS_SQLite_exec: could not execute: PRIMARY KEY must be unique)
Timing stopped at: 4.013 0.036 4.095

 > traceback()
14: .Call("RS_SQLite_exec", conId, statement, bind.data, PACKAGE = .SQLitePkgName)
13: sqliteExecStatement(conn, statement, bind.data, ...)
12: is(object, Cl)
11: is(object, Cl)
10: .valueClassTest(standardGeneric("dbSendPreparedQuery"), "DBIResult",
         "dbSendPreparedQuery")
9: dbSendPreparedQuery(db, sql, batchMat[!isPm, ])
8: loadUnits.affyTiling(db, qcunits, nx = nx, isQc = TRUE)
7: loadUnitsByBatch.affyTiling(db, bpmapFile, batch_size = batch_size,
        nx = nx)
6: eval(expr, envir, enclos)
5: eval(expr, envir = loc.frame)
4: ST(loadUnitsByBatch.affyTiling(db, bpmapFile, batch_size = batch_size,
        nx = nx))
3: buildPdInfoDb.affyTiling(object@bpmapFile, object@cifFile, dbFilePath,
        seqMatFile, batch_size = batch_size, verbose = !quiet)
2: makePdInfoPackage(obj, destDir = ".")
1: makePdInfoPackage(obj, destDir = ".")
 > sessionInfo()
R version 2.7.0 (2008-04-22)
x86_64-unknown-linux-gnu

locale:
LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C

attached base packages:
[1] splines   tools     stats     graphics  grDevices utils     datasets
[8] methods   base

other attached packages:
[1] pdInfoBuilder_1.4.0  oligo_1.4.0          oligoClasses_1.2.0
[4] AnnotationDbi_1.2.0  preprocessCore_1.2.0 affxparser_1.12.2
[7] RSQLite_0.6-9        DBI_0.2-4            Biobase_2.0.1

It seems others have had this problem, but I couldn't find a response:

http://article.gmane.org/gmane.science.biology.informatics.conductor/18440/
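
For readers unfamiliar with the error above: SQLite reports "PRIMARY KEY must be unique" when an insert repeats a value in a primary-key column. The following is only a minimal sketch of how one might look for repeated keys in a batch of rows before inserting them. The data frame, its column names (fid, x, y) and its values are hypothetical and are not taken from the BPMAP file or from pdInfoBuilder's actual tables.

```r
# Hypothetical batch of rows destined for a table keyed on `fid`
# (names and values are illustrative, not from the real BPMAP file).
batch <- data.frame(
  fid = c(101L, 102L, 103L, 102L),
  x   = c(10L, 11L, 12L, 11L),
  y   = c(5L, 5L, 5L, 5L)
)

# Flag every row whose key occurs more than once in the batch
# (the first occurrence as well as the later ones).
is_dup   <- batch$fid %in% batch$fid[duplicated(batch$fid)]
dup_rows <- batch[is_dup, ]
print(dup_rows)
```

If dup_rows is non-empty, inserting the batch into a table with a primary key on that column would be expected to fail in the way shown in the traceback.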


Cheers,
Mark




On 04/08/2008, at 1:10 PM, Benilton Carvalho wrote:

> Dear Mark,
>
> your finding is due to the fact you didn't provide a CIF file.
>
> But in any case, I strongly recommend that you use the pdInfoBuilder
> package instead, as the following message suggests:
>
> http://article.gmane.org/gmane.science.biology.informatics.conductor/19140
>
> best regards,
>
> b
>
>
> On Aug 3, 2008, at 9:02 PM, Mark Robinson wrote:
>
>> Hi all.
>>
>> This may not actually be a problem; it may be something silly that
>> I'm doing.  But something strikes me as odd.  Below is my
>> explanation.
>>
>>
>> I am working with the Affymetrix promoter tiling arrays.  My
>> starting point is a BPMAP file, which you can get from the Affy
>> library bundle at:
>>
>> http://www.affymetrix.com/products/arrays/specific/human_promoter.affx
>>
>> ... or I have also been using the re-worked BPMAP file you can get  
>> from the people who developed MAT:
>>
>> http://chip.dfci.harvard.edu/~wli/MAT/Download.htm
>>
>> So, I use 'makePDpackage' to create an R package, along the lines of:
>>
>> (I've just renamed the downloaded BPMAP file to have 'affy' or  
>> 'harvard' in the name of the file, so that I can remember how to  
>> tell them apart)
>>
>> > makePDpackage("Hs_PromPR_v02-3_NCBIv36.affy.bpmap",type="tiling",manufacturer="affymetrix",genome="hg18")
>> affymetrix tiling
>> The package will be called pd.hs.prompr.v02.3.ncbiv36.affy
>> Array identified as having 914 rows and 914 columns.
>> Creating package in /export/share/disk501/lab0605/mrobinson/projects/microarray/pd.hs.prompr.v02.3.ncbiv36.affy
>>
>> > makePDpackage("Hs_PromPR_v01-3_NCBIv36.NR.harvard.bpmap",type="tiling",manufacturer="affymetrix",genome="hg18")
>> affymetrix tiling
>> The package will be called pd.hs.prompr.v01.3.ncbiv36.nr.harvard
>> Array identified as having 914 rows and 914 columns.
>> Creating package in /export/share/disk501/lab0605/mrobinson/projects/microarray/pd.hs.prompr.v01.3.ncbiv36.nr.harvard
>>
>> ... then do R CMD INSTALL ... from the command prompt.  One thing
>> that strikes me as odd is that it identifies the array as having
>> 914 rows and columns.  See below.
>>
>> So, I read in the data for a single file and look at the raw data
>> for a particular X and Y location on the chip, and compare this to
>> what I get from 'readCel' in the affxparser package.
>>
>>
>> > rd<-read.celfiles("CEL/test1.CEL",pkgname="pd.hs.prompr.v01.3.ncbiv36.nr.harvard")
>> Platform design info loaded.
>> The intensity matrix will require 35.79 MB of RAM.
>> > pd<-getPD(rd)
>> > length(pd$X)
>> [1] 4286817
>> > dim(rd)
>> Features  Samples
>> 4286817        1
>> > w<-which(pd$X==1344 & pd$Y==854)
>> > w
>> [1] 1267129
>> > exprs(rd)[w,]
>> [1] 123
>>
>>
>> > library(affxparser)
>> > x<-readCel("CEL/test1.CEL",readXY=TRUE)
>> > x$header[c("rows","cols")]
>> $rows
>> [1] 2166
>>
>> $cols
>> [1] 2166
>> > w<-which(x$x==1344 & x$y==854)
>> > w
>> [1] 1851109
>> > x$intensities[w]
>> [1] 8074
>>
>> So, this chip does have 2166 rows and columns, which could be  
>> introducing problems in the indexing.  I haven't dug any deeper on  
>> this.
>>
>> Does anyone know what is happening?  Is this a problem in making the
>> package through 'makePDpackage', or do I misunderstand the
>> correspondence between the elements of a 'TilingFeatureSet' and the
>> corresponding 'platformDesign' object?
>>
>> Thanks!
>> Mark
>>
>> > sessionInfo()
>> R version 2.7.0 (2008-04-22)
>> x86_64-unknown-linux-gnu
>>
>> locale:
>> LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C
>>
>> attached base packages:
>> [1] splines   tools     stats     graphics  grDevices utils     datasets
>> [8] methods   base
>>
>> other attached packages:
>> [1] makePlatformDesign_1.4.0
>> [2] affyio_1.8.0
>> [3] pd.hs.prompr.v01.3.ncbiv36.nr.harvard_1.4.0
>> [4] oligo_1.4.0
>> [5] oligoClasses_1.2.0
>> [6] AnnotationDbi_1.2.0
>> [7] preprocessCore_1.2.0
>> [8] RSQLite_0.6-9
>> [9] DBI_0.2-4
>> [10] Biobase_2.0.1
>> [11] affxparser_1.12.2
>>
>>
>> ------------------------------
>> Mark Robinson
>> Epigenetics Laboratory, Garvan
>> Bioinformatics Division, WEHI
>> e: m.robinson at garvan.org.au
>> e: mrobinson at wehi.edu.au
>> p: +61 (0)3 9345 2628
>> f: +61 (0)3 9347 0852
>>
>
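
On the row/column discrepancy in the quoted message: a small sketch of the arithmetic may help. It assumes a convention in which x and y are zero-based and cells are numbered row by row, so that index = y * ncols + x + 1. That convention is an assumption of this sketch, as is the helper name xy_to_index, but it does reproduce the index reported by which() on the readCel output above for a 2166-column chip.

```r
# Assumed convention: zero-based (x, y), cells numbered row by row.
# xy_to_index is a hypothetical helper written for this illustration.
xy_to_index <- function(x, y, ncols) {
  stopifnot(all(x >= 0), all(x < ncols), all(y >= 0))
  y * ncols + x + 1
}

# With the 2166 columns reported in the CEL header:
idx <- xy_to_index(1344, 854, ncols = 2166)
print(idx)  # 1851109, the value of w in the readCel session above

# On a 914-column grid, x = 1344 is not even a valid column:
fits_914 <- 1344 < 914
print(fits_914)  # FALSE
```

Under this assumption, the (1344, 854) cell cannot exist on a 914 x 914 layout at all, which is consistent with the two sessions returning different indices and intensities for the same coordinates.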

------------------------------
Mark Robinson
Epigenetics Laboratory, Garvan
Bioinformatics Division, WEHI
e: m.robinson at garvan.org.au
e: mrobinson at wehi.edu.au
p: +61 (0)3 9345 2628
f: +61 (0)3 9347 0852


