[BioC] makePDPackage problem? -- Affy promoter arrays

Mark Robinson mrobinson at wehi.EDU.AU
Tue Aug 5 01:20:17 CEST 2008


Thanks again Benilton.

Is there something unusual about the BPMAP files for the Promoter
Tiling array?  I read these files regularly using 'readBpmap' from the
affxparser package.  See below.  With pdInfoBuilder, I get errors of
the sort:

-------
Unable to read file: Hs_PromPR_v02-3_NCBIv36.affy.bpmap, is it a BPMAP  
file?
-------
Unable to read file: Hs_PromPR_v01-3_NCBIv36.NR.harvard.bpmap, is it a  
BPMAP file?
-------

(I've installed pdInfoBuilder v1.5.1 (i.e. BioC 2.3) onto R 2.7.1,
since I don't have easy access to 2.8 ... though from the error
message below, it appears that the problem is the BPMAP file)

I've posted the BPMAP/CIF files I am using at:
http://bioinf.wehi.edu.au/folders/mrobinson/BPMAP/
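
One way to check the files independently of pdInfoBuilder is sketched
below, using affxparser alone.  This is a minimal sketch, not a
definitive diagnosis: 'canReadBpmap' is a hypothetical helper name (it
is not part of affxparser or pdInfoBuilder), and it assumes that a
failed read may surface either as an R error or as a NULL return
value, so both are treated as failure.

```r
# Hypothetical helper (not part of affxparser or pdInfoBuilder):
# returns TRUE only if the BPMAP header can be read.
canReadBpmap <- function(path) {
  if (!file.exists(path)) {
    return(FALSE)
  }
  if (!requireNamespace("affxparser", quietly = TRUE)) {
    stop("the affxparser package is required for this check")
  }
  # Assumption: a failed read either raises an error or returns NULL.
  hdr <- tryCatch(affxparser::readBpmapHeader(path),
                  error = function(e) NULL)
  !is.null(hdr)
}

# The two BPMAP files used in the sessions below
bpmapFiles <- c("Hs_PromPR_v02-3_NCBIv36.affy.bpmap",
                "Hs_PromPR_v01-3_NCBIv36.NR.harvard.bpmap")
print(vapply(bpmapFiles, canReadBpmap, logical(1)))
```

If this gives TRUE for both files in a fresh session, the files
themselves are probably readable, and the failure is more likely tied
to how they are opened during the package build.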

Cheers,
Mark




 > library(pdInfoBuilder)
Loading required package: Biobase
Loading required package: tools

Welcome to Bioconductor

   Vignettes contain introductory material. To view, type
   'openVignette()'. To cite Bioconductor, see
   'citation("Biobase")' and for packages 'citation(pkgname)'.

Loading required package: RSQLite
Loading required package: DBI
Loading required package: affxparser
Loading required package: oligo
Loading required package: splines
Loading required package: preprocessCore
Loading required package: AnnotationDbi
Loading required package: oligoClasses
This is the oligo package
 > #bpmapFile <- "Hs_PromPR_v01-3_NCBIv36.NR.harvard.bpmap"
 > bpmapFile <- "Hs_PromPR_v02-3_NCBIv36.affy.bpmap"
 > cifFile <- "Hs_PromPR_v02.cif"
 > obj <- new("AffyTilingPDInfoPkgSeed",
+           version="0.0.1",
+           author="Mark Robinson", email="mrobinson at ...",
+           biocViews="AnnotationData",
+           genomebuild="NCBI Build 36",
+           bpmapFile=bpmapFile,
+           cifFile=cifFile)
 > makePdInfoPackage(obj, destDir=".")
Creating package in ./pd.hs.prompr.v02.3.ncbiv36.affy
Processing unit 53 out of 86.
Processing unit 54 out of 86.
Processing unit 55 out of 86.
Processing unit 56 out of 86.
Processing unit 57 out of 86.
Processing unit 58 out of 86.
Processing unit 59 out of 86.
Processing unit 60 out of 86.
Processing unit 61 out of 86.
Processing unit 62 out of 86.
Processing unit 63 out of 86.
Processing unit 64 out of 86.
Processing unit 65 out of 86.
Processing unit 66 out of 86.
Processing unit 67 out of 86.
Processing unit 68 out of 86.
Processing unit 69 out of 86.
Processing unit 70 out of 86.
Processing unit 71 out of 86.
Processing unit 72 out of 86.
Processing unit 73 out of 86.
Processing unit 74 out of 86.
Processing unit 75 out of 86.
Processing unit 76 out of 86.
Processing unit 77 out of 86.
Processing unit 78 out of 86.
Processing unit 79 out of 86.
Unable to read file: Hs_PromPR_v02-3_NCBIv36.affy.bpmap, is it a BPMAP  
file?
Error in function (classes, fdef, mtable)  :
   unable to find an inherited method for function  
"dbGetPreparedQuery", for signature "SQLiteConnection", "character",  
"NULL"
Timing stopped at: 543.75 192.636 9180.075
 > library(pdInfoBuilder)
 > bpmapFile <- "Hs_PromPR_v01-3_NCBIv36.NR.harvard.bpmap"
 > #bpmapFile <- "Hs_PromPR_v02-3_NCBIv36.affy.bpmap"
 > cifFile <- "Hs_PromPR_v02.cif"
 > obj <- new("AffyTilingPDInfoPkgSeed",
+           version="0.0.1",
+           author="Mark Robinson", email="mrobinson at ...",
+           biocViews="AnnotationData",
+           genomebuild="NCBI Build 36",
+           bpmapFile=bpmapFile,
+           cifFile=cifFile)
 > makePdInfoPackage(obj, destDir=".")
Creating package in ./pd.hs.prompr.v01.3.ncbiv36.nr.harvard
Unable to read file: Hs_PromPR_v01-3_NCBIv36.NR.harvard.bpmap, is it a  
BPMAP file?
Processing unit 1 out of 0.
Unable to read file: Hs_PromPR_v01-3_NCBIv36.NR.harvard.bpmap, is it a  
BPMAP file?
Error in function (classes, fdef, mtable)  :
   unable to find an inherited method for function  
"dbGetPreparedQuery", for signature "SQLiteConnection", "character",  
"NULL"
Timing stopped at: 0.014 0.002 0.042
 > sessionInfo()
R version 2.7.1 (2008-06-23)
i386-apple-darwin8.10.1

locale:
en_AU.UTF-8/en_AU.UTF-8/C/C/en_AU.UTF-8/en_AU.UTF-8

attached base packages:
[1] splines   tools     stats     graphics  grDevices utils     datasets
[8] methods   base

other attached packages:
[1] pdInfoBuilder_1.5.1  oligo_1.4.0          oligoClasses_1.2.0
[4] AnnotationDbi_1.2.2  preprocessCore_1.2.0 affxparser_1.12.2
[7] RSQLite_0.6-9        DBI_0.2-4            Biobase_2.1.7




On 05/08/2008, at 1:09 AM, Benilton Carvalho wrote:

> Dear Mark,
>
> I recall fixing this problem before, but I can't remember if it was  
> for Tiling or another platform. Could you please try the devel  
> version?
>
> http://www.bioconductor.org/packages/2.3/bioc/html/pdInfoBuilder.html
>
> Thank you very much,
>
> b
>
> On Aug 4, 2008, at 12:48 AM, Mark Robinson wrote:
>
>> Thanks Benilton.
>>
>> Next problem is then:
>>
>> > bpmapFile <- "Hs_PromPR_v02-3_NCBIv36.affy.bpmap"
>> > cifFile <- "Hs_PromPR_v02.cif"
>> > obj <- new("AffyTilingPDInfoPkgSeed",
>> +           version="0.0.1",
>> +           author="Mark Robinson", email="mrobinson at ...",
>> +           biocViews="AnnotationData",
>> +           genomebuild="NCBI Build 36",
>> +           bpmapFile=bpmapFile,
>> +           cifFile=cifFile)
>> > makePdInfoPackage(obj, destDir=".")
>> Creating package in ./pd.hs.prompr.v02.3.ncbiv36.affy
>> Error in sqliteExecStatement(conn, statement, bind.data, ...) :
>> RS-DBI driver: (RS_SQLite_exec: could not execute: PRIMARY KEY must  
>> be unique)
>> Timing stopped at: 4.013 0.036 4.095
>>
>> > traceback()
>> 14: .Call("RS_SQLite_exec", conId, statement, bind.data, PACKAGE  
>> = .SQLitePkgName)
>> 13: sqliteExecStatement(conn, statement, bind.data, ...)
>> 12: is(object, Cl)
>> 11: is(object, Cl)
>> 10: .valueClassTest(standardGeneric("dbSendPreparedQuery"),  
>> "DBIResult",
>>       "dbSendPreparedQuery")
>> 9: dbSendPreparedQuery(db, sql, batchMat[!isPm, ])
>> 8: loadUnits.affyTiling(db, qcunits, nx = nx, isQc = TRUE)
>> 7: loadUnitsByBatch.affyTiling(db, bpmapFile, batch_size =  
>> batch_size,
>>      nx = nx)
>> 6: eval(expr, envir, enclos)
>> 5: eval(expr, envir = loc.frame)
>> 4: ST(loadUnitsByBatch.affyTiling(db, bpmapFile, batch_size =  
>> batch_size,
>>      nx = nx))
>> 3: buildPdInfoDb.affyTiling(object@bpmapFile, object@cifFile,
>> dbFilePath,
>>      seqMatFile, batch_size = batch_size, verbose = !quiet)
>> 2: makePdInfoPackage(obj, destDir = ".")
>> 1: makePdInfoPackage(obj, destDir = ".")
>> > sessionInfo()
>> R version 2.7.0 (2008-04-22)
>> x86_64-unknown-linux-gnu
>>
>> locale:
>> LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C
>>
>> attached base packages:
>> [1] splines   tools     stats     graphics  grDevices utils     datasets
>> [8] methods   base
>>
>> other attached packages:
>> [1] pdInfoBuilder_1.4.0  oligo_1.4.0          oligoClasses_1.2.0
>> [4] AnnotationDbi_1.2.0  preprocessCore_1.2.0 affxparser_1.12.2
>> [7] RSQLite_0.6-9        DBI_0.2-4            Biobase_2.0.1
>>
>> It seems others have had this problem, but I couldn't find a  
>> response:
>>
>> http://article.gmane.org/gmane.science.biology.informatics.conductor/18440/
>>
>>
>> Cheers,
>> Mark
>>
>>
>>
>>
>> On 04/08/2008, at 1:10 PM, Benilton Carvalho wrote:
>>
>>> Dear Mark,
>>>
>>> your finding is due to the fact you didn't provide a CIF file.
>>>
>>> In any case, I strongly recommend that you use the pdInfoBuilder
>>> package instead, as the following message suggests:
>>>
>>> http://article.gmane.org/gmane.science.biology.informatics.conductor/19140
>>>
>>> best regards,
>>>
>>> b
>>>
>>>
>>> On Aug 3, 2008, at 9:02 PM, Mark Robinson wrote:
>>>
>>>> Hi all.
>>>>
>>>> This may not actually be a problem; it may be something silly
>>>> that I'm doing.  But something strikes me as odd, as I explain
>>>> below.
>>>>
>>>>
>>>> I am working with the Affymetrix promoter tiling arrays.  My  
>>>> starting point is a BPMAP file, which you can get from Affy  
>>>> library bundle at:
>>>>
>>>> http://www.affymetrix.com/products/arrays/specific/human_promoter.affx
>>>>
>>>> ... or I have also been using the re-worked BPMAP file you can  
>>>> get from the people who developed MAT:
>>>>
>>>> http://chip.dfci.harvard.edu/~wli/MAT/Download.htm
>>>>
>>>> So, I use 'makePDpackage' to create an R package, along the lines
>>>> of:
>>>>
>>>> (I've just renamed the downloaded BPMAP file to have 'affy' or  
>>>> 'harvard' in the name of the file, so that I can remember how to  
>>>> tell them apart)
>>>>
>>>> > makePDpackage("Hs_PromPR_v02-3_NCBIv36.affy.bpmap",type="tiling",manufacturer="affymetrix",genome="hg18")
>>>> affymetrix tiling
>>>> The package will be called pd.hs.prompr.v02.3.ncbiv36.affy
>>>> Array identified as having 914 rows and 914 columns.
>>>> Creating package in /export/share/disk501/lab0605/mrobinson/projects/microarray/pd.hs.prompr.v02.3.ncbiv36.affy
>>>>
>>>> > makePDpackage("Hs_PromPR_v01-3_NCBIv36.NR.harvard.bpmap",type="tiling",manufacturer="affymetrix",genome="hg18")
>>>> affymetrix tiling
>>>> The package will be called pd.hs.prompr.v01.3.ncbiv36.nr.harvard
>>>> Array identified as having 914 rows and 914 columns.
>>>> Creating package in /export/share/disk501/lab0605/mrobinson/projects/microarray/pd.hs.prompr.v01.3.ncbiv36.nr.harvard
>>>>
>>>> ... then do R CMD INSTALL ... from the command prompt.  One thing  
>>>> that strikes me as odd is the fact that it recognizes it as  
>>>> having 914 rows and columns.  See below.
>>>>
>>>> So, I read in the data for a single file, look at the raw data
>>>> for a particular X and Y location on the chip, and compare this
>>>> to what I get from 'readCel' in the affxparser package.
>>>>
>>>>
>>>> > rd<-read.celfiles("CEL/test1.CEL",pkgname="pd.hs.prompr.v01.3.ncbiv36.nr.harvard")
>>>> Platform design info loaded.
>>>> The intensity matrix will require 35.79 MB of RAM.
>>>> > pd<-getPD(rd)
>>>> > length(pd$X)
>>>> [1] 4286817
>>>> > dim(rd)
>>>> Features  Samples
>>>> 4286817        1
>>>> > w<-which(pd$X==1344 & pd$Y==854)
>>>> > w
>>>> [1] 1267129
>>>> > exprs(rd)[w,]
>>>> [1] 123
>>>>
>>>>
>>>> > library(affxparser)
>>>> > x<-readCel("CEL/test1.CEL",readXY=TRUE)
>>>> > x$header[c("rows","cols")]
>>>> $rows
>>>> [1] 2166
>>>>
>>>> $cols
>>>> [1] 2166
>>>> > w<-which(x$x==1344 & x$y==854)
>>>> > w
>>>> [1] 1851109
>>>> > x$intensities[w]
>>>> [1] 8074
>>>>
>>>> So, this chip does have 2166 rows and columns, which could be  
>>>> introducing problems in the indexing.  I haven't dug any deeper  
>>>> on this.
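
The mismatch can be illustrated with a little arithmetic.  This is
only a sketch under stated assumptions: it takes the (x, y)
coordinates to be zero-based and the cells to be stored row by row,
and 'xyToIndex' is a hypothetical helper, not a function from oligo or
affxparser.  Under those assumptions a 2166-column grid reproduces the
index 1851109 found in the readCel output above, whereas x = 1344
cannot exist on a 914-column grid at all.

```r
# Hypothetical helper: 1-based cell index from zero-based (x, y),
# assuming cells are stored row by row (row-major order).
xyToIndex <- function(x, y, ncol) {
  stopifnot(x >= 0, x < ncol, y >= 0)
  y * ncol + x + 1
}

# 2166 columns, as reported in the CEL header
idx <- xyToIndex(1344, 854, ncol = 2166)
print(idx)

# 914 columns, as reported when the package was built:
# x = 1344 is out of range, so the helper refuses and we record NA
res <- tryCatch(xyToIndex(1344, 854, ncol = 914),
                error = function(e) NA)
print(res)
```

The index 1267129 returned through the platform design package does
not follow from this formula for either grid size, which is
consistent with the package having been built for the wrong
dimensions.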
>>>>
>>>> Does anyone know what is happening?  Is this a problem in making
>>>> the package through 'makePDpackage', or do I misunderstand the
>>>> correspondence between the elements of a 'TilingFeatureSet' and
>>>> the corresponding 'platformDesign' object?
>>>>
>>>> Thanks!
>>>> Mark
>>>>
>>>> > sessionInfo()
>>>> R version 2.7.0 (2008-04-22)
>>>> x86_64-unknown-linux-gnu
>>>>
>>>> locale:
>>>> LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C
>>>>
>>>> attached base packages:
>>>> [1] splines   tools     stats     graphics  grDevices utils     datasets
>>>> [8] methods   base
>>>>
>>>> other attached packages:
>>>> [1] makePlatformDesign_1.4.0
>>>> [2] affyio_1.8.0
>>>> [3] pd.hs.prompr.v01.3.ncbiv36.nr.harvard_1.4.0
>>>> [4] oligo_1.4.0
>>>> [5] oligoClasses_1.2.0
>>>> [6] AnnotationDbi_1.2.0
>>>> [7] preprocessCore_1.2.0
>>>> [8] RSQLite_0.6-9
>>>> [9] DBI_0.2-4
>>>> [10] Biobase_2.0.1
>>>> [11] affxparser_1.12.2
>>>>
>>>>
>>>> ------------------------------
>>>> Mark Robinson
>>>> Epigenetics Laboratory, Garvan
>>>> Bioinformatics Division, WEHI
>>>> e: m.robinson at garvan.org.au
>>>> e: mrobinson at wehi.edu.au
>>>> p: +61 (0)3 9345 2628
>>>> f: +61 (0)3 9347 0852
>>>>
>>>> _______________________________________________
>>>> Bioconductor mailing list
>>>> Bioconductor at stat.math.ethz.ch
>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>>> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
>>>
>>
>> ------------------------------
>> Mark Robinson
>> Epigenetics Laboratory, Garvan
>> Bioinformatics Division, WEHI
>> e: m.robinson at garvan.org.au
>> e: mrobinson at wehi.edu.au
>> p: +61 (0)3 9345 2628
>> f: +61 (0)3 9347 0852
>> ------------------------------
>>
>>
>>
>>
>

------------------------------
Mark Robinson
Epigenetics Laboratory, Garvan
Bioinformatics Division, WEHI
e: m.robinson at garvan.org.au
e: mrobinson at wehi.edu.au
p: +61 (0)3 9345 2628
f: +61 (0)3 9347 0852


