[BioC] How to generate an annotation library without CDF file?

Benilton Carvalho beniltoncarvalho at gmail.com
Mon Apr 16 15:59:05 CEST 2012


Hi Shu-wen,

I'm moving this back to the mailing list, so everyone can benefit from
this discussion and even provide you with alternatives.

Regarding the probeset.csv file, I'd expect Affymetrix to provide it.
You should contact them about it.
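
In the meantime, the "undefined columns selected" error in the quoted output means that at least one column the parser asks for is absent from the file. A minimal base-R sketch for checking a candidate file's header is below. The helper missing_columns() is hypothetical, and the names in expected_cols are placeholders for illustration only, not the list pdInfoBuilder actually selects, so treat this as a rough diagnostic rather than a fix.

```r
# Hypothetical helper: report which expected columns are absent from a CSV.
# Lines starting with "#" (such as "#%" header comments) are skipped
# through comment.char.
missing_columns <- function(csv_file, expected_cols) {
  header <- read.csv(csv_file, comment.char = "#", nrows = 1,
                     check.names = FALSE, stringsAsFactors = FALSE)
  setdiff(expected_cols, colnames(header))
}

# ASSUMPTION: placeholder column names for illustration only; consult the
# pdInfoBuilder source for the columns it actually selects.
expected_cols <- c("probeset_id", "seqname", "strand", "start", "stop",
                   "total_probes", "category")

# Self-contained demonstration on a temporary file that has probe
# coordinates but no annotation columns.
demo_file <- tempfile(fileext = ".probeset.csv")
writeLines(c("#%example-header=demo",
             "probeset_id,probe_id,x,y",
             "1001,1,0,0"),
           demo_file)

absent <- missing_columns(demo_file, expected_cols)
print(absent)
```

Pointing missing_columns() at the real file (the `prob` path in the quoted commands) would show which of the assumed columns a reformatted file lacks.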

benilton

On 15 April 2012 04:13, Shu-wen Huang <shuang at chromatininc.com> wrote:
> makePdInfoPackage requires three files: PGF, CLF, and probeset.csv. However, among the files I was given, there is no .probeset.csv. Can any of the files below replace it?
>
> Here are all the files that came with the CEL files.
> Can any other file, such as bgp, cif, grc, mps, gcc, or smd, replace it?
>
>
> I tried to reformat the .bgp file as .probeset.csv. After running the commands below, I received the error shown at the bottom.
>
>>library(pdInfoBuilder)
>>baseDir <- "/home/shuang/Analysis/R/dataset_20120413"
>>(pgf <- list.files(baseDir, pattern = ".pgf",full.names = TRUE))
>>(clf <- list.files(baseDir, pattern = ".clf", full.names = TRUE))
>>(prob <- list.files(baseDir, pattern = ".probeset.csv", full.names = TRUE))
>>seed <- new("AffyGenePDInfoPkgSeed",pgfFile = pgf, clfFile = clf, probeFile = prob, biocViews = "AnnotationData", organism = "Sorghum", species = "Bicolor")
>>makePdInfoPackage(seed, destDir = ".")
>
>
> Parsing file: Sorgh-WTa520972F.pgf... OK
> Parsing file: Sorgh-WTa520972F.clf... OK
> Creating initial table for probes... OK
> Creating dictionaries... OK
> Parsing file: Sorgh-WTa520972F.probeset.csv... OK
> Error in `[.data.frame`(probesets, , cols) : undefined columns selected
> In addition: Warning messages:
> 1: In is.na(x) : is.na() applied to non-(list or vector) of type 'NULL'
> 2: In is.na(x) : is.na() applied to non-(list or vector) of type 'NULL'
>
>
>
> -----Original Message-----
> From: Benilton Carvalho [mailto:beniltoncarvalho at gmail.com]
> Sent: Saturday, April 14, 2012 8:22 PM
> To: Shu-wen Huang
> Cc: bioconductor at r-project.org
> Subject: Re: [BioC] How to generate an annotation library without CDF file?
>
> You did misunderstand.
>
> 1) Get all your files
> 2) Install the pdInfoBuilder package
> 3) Use the example in Section 8 of the pdInfoBuilder vignette (http://bioconductor.org/packages/release/bioc/vignettes/pdInfoBuilder/inst/doc/BuildingPDInfoPkgs.pdf)
> 4) Install the resulting annotation package
> 5) Install oligo
> 6) Follow Sections 1 and 4 of the document I suggested in my first message.
>
> b
>
> On 15 April 2012 02:16, Shu-wen Huang <shuang at chromatininc.com> wrote:
>> I tried to use rma(), as shown below. However, it seems I can't get around the need for sorghwta520972fcdf. Or did I misunderstand what you suggested?
>>
>>>eset = rma(dat)
>>
>> Error in getCdfInfo(object) :
>>  Could not obtain CDF environment, problems encountered:
>> Specified environment does not contain Sorgh-WTa520972F
>> Library - package sorghwta520972fcdf not installed
>> Bioconductor - sorghwta520972fcdf not available
>>
>>
>> Sw
>>
>>
>> -----Original Message-----
>> From: Benilton Carvalho [mailto:beniltoncarvalho at gmail.com]
>> Sent: Saturday, April 14, 2012 8:08 PM
>> To: Shu-wen Huang
>> Cc: bioconductor at r-project.org
>> Subject: Re: [BioC] How to generate an annotation library without CDF file?
>>
>> With the files you currently have, you could generate the appropriate annotation package and run the preprocessing steps through oligo, as shown in the sections of the document I suggested initially.
>> However, I'm not sure gcrma() would work with oligo objects - in the meantime, you could use rma(). Maybe Jean can provide further insight...
>>
>> b
>>
>> On 15 April 2012 01:55, Shu-wen Huang <shuang at chromatininc.com> wrote:
>>> Below is my code. It seems I need to somehow generate the Sorgh-WTa520972F library in order to do normalization. However, I don't have a CDF file, only files in many other formats.
>>>
>>>
>>>>library(affy)
>>>>library(limma)
>>>>library(gcrma)
>>>>library(genefilter)
>>>
>>> ## read the Targets.txt file ##
>>>>setwd("all")
>>>>targets = readTargets()
>>>
>>> ## create a phenodata object and attach it to the data ##
>>>>myCovs = data.frame(targets)
>>>>rownames(myCovs) = myCovs[,1]
>>>>nlev = as.numeric(apply(myCovs, 2, function(x) nlevels(as.factor(x))))
>>>>metadata = data.frame(labelDescription = paste(colnames(myCovs), ": ", nlev, " level", ifelse(nlev==1,"","s"), sep=""), row.names=colnames(myCovs))
>>>>phenoData = new("AnnotatedDataFrame", data=myCovs, varMetadata=metadata)
>>>
>>> ## read the data, attach the phenodata and normalize it using gcRMA ##
>>>>dat = ReadAffy(sampleNames = myCovs$Name, filenames = myCovs$Celfile, phenoData = phenoData, celfile.path = "celfiles")
>>>>eset = gcrma(dat, verbose = FALSE)
>>>
>>>
>>>
>>> ############ error messages received ############
>>> Error in getCdfInfo(object) :
>>>  Could not obtain CDF environment, problems encountered:
>>> Specified environment does not contain Sorgh-WTa520972F
>>> Library - package sorghwta520972fcdf not installed
>>> Bioconductor - sorghwta520972fcdf not available
>>>
>>>
>>>
>>>
>>>
>>> -----Original Message-----
>>> From: Benilton Carvalho [mailto:beniltoncarvalho at gmail.com]
>>> Sent: Saturday, April 14, 2012 7:48 PM
>>> To: Shu-wen Huang
>>> Cc: bioconductor at r-project.org
>>> Subject: Re: [BioC] How to generate an annotation library without CDF file?
>>>
>>> To generate an annotation package, you should use the PGF file... and one alternative for this is the pdInfoBuilder package... but without further details, it's hard to go on...
>>>
>>> benilton
>>>
>>> On 15 April 2012 01:40, Shu-wen Huang <shuang at chromatininc.com> wrote:
>>>> Hi benilton,
>>>>
>>>> Our group generated a particular list of probes. It's not available in BioConductor. Do you mean I should try to generate a library from the PGF file? Thanks!
>>>>
>>>>
>>>> Sw
>>>>
>>>> -----Original Message-----
>>>> From: Benilton Carvalho [mailto:beniltoncarvalho at gmail.com]
>>>> Sent: Saturday, April 14, 2012 6:29 PM
>>>> To: Shu-wen Huang
>>>> Cc: bioconductor at r-project.org
>>>> Subject: Re: [BioC] How to generate an annotation library without CDF file?
>>>>
>>>> PGFs are given for Gene/Exon ST arrays... and chances are that the
>>>> package you need is already on BioConductor. (btw, a CDF for such
>>>> array design is not recommended by Affymetrix themselves)
>>>>
>>>> Check Sections 1 and 4 of the document below:
>>>>
>>>> http://bioconductor.org/packages/release/bioc/vignettes/oligo/inst/doc/primer.pdf
>>>>
>>>> benilton


