[BioC] ChIPpeakAnno as.data.frame error

Mark Cowley m.cowley at garvan.org.au
Tue Mar 6 23:57:38 CET 2012


Dear Julie,
I already had ChIPpeakAnno in the Depends section of DESCRIPTION. here's my search() output after loading my 'macsR' package & thus the state of the search() when my function gets run:
> search()
 [1] ".GlobalEnv"                          
 [2] "package:macsR"                       
 [3] "package:ChIPpeakAnno"                
 [4] "package:limma"                       
 [5] "package:org.Hs.eg.db"                
 [6] "package:GO.db"                       
 [7] "package:RSQLite"                     
 [8] "package:DBI"                         
 [9] "package:AnnotationDbi"               
[10] "package:BSgenome.Ecoli.NCBI.20080805"
[11] "package:BSgenome"                    
[12] "package:GenomicRanges"               
[13] "package:Biostrings"                  
[14] "package:IRanges"                     
[15] "package:multtest"                    
[16] "package:Biobase"                     
[17] "package:biomaRt"                     
[18] "package:plyr"                        
[19] "package:stats"                       
[20] "package:graphics"                    
[21] "package:grDevices"                   
[22] "package:utils"                       
[23] "package:datasets"                    
[24] "package:methods"                     
[25] "Autoloads"                           
[26] "package:base"                     

I did add another library(ChIPpeakAnno) command in the relevant function, but since it was already loaded, there's no effect.
The error is due to not finding the S4 method defined in IRanges. I tried moving IRanges up the search() list, but it's a dependency of too many packaged to be moved.

I'll upgrade R (I've been putting this off for too long) and will report back

cheers,
Mark

On 07/03/2012, at 2:37 AM, Zhu, Lihua (Julie) wrote:

> Dear Mark,
> 
> Sorry that I meant adding library(ChIPpeakAnno) after importing all other
> packages. Also, I noticed that you are using an old version of ChIPpeakAnno.
> I am wondering what happens if you update your R and ChIPpeakAnno.
> 
> Best regards,
> 
> Julie
> 
> On 3/6/12 9:49 AM, "Julie Zhu" <julie.zhu at umassmed.edu> wrote:
> 
>> Dear Mark,
>> 
>> I am wondering what happens if you add library(RangedData) after all the
>> packages have been imported.
>> 
>> Best regards,
>> 
>> Julie
>> 
>> 
>> On 3/5/12 10:31 PM, "Mark Cowley" <m.cowley at garvan.org.au> wrote:
>> 
>>> Dear list,
>>> i've had an odd error for a few days now, which i'm really struggling to
>>> pinpoint. 
>>> Essentially, i'm trying to annotate ChIP-Seq peaks using ChIPpeakAnno. after
>>> doing annotatePeakInBatch, I convert to data.frame & I get this error:
>>> 
>>>> peaks.bed <- read.delim(bed.file, header=F)
>>>> peaks.RangedData <- BED2RangedData(peaks.bed, FALSE)
>>>> peaks.annot <- annotatePeakInBatch(peaks.RangedData, AnnotationData =
>>>> annoData, output="both", multiple=TRUE, maxgap=0)
>>>> peaks.annot <- as.data.frame(peaks.annot)
>>> Error in as.data.frame.default(peaks.annot) :
>>>  cannot coerce class 'structure("RangedData", package = "IRanges")' into a
>>> data.frame
>>> 
>>> A combination of debug, getMethod & traceback (see below) prove that my
>>> peaks.annot object is of class RangedData, the S4 method "as.data.frame" for
>>> "RangedData" exists, but is not actually called; instead
>>> as.data.frame.default
>>> is called, causing the error.
>>> Oddly enough, the error only occurs when run inside my function, which i
>>> think
>>> implies that my package NAMESPACE is incorrectly setup. To this end, I put
>>> this code in its own package, deliberately do not import any code from my
>>> other [potentially dodgy] packages & I still get this error.
>>> 
>>> My code & the debug trace/getMethod/traceback is below & any advice would be
>>> greatly recommended.
>>> cheers,
>>> Mark
>>> 
>>> 
>>> 
> #############################################################################>>
> #
>>> ##
>>> The function:
>>> #' annotate MACS peaks using ChIPpeakAnno
>>> #' 
>>> #' This compares the BED regions from running MACS to the nearest ENSEMBL
>>> #' Gene, and to CpG islands. This uses
>>> \code{\link[ChIPpeakAnno]{annotatePeakInBatch}}
>>> #' from ChIPpeakAnno, which has a very loose definition of close, ie
>>> #' within 0.5Mb.
>>> #' 
>>> #' This uses pre-built CpG island definitions downloaded from UCSC.
>>> #'
>>> #' @param dir the path to a MACS result directory
>>> #' @param genome one of \sQuote{hg18}, or \sQuote{hg19}
>>> #' @return nothing. it writes a <prefix>_peaks_annot.xls file. To interpret
>>> #' the new columns, see \code{\link[ChIPpeakAnno]{annotatePeakInBatch}}
>>> #' @author Mark Cowley, 2012-03-01
>>> #' @importFrom ChIPpeakAnno BED2RangedData annotatePeakInBatch
>>> #' @importFrom plyr join
>>> #' @export
>>> #' @examples
>>> #' \dontrun{
>>> #' dir <- "./MACS/TAMR.H3k4Me3.vs.MCF7.H3k4Me3/"
>>> #' macs.ChIPpeakAnno(dir, "hg19")
>>> #' }
>>> annotate.macs.output <- function(dir, genome=c("hg19", "hg18", "mm9", "rn4"),
>>> tss=TRUE, cpg=TRUE) {
>>> length(dir) == 1 && is.dir(dir) || stop("dir must be the path to a macs
>>> result
>>> directory.")
>>> SUPPORTED.GENOMES <- c("hg19", "hg18")
>>> genome <- genome[1]
>>> genome %in% SUPPORTED.GENOMES || stop("genome is unsupported.")
>>> 
> #############################################################################>>
> #
>>> ##
>>> saf <- getOption("stringsAsFactors")
>>> on.exit(options(stringsAsFactors=saf))
>>> options(stringsAsFactors=FALSE)
>>> 
> #############################################################################>>
> #
>>> ##
>>> 
>>> 
> #############################################################################>>
> #
>>> ##
>>> # which genome?
>>> annoData <- switch(genome,
>>> hg19={data(TSS.human.GRCh37);  TSS.human.GRCh37},
>>> hg18={data(TSS.human.NCBI36);  TSS.human.NCBI36},
>>> mm9={data(TSS.mouse.NCBIM37); TSS.mouse.NCBIM37},
>>> rn4={data(TSS.rat.RGSC3.4);   TSS.rat.RGSC3.4},
>>> stop("unsupported genome")
>>> )
>>> cpgData <- switch(genome,
>>> hg19={data(CpG.human.GRCh37);  CpG.human.GRCh37},
>>> hg18={data(CpG.human.NCBI36);  CpG.human.NCBI36},
>>> mm9={data(CpG.mouse.NCBIM37); CpG.mouse.NCBIM37},
>>> rn4={data(CpG.rat.RGSC3.4);   CpG.rat.RGSC3.4},
>>> stop("unsupported genome")
>>> )
>>> 
> #############################################################################>>
> #
>>> ##
>>> 
>>> 
> #############################################################################>>
> #
>>> ##
>>> # setup the input/output file paths
>>> # bed.file <- "TAMR.vs.MCF7.MBD2_peaks.bed"
>>> bed.file <- dir(dir, pattern="_peaks.bed", full=TRUE)
>>> file.exists(bed.file) || stop("bed.file must exist")
>>> 
>>> # macs.peaks.file <- "TAMR.vs.MCF7.MBD2_peaks.xls"
>>> macs.peaks.file <- rev(dir(dir, pattern="_peaks.xls", full=TRUE))[1]
>>> file.exists(macs.peaks.file) || stop("macs.peaks.file must exist")
>>> 
>>> result.file <- sub("_peaks.xls", "_peaks_annot.xls", macs.peaks.file)
>>> 
> #############################################################################>>
> #
>>> ##
>>> 
>>> 
> #############################################################################>>
> #
>>> ##
>>> # import peaks
>>> peaks.bed <- read.delim(bed.file, header=F)
>>> # head(peaks.bed)
>>> 
>>> peaks.RangedData <- BED2RangedData(peaks.bed, FALSE)
>>> # peaks.RangedData
>>> 
> #############################################################################>>
> #
>>> ##
>>> 
>>> #
>>> # import peaks & join to closest gene and CpG island
>>> # 
>>> peaks.stats <- import.macs.peaks.file(macs.peaks.file)
>>> # head(peaks.stats)
>>> 
>>> res <- peaks.stats
>>> 
>>> 
> #############################################################################>>
> #
>>> ##
>>> message("Annotating peaks to nearest ENSG TSS...")
>>> if( tss ) {
>>> peaks.annot <- annotatePeakInBatch(peaks.RangedData, AnnotationData =
>>> annoData, output="both", multiple=TRUE, maxgap=0)
>>> message("done")
>>> 
>>> peaks.annot <- as.data.frame(peaks.annot)
>>> peaks.annot <- peaks.annot[,-match(c("space", "start", "end", "width",
>>> "names"), colnames(peaks.annot))]
>>> library(org.Hs.eg.db)
>>> peaks.annot$GeneSymbol <- mget.chain(as.character(peaks.annot$feature),
>>> org.Hs.egENSEMBL2EG, org.Hs.egSYMBOL)
>>> peaks.annot <- rename.column(peaks.annot, "feature", "Ensembl ID")
>>> peaks.annot <- rename.column(peaks.annot, "peak", "name")
>>> peaks.annot <- move.column(peaks.annot, "strand", "insideFeature")
>>> peaks.annot <- move.column(peaks.annot, "GeneSymbol", "start_position")
>>> 
>>> res <- plyr::join(peaks.stats, peaks.annot, by="name", type="full")
>>> }
>>> else {
>>> message("skipping")
>>> }
>>> 
> #############################################################################>>
> #
>>> ##
>>> 
>>> 
> #############################################################################>>
> #
>>> ##
>>> message("Annotating peaks to nearest CpG...")
>>> if( cpg ) {
>>> peaks.annot.cpg <- annotatePeakInBatch(peaks.RangedData, AnnotationData =
>>> cpgData, output="both", multiple=T, maxgap=0)
>>> message("done")
>>> 
>>> peaks.annot.cpg <- as.data.frame(peaks.annot.cpg)
>>> peaks.annot.cpg <- peaks.annot.cpg[,-match(c("space", "start", "end",
>>> "width",
>>> "names"), colnames(peaks.annot.cpg))]
>>> library(org.Hs.eg.db)
>>> peaks.annot.cpg <- rename.column(peaks.annot.cpg, "feature", "CpG island")
>>> peaks.annot.cpg <- rename.column(peaks.annot.cpg, "peak", "name")
>>> peaks.annot.cpg <- move.column(peaks.annot.cpg, "strand", "insideFeature")
>>> 
>>> res <- plyr::join(res, peaks.annot.cpg, by="name", type="full")
>>> }
>>> else {
>>> message("skipping")
>>> }
>>> 
> #############################################################################>>
> #
>>> ##
>>> 
>>> write.xls(res, result.file)
>>> }
>>> 
> #############################################################################>>
> #
>>> ##
>>> 
>>> 
>>> 
> #############################################################################>>
> #
>>> ##
>>> Selection from Debug showing that the method is at least visible:
>>> ...
>>> Browse[2]> class(peaks.annot)
>>> [1] "RangedData"
>>> attr(,"package")
>>> [1] "IRanges"
>>> Browse[2]> getMethod("as.data.frame", "RangedData")
>>> Method Definition:
>>> 
>>> function (x, row.names = NULL, optional = FALSE, ...)
>>> {
>>>    if (!(is.null(row.names) || is.character(row.names)))
>>>        stop("'row.names'  must be NULL or a character vector")
>>>    if (!missing(optional) || length(list(...)))
>>>        warning("'optional' and arguments in '...' ignored")
>>>    data.frame(as.data.frame(ranges(x)), as.data.frame(values(x))[-1L],
>>>        row.names = row.names, stringsAsFactors = FALSE)
>>> }
>>> <environment: namespace:IRanges>
>>> 
>>> Signatures:
>>>        x       
>>> target  "RangedData"
>>> defined "RangedData"
>>> Browse[2]> 
>>> Error in as.data.frame.default(peaks.annot) :
>>>  cannot coerce class 'structure("RangedData", package = "IRanges")' into a
>>> data.frame
>>> traceback()
>>> 4: stop(gettextf("cannot coerce class '%s' into a data.frame",
>>> deparse(class(x))),
>>>       domain = NA)
>>> 3: as.data.frame.default(peaks.annot)
>>> 2: as.data.frame(peaks.annot)
>>> 1: annotate.macs.output("MACS/TAMR.MBD2.vs.MCF7.MBD2")
>>> 
> #############################################################################>>
> #
>>> ##
>>> 
>>> sessionInfo()
>>> R version 2.13.1 (2011-07-08)
>>> Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
>>> 
>>> locale:
>>> [1] en_AU.UTF-8/en_AU.UTF-8/C/C/en_AU.UTF-8/en_AU.UTF-8
>>> 
>>> attached base packages:
>>> [1] stats     graphics  grDevices utils     datasets  methods   base
>>> 
>>> other attached packages:
>>> [1] gdata_2.8.1                         macsR_1.0
>>> [3] ChIPpeakAnno_1.9.0                  limma_3.8.3
>>> [5] org.Hs.eg.db_2.5.0                  GO.db_2.5.0
>>> [7] RSQLite_0.9-4                       DBI_0.2-5
>>> [9] AnnotationDbi_1.14.1                BSgenome.Ecoli.NCBI.20080805_1.3.17
>>> [11] BSgenome_1.20.0                     GenomicRanges_1.4.6
>>> [13] Biostrings_2.20.2                   IRanges_1.10.5
>>> [15] multtest_2.8.0                      Biobase_2.12.2
>>> [17] biomaRt_2.8.1                       plyr_1.6
>>> 
>>> loaded via a namespace (and not attached):
>>> [1] gtools_2.6.2    MASS_7.3-14     RCurl_1.6-7     splines_2.13.1
>>> [5] survival_2.36-9 tools_2.13.1    XML_3.4-2
>>> 
>>> _______________________________________________
>>> Bioconductor mailing list
>>> Bioconductor at r-project.org
>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>> Search the archives:
>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>> 
>> _______________________________________________
>> Bioconductor mailing list
>> Bioconductor at r-project.org
>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> Search the archives:
>> http://news.gmane.org/gmane.science.biology.informatics.conductor
> 
> 



More information about the Bioconductor mailing list