[BioC] ChIPpeakAnno::getEnrichedGo crashes but I don't know why

Thu Jan 6 06:24:37 CET 2011

Julie,

In the end, I did a quick detour through Perl and getEnrichedGo worked 
as expected. Thanks for the help.

Eric

Eric Cabot wrote:
> Julie,
>
> I don't know how to do that in R. I guess I can always export the
> annotations, do the association in Perl and re-import them as a vector
> to use with getEnrichedGo.
>
> Eric
>
> Zhu, Lihua (Julie) wrote:
>> Eric,
>>
>> You could convert the exon IDs associated with the peaks to ensemble gene
>> IDs and input these ensemble gene IDs to the getEnrichedGO function.
>>
>> Best regards,
>>
>> Julie
>>
>>
>> On 1/5/11 4:09 PM, "Eric Cabot" <elcabot at gmail.com> wrote:
>>
>>> Hi Julie,
>>>
>>> It may be a while before I get back to you on this, because I did my
>>> mapping and ChIP-Seq analysis with Hg19 (NCBI 37), not Hg18 (NCBI 36).
>>> I'm also a little concerned about using transcription start site
>>> annotations rather than exons, because the the binding domains are not
>>> thought to be restricted to only promoters. Any suggestions?
>>>
>>> Eric
>>>
>>>
>>>
>>> Zhu, Lihua (Julie) wrote:
>>>> Eric,
>>>>
>>>> The annotated dataset has exon ID instead of gene ID while the
>>>> getEnrichedGO
>>>> is expecting feature_id_type="ensembl_gene_id". For a list of supported
>>>> feature_id_type, please type ?getEnrichedGO.
>>>>
>>>> To use getEnrichedGO function, first get the TSS annotation.
>>>>
>>>> TSS.human.NCBI36 = getAnnotation(ENSEMBLE_GENES_MART,
>>>> featureType="TSS")
>>>>
>>>> or use the build in TSS as
>>>>
>>>> data(TSS.human.NCBI36)
>>>>
>>>> Then annotate your peaks with TSS.human.NCBI36 followed by
>>>> getEnrichedGO
>>>> call.
>>>>
>>>> Please let me know if this works for you.
>>>>
>>>> Best regards,
>>>>
>>>> Julie
>>>>
>>>>
>>>>
>>>>
>>>> On 1/5/11 12:29 PM, "Eric Cabot" <elcabot at gmail.com> wrote:
>>>>
>>>>> Hi Julie,
>>>>>
>>>>> Thank you for your response.
>>>>>
>>>>> Here is the sessionInfo and traceback output and also a few lines of
>>>>> "my_annotated_regions".
>>>>>
>>>>> Regards,
>>>>>
>>>>> Eric Cabot
>>>>>
>>>>>> sessionInfo()
>>>>> R version 2.12.1 (2010-12-16)
>>>>> Platform: x86_64-unknown-linux-gnu (64-bit)
>>>>>
>>>>> locale:
>>>>> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
>>>>> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
>>>>> [5] LC_MONETARY=C LC_MESSAGES=en_US.UTF-8
>>>>> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
>>>>> [9] LC_ADDRESS=C LC_TELEPHONE=C
>>>>> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>>>>>
>>>>> attached base packages:
>>>>> [1] stats graphics grDevices utils datasets methods base
>>>>>
>>>>> other attached packages:
>>>>> [1] ChIPpeakAnno_1.6.0 limma_3.6.9
>>>>> [3] org.Hs.eg.db_2.4.6 GO.db_2.4.5
>>>>> [5] RSQLite_0.9-4 DBI_0.2-5
>>>>> [7] AnnotationDbi_1.12.0
>>>>> BSgenome.Ecoli.NCBI.20080805_1.3.16
>>>>> [9] BSgenome_1.18.2 GenomicRanges_1.2.2
>>>>> [11] Biostrings_2.18.2 IRanges_1.8.8
>>>>> [13] multtest_2.6.0 Biobase_2.10.0
>>>>> [15] biomaRt_2.6.0
>>>>>
>>>>> loaded via a namespace (and not attached):
>>>>> [1] MASS_7.3-9 RCurl_1.5-0 splines_2.12.1 survival_2.36-2
>>>>> [5] tools_2.12.1 XML_3.2-0
>>>>> my_enrichedGO<-getEnrichedGO(my_annotated_regions,orgAnn="org.Hs.eg.db",maxP
>>>>>
>>>>> =0
>>>>> .01,multiAdj=FALSE,minGOterm=1,feature_id_type="ensembl_gene_id")
>>>>> Error in if (class(go.ids) != "matrix" | dim(go.ids)[2] < 4) { :
>>>>> argument is of length zero
>>>>>> traceback()
>>>>> 2: addAncestors(this.GO[this.GO[, 3] == "BP", ], "bp")
>>>>> 1: getEnrichedGO(FC2_annotated_regions, orgAnn = "org.Hs.eg.db",
>>>>> maxP = 0.01, multiAdj = FALSE, minGOterm = 1, feature_id_type =
>>>>> "ensembl_gene_id")
>>>>>
>>>>>
>>>>>
>>>>>> as.data.frame(my_annotated_regions[1:15,])
>>>>> space start end width names peak strand
>>>>> 1 1 241997936 241998205 270 R-10060 ENSE00001749374 R-10060 +
>>>>> 2 1 237109743 237110002 260 R-10082 ENSE00001643382 R-10082 +
>>>>> 3 1 236080267 236080415 149 R-10086 ENSE00001807176 R-10086 +
>>>>> 4 1 233853245 233853514 270 R-10096 ENSE00001776382 R-10096 +
>>>>> 5 1 233727956 233728104 149 R-10097 ENSE00001442190 R-10097 +
>>>>> 6 1 230728554 230728823 270 R-10108 ENSE00001731401 R-10108 +
>>>>> 7 1 229687129 229687277 149 R-10113 ENSE00001439385 R-10113 +
>>>>> 8 1 228943263 228943412 150 R-10121 ENSE00001903546 R-10121 +
>>>>> 9 1 218358885 218359176 292 R-10157 ENSE00001439386 R-10157 +
>>>>> 10 1 212254259 212254408 150 R-10179 ENSE00001624346 R-10179 +
>>>>> 11 1 210086264 210086513 250 R-10184 ENSE00001903225 R-10184 +
>>>>> 12 1 209863549 209863698 150 R-10185 ENSE00001336255 R-10185 +
>>>>> 13 1 207437117 207437264 148 R-10190 ENSE00001742112 R-10190 +
>>>>> 14 1 190352400 190352548 149 R-10246 ENSE00001782518 R-10246 +
>>>>> 15 1 184432607 184432755 149 R-10260 ENSE00001283926 R-10260 +
>>>>> feature start_position end_position insideFeature
>>>>> distancetoFeature
>>>>> 1 ENSE00001749374 241995237 241996089 downstream
>>>>> 2699
>>>>> 2 ENSE00001643382 237144639 237145008 upstream
>>>>> -34896
>>>>> 3 ENSE00001807176 236078715 236078821 downstream
>>>>> 1552
>>>>> 4 ENSE00001776382 233807017 233807237 downstream
>>>>> 46228
>>>>> 5 ENSE00001442190 233749750 233750272 upstream
>>>>> -21794
>>>>> 6 ENSE00001731401 230728406 230728586 overlapEnd
>>>>> 148
>>>>> 7 ENSE00001439385 229685652 229685769 downstream
>>>>> 1477
>>>>> 8 ENSE00001903546 228882063 228882416 downstream
>>>>> 61200
>>>>> 9 ENSE00001439386 218303137 218303294 downstream
>>>>> 55748
>>>>> 10 ENSE00001624346 212253973 212254092 downstream
>>>>> 286
>>>>> 11 ENSE00001903225 210111538 210111622 upstream
>>>>> -25274
>>>>> 12 ENSE00001336255 209859550 209859630 downstream
>>>>> 3999
>>>>> 13 ENSE00001742112 207438342 207438381 upstream
>>>>> -1225
>>>>> 14 ENSE00001782518 190331193 190331400 downstream
>>>>> 21207
>>>>> 15 ENSE00001283926 184446520 184446737 upstream
>>>>> -13913
>>>>> shortestDistance fromOverlappingOrNearest
>>>>> 1 1847 NearestStart
>>>>> 2 34637 NearestStart
>>>>> 3 1446 NearestStart
>>>>> 4 46008 NearestStart
>>>>> 5 21646 NearestStart
>>>>> 6 32 NearestStart
>>>>> 7 1360 NearestStart
>>>>> 8 60847 NearestStart
>>>>> 9 55591 NearestStart
>>>>> 10 167 NearestStart
>>>>> 11 25025 NearestStart
>>>>> 12 3919 NearestStart
>>>>> 13 1078 NearestStart
>>>>> 14 21000 NearestStart
>>>>> 15 13765 NearestStart
>>>>>
>>>>>
>>>>> Zhu, Lihua (Julie) wrote:
>>>>>> Hi Eric,
>>>>>>
>>>>>> Could you please post the session information with sessionInfo()
>>>>>> command?
>>>>>> Could you please also send a few ensembl IDs in your annotated
>>>>>> dataset?
>>>>>> Thanks!
>>>>>>
>>>>>> Best regards,
>>>>>>
>>>>>> Julie
>>>>>>
>>>>>>
>>>>>> On 1/4/11 6:51 PM, "Eric Cabot" <elcabot at gmail.com> wrote:
>>>>>>
>>>>>>> I am a relatively new Bioconductor user and I am trying to
>>>>>>> analyze some
>>>>>>> ChIP-seq results that came from QuEST using the ChIPpeakAnno
>>>>>>> package.
>>>>>>>
>>>>>>> After importing the regions of interest into RangedData objects
>>>>>>> and doing
>>>>>>> the following:
>>>>>>>
>>>>>>>
>>>>>> ENSEMBLE_GENES_MART<-useMart(biomart="ensembl",dataset="hsapiens_gene_ensem
>>>>>>
>>>>>> bl
>>>>>> ">
>>>>>> )
>>>>>>> ENSEMBL_ExonPlus_Annotation<-getAnnotation(ENSEMBLE_GENES_MART,
>>>>>>> featureType="ExonPlusUtr")
>>>>>>>
>>>>>>>
>>>>>>> I had no problem annotating and generating a Venn diagram to show
>>>>>>> the
>>>>>>> overlaps between my three sets of peaks. To annotate, I used:
>>>>>>>
>>>>>>> annotated_regions=annotatePeakInBatch(myranged,
>>>>>>> AnnotationData=ENSEMBL_ExonPlus_Annotation)
>>>>>>>
>>>>>>>
>>>>>>> But I cannot seem to get the getEnrichedGo method to work on this
>>>>>>> (or my
>>>>>>> other two annotated regions). Here is a typical command line:
>>>>>>>
>>>>>>>
>>>>>>> my_enrichedGO<-getEnrichedGO(annotated_regions,orgAnn="org.Hs.eg.db",maxP=
>>>>>>>
>>>>>>> 0.
>>>>>>> 01
>>>>>>> ,multiAdj=TRUE,minGOterm=1,
>>>>>>> multiAdjMethod="BH",feature_id_type="ensembl_gene_id")
>>>>>>>
>>>>>>> and here is a typical error message:
>>>>>>>
>>>>>>> enrichedGO<-getEnrichedGO(annotated_regions,orgAnn="org.Hs.eg.db",maxP=0.0
>>>>>>>
>>>>>>> 1,
>>>>>>> mu
>>>>>>> ltiAdj=TRUE,minGOterm=1,feature_id_type="ensembl_gene_id")
>>>>>>> Error in if (class(go.ids) != "matrix" | dim(go.ids)[2] < 4) { :
>>>>>>> argument is of length zero
>>>>>>>
>>>>>>>
>>>>>>> Which leads me to ask:
>>>>>>>
>>>>>>> 1) Is this error message supposed to be meaningful to me-i.e. a
>>>>>>> user-or is
>>>>>>> it something that I should be sending to the developer of the
>>>>>>> package?
>>>>>>>
>>>>>>> 2) Is there anything obvious from this that suggests what corrective
>>>>>>> action I should be taking?
>>>>>>>
>>>>>>>
>>>>>>> Eric Cabot
>>>>>>> University of Wisconsin
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> Bioconductor mailing list
>>>>>>> Bioconductor at r-project.org
>>>>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>>>>>> Search the archives:
>>>>>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>>>>>>
>>>>
>>
>>