[BioC] ChIPpeakAnno to find peaks nearest to miRNA

Paolo Kunderfranco paolo.kunderfranco at gmail.com
Mon Jul 30 10:51:51 CEST 2012


Hello,
OK, perfect, it is working fine now.
Thanks again for your valuable help,
Paolo


2012/7/27 Ou, Jianhong <Jianhong.Ou at umassmed.edu>:
> Hi Paolo,
>
> Because the org database does not contain information for ENSMUSG00000089245, addGeneIDs throws an error.
> In this case, it is better to use biomaRt to get the annotation. Please try:
>
> library(biomaRt)       # useMart(), getBM()
> library(ChIPpeakAnno)  # condenseMatrixByColnames()
>
> # Collect the unique, non-empty Ensembl gene IDs assigned to the peaks
> feature_ids <- unique(annotatedPeak$feature)
> feature_ids <- feature_ids[!is.na(feature_ids)]
> feature_ids <- feature_ids[feature_ids != ""]
>
> # Query Ensembl for the miRNA annotation of those genes
> mart <- useMart(biomart = "ensembl", dataset = "mmusculus_gene_ensembl")
> IDs2Add <- getBM(attributes = c("ensembl_gene_id", "mirbase_transcript_name", "mirbase_id",
>                                 "mirbase_accession", "external_gene_id"),
>                  filters = "ensembl_gene_id", values = feature_ids, mart = mart)
>
> # Collapse genes that return several rows into one row per gene
> duplicated_ids <- IDs2Add[duplicated(IDs2Add[, "ensembl_gene_id"]), "ensembl_gene_id"]
> if (length(duplicated_ids) > 0) {
>         IDs2Add.duplicated <- IDs2Add[IDs2Add[, "ensembl_gene_id"] %in% duplicated_ids, ]
>         IDs2Add.duplicated <- condenseMatrixByColnames(as.matrix(IDs2Add.duplicated), "ensembl_gene_id")
>         IDs2Add <- IDs2Add[!(IDs2Add[, "ensembl_gene_id"] %in% duplicated_ids), ]
>         IDs2Add <- rbind(IDs2Add, IDs2Add.duplicated)
> }
>
> Then merge the useful information into annotatedPeak.
>
> If you have any questions, please let me know.
>
> Yours sincerely,
>
> Jianhong Ou
>
> jianhong.ou at umassmed.edu
>
>
> On Jul 27, 2012, at 9:57 AM, Zhu, Lihua (Julie) wrote:
>
>> Paolo,
>>
>> Could you please send us a few rows of miRNAs in annotatedPeaks? Thanks!
>>
>> Best regards,
>>
>> Julie
>> ________________________________________
>> From: bioconductor-bounces at r-project.org [bioconductor-bounces at r-project.org] on behalf of Paolo Kunderfranco [paolo.kunderfranco at gmail.com]
>> Sent: Friday, July 27, 2012 5:50 AM
>> To: bioconductor at r-project.org
>> Subject: [BioC] ChIPpeakAnno to find peaks nearest to miRNA
>>
>> Dear All,
>> I would like to use ChIPpeakAnno to find peaks nearest to miRNA.
>>
>> I loaded my BED file, converted it to a RangedData object, loaded the
>> mmusculus_gene_ensembl dataset through mart, and annotated my peaks.
>> That part seems OK:
>>
>> test.rangedData = BED2RangedData(test.bed)
>> mart<-useMart(biomart="ensembl",dataset="mmusculus_gene_ensembl")
>> Annotation = getAnnotation(mart, featureType="miRNA")
>> annotatedPeak = annotatePeakInBatch(test.rangedData, AnnotationData=Annotation)
>> as.data.frame(annotatedPeak)
>>
>> <factor>            <IRanges> |   <character> <character> <character>      <numeric>    <numeric>   <character>
>> MACS_peak_109 ENSMUSG00000089245        1 [54494876, 54496209] | MACS_peak_109           + ENSMUSG00000089245       54826062 54826166      upstream
>> <numeric>        <numeric>              <character>
>> -331186           329853             NearestStart
>>
>>
>> Now I would like to add the miRNA ID, as I already did when I annotated for
>> TSS, but something goes wrong. Any ideas how to solve it?
>>
>> library("org.Mm.eg.db")
>> b<- addGeneIDs(annotatedPeak,"org.Mm.eg.db",c("symbol"))
>> Error: No entrez identifier can be mapped by input data based on the
>> feature_id_type. Please consider to use correct feature_id_type,
>> orgAnn or annotatedPeak
>>
>>
>> Thanks,
>>
>> Paolo
>>
>>
>>> traceback()
>> 2: stop("No entrez identifier can be mapped by input data based on the
>> feature_id_type.\nPlease consider to use correct feature_id_type,
>> orgAnn or annotatedPeak\n",
>>       call. = FALSE)
>> 1: addGeneIDs(annotatedPeak, "org.Mm.eg.db", c("symbol"))
>>> sessionInfo()
>> R version 2.15.0 (2012-03-30)
>> Platform: i386-pc-mingw32/i386 (32-bit)
>>
>> locale:
>> [1] LC_COLLATE=Italian_Italy.1252  LC_CTYPE=Italian_Italy.1252
>> LC_MONETARY=Italian_Italy.1252 LC_NUMERIC=C
>> [5] LC_TIME=Italian_Italy.1252
>>
>> attached base packages:
>> [1] grid      stats     graphics  grDevices utils     datasets
>> methods   base
>>
>> other attached packages:
>> [1] targetscan.Mm.eg.db_0.5.0           BiocInstaller_1.4.7
>>      org.Mm.eg.db_2.7.1                  ChIPpeakAnno_2.4.0
>> [5] limma_3.12.1                        org.Hs.eg.db_2.7.1
>>      GO.db_2.7.1                         RSQLite_0.11.1
>> [9] DBI_0.2-5                           AnnotationDbi_1.18.1
>>      BSgenome.Ecoli.NCBI.20080805_1.3.17 BSgenome_1.24.0
>> [13] GenomicRanges_1.8.7                 Biostrings_2.24.1
>>      IRanges_1.14.4                      multtest_2.12.0
>> [17] Biobase_2.16.0                      biomaRt_2.12.0
>>      BiocGenerics_0.2.0                  gplots_2.11.0
>> [21] MASS_7.3-19                         KernSmooth_2.23-8
>>      caTools_1.13                        bitops_1.0-4.1
>> [25] gdata_2.11.0                        gtools_2.7.0
>>
>> loaded via a namespace (and not attached):
>> [1] RCurl_1.91-1.1   splines_2.15.0   stats4_2.15.0
>> survival_2.36-14 tools_2.15.0     XML_3.9-4.1
>>
>> _______________________________________________
>> Bioconductor mailing list
>> Bioconductor at r-project.org
>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
>
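The last step in Jianhong's reply, merging the biomaRt results back into annotatedPeak, is not spelled out in the thread. Below is a minimal sketch of one way it could be done with base R's merge(). It is an assumption, not the method the authors necessarily had in mind. The two data frames are toy stand-ins: `peaks` plays the role of as.data.frame(annotatedPeak) and `IDs2Add` the role of the getBM() result. The second peak, the second gene ID, and the mirbase_id value are hypothetical placeholders, not real annotation.

```r
# Toy stand-in for as.data.frame(annotatedPeak); only the columns needed here.
# The second row is a hypothetical placeholder with no annotation available.
peaks <- data.frame(
  peak    = c("MACS_peak_109", "MACS_peak_200"),
  feature = c("ENSMUSG00000089245", "ENSMUSG_PLACEHOLDER"),
  stringsAsFactors = FALSE
)

# Toy stand-in for the IDs2Add table built from getBM(); the value is a placeholder.
IDs2Add <- data.frame(
  ensembl_gene_id = "ENSMUSG00000089245",
  mirbase_id      = "placeholder-mirbase-id",
  stringsAsFactors = FALSE
)

# Left join on the Ensembl gene ID: every peak is kept, and peaks whose
# feature has no row in IDs2Add get NA in the added columns.
merged <- merge(peaks, IDs2Add,
                by.x = "feature", by.y = "ensembl_gene_id",
                all.x = TRUE)
print(merged)
```

Using all.x = TRUE keeps peaks for which Ensembl returned nothing, so missing annotation shows up as NA instead of silently dropping rows.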
