[BioC] easyRNASeq error

Nicolas Delhomme delhomme at embl.de
Sat Mar 3 14:58:24 CET 2012


Hi Wade,

On 2 Mar 2012, at 22:38, Davis, Wade wrote:

> Hi Nico,
> I tried your first solution to the second problem with these results:
> 
>> #rerun you command with the chr.map argument and a "custom" organism argument
>> rnaSeq2 <- easyRNASeq(filesDirectory=extdataDir,
> +                       organism="custom",
> +                       chr.sizes=chr.sizes,
> +                       chr.map=chr.map,
> +                       filter=myfilt,
> +                       readLength=50L,
> +                       annotationMethod="biomaRt",
> +                        format="aln",
> +                       count="exons",
> +                       pattern="^1-Feb_ATCACG_L003_R1_001_export.txt.gz$",
> +                       outputFormat="RNAseq"
> +                       )
> Checking arguments... 
> Fetching annotations... 
> Error in .getBmRange(organismName(obj), ignoreWarnings = ignoreWarnings,  : 
>  The organism custom is not supported by the ensembl biomaRt.
> 

Right, I did not think of that. You can easily fetch the annotation with easyRNASeq. Now that I have the data, I'll give it a try and post the code afterwards.


> 
> I could try your second solution, but I don't have any local annotation file. I could try to get one. From what I could find on the internet, I think the sequencing core used a refflat file (?), which I think is the standard format of ELAND annotation file. I can contact the sequencing core and try to get what they used. But in general, I like the flexibility of BiomaRt.

As I said it can be fetched by easyRNASeq.

> 
> A third option is to use samtools /perl and edit the header / chromosome names and export as BAM. In the future, I think I am just going to ask the core for BAM files with "more common" chromosome naming.
> 

Yes, having less complexity is always best. I had the same "issue" with our sequencing facility and it turned out to be easy for them to edit the reference file they were using (pure fasta) for ELAND.

Cheers,

Nico


> 
> Thanks,
> Wade
> 
> 
> 
> 
> -----Original Message-----
> From: Nicolas Delhomme [mailto:delhomme at embl.de] 
> Sent: Friday, March 02, 2012 8:37 AM
> To: Davis, Wade
> Cc: Bioconductor mailing list
> Subject: Re: [BioC] easyRNASeq error
> 
> Hi Wade,
> 
> Great! Thanks for the feedback!
> 
> Let me know for the other issue once you've got the time to try.
> 
> Cheers,
> 
> Nico
> 
> ---------------------------------------------------------------
> Nicolas Delhomme
> 
> Genome Biology Computational Support
> 
> European Molecular Biology Laboratory
> 
> Tel: +49 6221 387 8310
> Email: nicolas.delhomme at embl.de
> Meyerhofstrasse 1 - Postfach 10.2209
> 69102 Heidelberg, Germany
> ---------------------------------------------------------------
> 
> 
> 
> 
> 
> On 2 Mar 2012, at 15:33, Davis, Wade wrote:
> 
>> Hi Nico,
>> Everything worked fine on your examples. Please see below for output.
>> 
>> Thanks,
>> Wade
>> 
>> 
>> R Under development (unstable) (2012-02-28 r58513) Copyright (C) 2012 
>> The R Foundation for Statistical Computing ISBN 3-900051-07-0
>> Platform: x86_64-pc-mingw32/x64 (64-bit)
>> 
>> R is free software and comes with ABSOLUTELY NO WARRANTY.
>> You are welcome to redistribute it under certain conditions.
>> Type 'license()' or 'licence()' for distribution details.
>> 
>> Natural language support but running in an English locale
>> 
>> R is a collaborative project with many contributors.
>> Type 'contributors()' for more information and 'citation()' on how to 
>> cite R or R packages in publications.
>> 
>> Type 'demo()' for some demos, 'help()' for on-line help, or 
>> 'help.start()' for an HTML browser interface to help.
>> Type 'q()' to quit R.
>> 
>>> utils:::menuInstallLocal()
>>> utils:::menuInstallLocal()
>>> libary(easyRNASeq)
>> Error: could not find function "libary"
>>> library(easyRNASeq)
>> Loading required package: parallel
>> Loading required package: genomeIntervals Loading required package: 
>> intervals Loading required package: Biobase Loading required package: 
>> BiocGenerics
>> 
>> Attaching package: 'BiocGenerics'
>> 
>> The following object(s) are masked from 'package:stats':
>> 
>>   xtabs
>> 
>> The following object(s) are masked from 'package:base':
>> 
>>   anyDuplicated, cbind, colnames, duplicated, eval, Filter, Find,
>>   intersect, lapply, Map, mapply, order, paste, pmax, pmax.int, pmin,
>>   pmin.int, Position, rbind, Reduce, rep.int, rownames, sapply,
>>   setdiff, table, tapply, union, unique
>> 
>> Welcome to Bioconductor
>> 
>>   Vignettes contain introductory material; view with
>>   'browseVignettes()'. To cite Bioconductor, see
>>   'citation("Biobase")', and for packages 'citation("pkgname")'.
>> 
>> Loading required package: biomaRt
>> Loading required package: edgeR
>> Loading required package: limma
>> Loading required package: Biostrings
>> Loading required package: IRanges
>> 
>> Attaching package: 'IRanges'
>> 
>> The following object(s) are masked from 'package:intervals':
>> 
>>   reduce
>> 
>> 
>> Attaching package: 'Biostrings'
>> 
>> The following object(s) are masked from 'package:intervals':
>> 
>>   type
>> 
>> Loading required package: BSgenome
>> Loading required package: GenomicRanges
>> 
>> Attaching package: 'GenomicRanges'
>> 
>> The following object(s) are masked from 'package:genomeIntervals':
>> 
>>   strand, strand<-
>> 
>> Loading required package: DESeq
>> Loading required package: locfit
>> Loading required package: akima
>> Loading required package: lattice
>> locfit 1.5-6     2010-01-20 
>> Loading required package: Rsamtools
>> Loading required package: ShortRead
>> Loading required package: latticeExtra Loading required package: 
>> RColorBrewer
>> 
>> Attaching package: 'ShortRead'
>> 
>> The following object(s) are masked from 'package:locfit':
>> 
>>   left, right
>> 
>> Warning messages:
>> 1: replacing previous import 'coerce' when loading 'intervals' 
>> 2: replacing previous import 'initialize' when loading 'intervals' 
>>> sessionInfo()
>> R Under development (unstable) (2012-02-28 r58513)
>> Platform: x86_64-pc-mingw32/x64 (64-bit)
>> 
>> locale:
>> [1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252   
>> [3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C                          
>> [5] LC_TIME=English_United States.1252    
>> 
>> attached base packages:
>> [1] parallel  stats     graphics  grDevices utils     datasets  methods   base     
>> 
>> other attached packages:
>> [1] easyRNASeq_1.1.8       ShortRead_1.13.13      latticeExtra_0.6-20    RColorBrewer_1.0-5    
>> [5] Rsamtools_1.7.37       DESeq_1.7.6            locfit_1.5-6           lattice_0.20-0        
>> [9] akima_0.5-7            BSgenome_1.23.2        GenomicRanges_1.7.28   Biostrings_2.23.6     
>> [13] IRanges_1.13.26        edgeR_2.5.9            limma_3.11.15          biomaRt_2.11.1        
>> [17] Biobase_2.15.3         BiocGenerics_0.1.7     genomeIntervals_1.11.0 intervals_0.13.3      
>> 
>> loaded via a namespace (and not attached):
>> [1] annotate_1.33.2       AnnotationDbi_1.17.22 bitops_1.0-4.1        DBI_0.2-5             genefilter_1.37.0    
>> [6] geneplotter_1.33.1    grid_2.15.0           hwriter_1.3           RCurl_1.91-1.1        RSQLite_0.11.1       
>> [11] splines_2.15.0        survival_2.36-12      tools_2.15.0          XML_3.9-4.1           xtable_1.7-0         
>> [16] zlibbioc_1.1.1       
>>> library("easyRNASeq")
>>> library("RnaSeqTutorial")
>>> library(BSgenome.Dmelanogaster.UCSC.dm3)
>>> 
>>> rnaSeq <- easyRNASeq(system.file(
>> +                                   "extdata",
>> +                                   package="RnaSeqTutorial"),
>> +                       organism="Dmelanogaster",
>> +                       chr.sizes=as.list(seqlengths(Dmelanogaster)),
>> +                       readLength=36L,
>> +                       annotationMethod="rda",
>> +                       annotationFile=system.file(
>> +                         "data",
>> +                         "gAnnot.rda",
>> +                         package="RnaSeqTutorial"),
>> +                       format="bam",
>> +                       count="genes",
>> +                       summarization="geneModels",
>> +                       pattern="[A,C,T,G]{6}\\.bam$",
>> +                       outputFormat="RNAseq")
>> Checking arguments... 
>> Fetching annotations... 
>> Computing gene models... 
>> Summarizing counts... 
>> Processing ACACTG.bam
>> Processing ACTAGC.bam
>> Processing ATGGCT.bam
>> Processing TTGCGA.bam
>> Preparing output
>> Warning message:
>> In easyRNASeq(system.file("extdata", package = "RnaSeqTutorial"),  :
>> There are 2238 synthetic exons as determined from your annotation that overlap! This implies that some reads will be counted more than once! Is that really what you want?
>>> rnaSeqPar <- easyRNASeq(system.file(
>> +                                   "extdata",
>> +                                   package="RnaSeqTutorial"),
>> +                       organism="Dmelanogaster",
>> +                       chr.sizes=as.list(seqlengths(Dmelanogaster)),
>> +                       readLength=36L,
>> +                       annotationMethod="rda",
>> +                       annotationFile=system.file(
>> +                         "data",
>> +                         "gAnnot.rda",
>> +                         package="RnaSeqTutorial"),
>> +                       format="bam",
>> +                       count="genes",
>> +                       summarization="geneModels",
>> +                       pattern="[A,C,T,G]{6}\\.bam$",
>> +                       outputFormat="RNAseq",
>> + nbCore=2)
>> Checking arguments... 
>> Fetching annotations... 
>> Computing gene models... 
>> Summarizing counts... 
>> Processing ACACTG.bam
>> Processing ACTAGC.bam
>> Processing ATGGCT.bam
>> Processing TTGCGA.bam
>> Preparing output
>> Warning message:
>> In easyRNASeq(system.file("extdata", package = "RnaSeqTutorial"),  :
>> There are 2238 synthetic exons as determined from your annotation that overlap! This implies that some reads will be counted more than once! Is that really what you want?
>>> 
> 



More information about the Bioconductor mailing list