[BioC] easyRNASeq with Ensembl GRCh37 help

Nicolas Delhomme delhomme at embl.de
Tue Sep 17 11:05:33 CEST 2013


Hej Aki Hoji!

You can indeed ignore the warnings. The error is this:

> The number of conditions: 0 did not correspond to the number of samples: 1

To use the DESeq output, you need to specify the conditions; see the ?easyRNASeq help page and the easyRNASeq and DESeq vignettes (e.g. vignette("easyRNASeq")) for more details on the arguments and how to use DESeq. Even if you provide a condition, easyRNASeq is bound to fail again, as DESeq can't work with a single sample.
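
As a minimal sketch of what specifying the conditions could look like: the conditions argument takes one descriptor per sample, named by the corresponding BAM file names. The second to fourth file names and the "A"/"B" labels below are hypothetical, added only because DESeq needs more than one sample; the other arguments are copied from the call in the quoted message. The easyRNASeq call itself only runs if the package and the input files are actually present.

```r
# BAM files to count; only the first name comes from the original call,
# the other three are hypothetical placeholders.
bam_files <- c("4673Bsorted.bam", "sample2.bam", "sample3.bam", "sample4.bam")

# One condition label per sample (labels are made up for illustration),
# named by file name so that each sample can be matched to its condition.
conditions <- c("A", "A", "B", "B")
names(conditions) <- bam_files

# Sanity check mirroring the error message: one condition per sample.
stopifnot(length(conditions) == length(bam_files))

# Only attempt the real call when the package and input files are available.
inputs_present <- all(file.exists(c(bam_files, "Ensemble.gtf")))
if (inputs_present && requireNamespace("easyRNASeq", quietly = TRUE)) {
  library(easyRNASeq)
  testcount <- easyRNASeq(filesDirectory = getwd(),
                          organism = "Hsapiens",
                          chr.sizes = "auto",
                          readLength = 100L,
                          annotationMethod = "gtf",
                          annotationFile = "Ensemble.gtf",
                          count = "exons",
                          outputFormat = "DESeq",
                          filenames = bam_files,
                          conditions = conditions)
}
```

Naming the vector by file name keeps the sample-to-condition mapping explicit, so the order in which the files are listed does not matter.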

Finally, note that easyRNASeq currently returns only a DESeq output and not a DESeq2 one (i.e. a CountDataSet and not a SummarizedExperiment). DESeq2 support is planned for the next release, due in early October.
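
Until then, one possible route to DESeq2 is to build its data set from a plain count table. The sketch below is an assumption, not an easyRNASeq feature: it uses a toy matrix standing in for a real count table (such as what outputFormat="matrix" is expected to return), with hypothetical feature names, file names and condition labels, and hands it to DESeq2's DESeqDataSetFromMatrix only if DESeq2 is installed.

```r
# Toy count table standing in for real exon counts:
# rows are features, columns are samples (all names are hypothetical).
toy_counts <- matrix(c(10L, 0L, 25L, 3L, 12L, 1L, 30L, 5L),
                     nrow = 2,
                     dimnames = list(c("exon1", "exon2"),
                                     c("a.bam", "b.bam", "c.bam", "d.bam")))

# Sample annotation: one row per column of the count table, in the same order.
sample_info <- data.frame(condition = factor(c("A", "A", "B", "B")),
                          row.names = colnames(toy_counts))

# Build the DESeq2 object only when DESeq2 is available.
if (requireNamespace("DESeq2", quietly = TRUE)) {
  dds <- DESeq2::DESeqDataSetFromMatrix(countData = toy_counts,
                                        colData = sample_info,
                                        design = ~ condition)
}
```

The row names of the sample annotation have to line up with the column names of the count table, which is why they are derived from it directly.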

Best,

Nico

---------------------------------------------------------------
Nicolas Delhomme

Genome Biology Computational Support

European Molecular Biology Laboratory

Tel: +49 6221 387 8310
Email: nicolas.delhomme at embl.de
Meyerhofstrasse 1 - Postfach 10.2209
69102 Heidelberg, Germany
---------------------------------------------------------------





On 16 Sep 2013, at 20:17, Aki Hoji wrote:

> Hi, 
> 
> I've been trying to generate an output file for DESeq2 with easyRNASeq. The input file is a BAM file generated by Tophat2/Bowtie2 with Ensembl GRCh37.72, which was part of Illumina's iGenomes package. I followed the overview and examples of easyRNASeq from the BioC mailing list and ran the following:
> 
> testcount<-easyRNASeq(filesDirectory=getwd(), organism="Hsapiens", chr.sizes="auto", readLength=100L, annotationMethod="gtf", annotationFile="Ensemble.gtf", count="exons", outputFormat="DESeq", filenames="4673Bsorted.bam")
> 
> Then I got this error:
> 
> Checking arguments... 
> Fetching annotations... 
> Read 2280612 records
> Error in easyRNASeq(filesDirectory = getwd(), organism = "Hsapiens", chr.sizes = "auto",  : 
>  The number of conditions: 0 did not correspond to the number of samples: 1
> In addition: Warning messages:
> 1: In easyRNASeq(filesDirectory = getwd(), organism = "Hsapiens", chr.sizes = "auto",  :
>  You enforce UCSC chromosome conventions, however the provided chromosome size list is not compliant. Correcting it.
> 2: In .Method(..., deparse.level = deparse.level) :
>  number of columns of result is not a multiple of vector length (arg 1)
> 3: In easyRNASeq(filesDirectory = getwd(), organism = "Hsapiens", chr.sizes = "auto",  :
>  There are 966272 features/exons defined in your annotation that overlap! This implies that some reads will be counted more than once! Is that really what you want?
> 4: In easyRNASeq(filesDirectory = getwd(), organism = "Hsapiens", chr.sizes = "auto",  :
>  You enforce UCSC chromosome conventions, however the provided annotation is not compliant. Correcting it.
> 
> As far as I can tell, I am not really enforcing the UCSC chromosome convention, and chr.sizes could be set to "auto" since a BAM file is used. I am stuck at this point, and any help or pointers would be much appreciated.
> 
> Thanks. 
> 
> AH
> 
>> sessionInfo()
> R version 3.0.1 (2013-05-16)
> Platform: x86_64-apple-darwin10.8.0 (64-bit)
> 
> locale:
> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
> 
> attached base packages:
> [1] parallel  stats     graphics  grDevices utils     datasets  methods   base     
> 
> other attached packages:
> [1] easyRNASeq_1.6.0       ShortRead_1.18.0       latticeExtra_0.6-26  
> [4] RColorBrewer_1.0-5     Rsamtools_1.12.4       DESeq_1.12.1         
> [7] lattice_0.20-23        locfit_1.5-9.1         BSgenome_1.28.0      
> [10] GenomicRanges_1.12.5   Biostrings_2.28.0      IRanges_1.18.3        
> [13] edgeR_3.2.4            limma_3.16.7           biomaRt_2.16.0        
> [16] Biobase_2.20.1         genomeIntervals_1.16.0 BiocGenerics_0.6.0    
> [19] intervals_0.14.0       BiocInstaller_1.10.3  
> 
> loaded via a namespace (and not attached):
> [1] annotate_1.38.0      AnnotationDbi_1.22.6 bitops_1.0-6        
> [4] DBI_0.2-7            genefilter_1.42.0    geneplotter_1.38.0  
> [7] grid_3.0.1           hwriter_1.3          RCurl_1.95-4.1      
> [10] RSQLite_0.11.4       splines_3.0.1        stats4_3.0.1        
> [13] survival_2.37-4      tools_3.0.1          XML_3.95-0.2        
> [16] xtable_1.7-1         zlibbioc_1.6.0 
> 
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor


