[BioC] Error of GTF Annotation in easyRNASeq

Nicolas Delhomme delhomme at embl.de
Fri Sep 21 10:26:27 CEST 2012


Moreover, to make sure that this is not a package conflict, can you please NOT load library(RnaSeqTutorial)? You do not need it to run easyRNASeq, so your script should read:

library(easyRNASeq)
library(BSgenome.Mmusculus.UCSC.mm9)

setwd("/home/gao/RNA")

## the "." is your current directory.
count.table <- easyRNASeq(".",
                          pattern=".sorted.bam$",
                          organism="MMusculus",
                          annotationMethod="gtf",
                          annotationFile="mm9gene.gtf",
                          count="genes",
                          summarization="geneModels",
                          normalize=TRUE
                          )
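
Before re-running, it may help to confirm that the BAM files and the GTF file are actually found from the working directory, and to capture the version details to include in your reply. This is a minimal sketch in base R only: check_inputs is a hypothetical helper, not part of easyRNASeq, and the escaped pattern "\\.sorted\\.bam$" is a stricter variant of the pattern above that treats the dots literally.

```r
## Hypothetical helper, not an easyRNASeq function: list the BAM files
## matching 'pattern' in 'dir' and check that the annotation file exists there.
check_inputs <- function(dir, pattern, annotation) {
  list(
    bam.files  = list.files(dir, pattern = pattern, full.names = TRUE),
    gtf.exists = file.exists(file.path(dir, annotation))
  )
}

## Run from the directory set with setwd(); the file names are the ones used above.
str(check_inputs(".", "\\.sorted\\.bam$", "mm9gene.gtf"))

## R and package versions of the current session.
print(sessionInfo())
```

If bam.files comes back empty or gtf.exists is FALSE, the paths are the first thing to look at, before easyRNASeq itself.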


Cheers,

Nico

---------------------------------------------------------------
Nicolas Delhomme

Genome Biology Computational Support

European Molecular Biology Laboratory

Tel: +49 6221 387 8310
Email: nicolas.delhomme at embl.de
Meyerhofstrasse 1 - Postfach 10.2209
69102 Heidelberg, Germany
---------------------------------------------------------------





On Sep 21, 2012, at 10:19 AM, Nicolas Delhomme wrote:

> Dear Dadi,
> 
> I will need a little more information from you. In addition, it's best if you post such emails to the Bioconductor mailing list (which I've Cc'ed, so please "reply to all" when you answer). See here for subscribing: http://www.bioconductor.org/help/mailing-list/. What I need from you first is what is described on this page: http://www.bioconductor.org/help/mailing-list/posting-guide/, mainly under the sections on preparing and composing. In essence, I need to know which versions of R and of the Bioconductor packages you are using.
> 
> Then, placing your folder in the installation directory of an existing package is not the safest option. You might a) disrupt that package's functionality or b) lose your data if that package gets updated. You should rather move your RNA folder to your home directory and use that directory, e.g. /home/gao/RNA, instead. Using the setwd command, you can make that your current working directory.
> 
> So the following two blocks give the same result:
> 
> setwd("/home/gao/RNA")
> 
> ## the "." is your current directory.
> count.table <- easyRNASeq(".",
>                           pattern=".sorted.bam$",
>                           organism="MMusculus",
>                           annotationMethod="gtf",
>                           annotationFile="mm9gene.gtf",
>                           count="genes",
>                           summarization="geneModels",
>                           normalize=TRUE
>                           )
> 
> Or:
> 
> count.table <- easyRNASeq("/home/gao/RNA",
>                           pattern=".sorted.bam$",
>                           organism="MMusculus",
>                           annotationMethod="gtf",
>                           annotationFile="/home/gao/RNA/mm9gene.gtf",
>                           count="genes",
>                           summarization="geneModels",
>                           normalize=TRUE
>                           )
> 
> Now, for the error, can you please tell me more about which aligner you used for your data, whether the data is paired-end or not, and finally whether the reads have been dynamically trimmed (i.e. whether reads of variable length are expected) or not?
> 
> What actually bothers me in your error is that it mentions: 
> 
> easyRNASeq(system.file("miRNA", package = "RnaSeqTutorial"), 
> 
> instead of 
> 
> easyRNASeq(system.file("RNA", package="RnaSeqTutorial"),
> 
> i.e. miRNA instead of RNA. So, to make sure that the error is reproducible, can you move your RNA folder to a different directory and re-run the command as above? I don't expect this to solve the error, but at least we'd have a "cleaner" setup for reproducing it.
> 
> Best,
> 
> Nico
> 
> On Sep 21, 2012, at 2:41 AM, Dadi Gao wrote:
> 
>> Dear Dr. Delhomme,
>> 
>> I'm currently studying gene expression patterns from deep sequencing data of mouse blood cells using easyRNASeq.
>> I created a folder called "RNA" under the installation path of the R package RnaSeqTutorial.
>> Within this folder, I put 3 RNA-seq data files called "N1.sorted.bam", "N2.sorted.bam" and "N3.sorted.bam", with their bam index files.
>> It also contains a GTF file for mouse gene annotation downloaded from UCSC, called "mm9gene.gtf".
>> 
>> I'm using the following code to normalize the gene expression:
>> 
>> library(easyRNASeq)
>> library(RnaSeqTutorial)
>> library(BSgenome.Mmusculus.UCSC.mm9)
>> 
>> count.table <- easyRNASeq(system.file("RNA", package="RnaSeqTutorial"),
>>                           pattern=".sorted.bam$",
>>                           organism="MMusculus",
>>                           annotationMethod="gtf",
>>                           annotationFile=system.file("RNA", "mm9gene.gtf", package="RnaSeqTutorial"),
>>                           count="genes",
>>                           summarization="geneModels",
>>                           normalize=TRUE
>>                           )
>> 
>> But this fails with the following error:
>> 
>> Checking arguments... 
>> Fetching annotations... 
>> Read 962651 records
>> Warning message:
>> In easyRNASeq(system.file("miRNA", package = "RnaSeqTutorial"),  :
>> You enforce UCSC chromosome conventions, however the provided chromosome size list is not compliant. Correcting it.
>> Error in all.annotation[all.annotation$type %in% annotation.type, ] : 
>> error in evaluating the argument 'i' in selecting a method for function '[': Error in all.annotation$type %in% annotation.type : 
>> error in evaluating the argument 'x' in selecting a method for function '%in%': Error in function (classes, fdef, mtable)  : 
>> unable to find an inherited method for function "annotation", for signature "Genome_intervals_stranded"
>> 
>> Did I do something wrong?
>> 
>> Sincerely yours,
>> Dadi Gao
>> 
>> Bioinformatics Group
>> Centenary Institute
>> Building 93, Royal Prince Alfred Hospital
>> Missenden Rd, Camperdown, NSW 2050
>> Australia
>> 
> 
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor


