[BioC] easyRNAseq question

Nicolas Delhomme delhomme at embl.de
Sat May 26 12:51:32 CEST 2012


Dear Nirmala,

The BestExon method works similarly to your workflow: per gene, it returns the count for the exon with the highest coverage.
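
To make the selection rule concrete, here is a minimal sketch in Python. It is not the easyRNASeq implementation: the function name, the input layout, and the toy numbers are all made up for illustration, and it uses the raw read count per exon as a stand-in for "coverage".

```python
def best_exon_counts(exon_counts):
    """Keep, for each gene, the single exon with the highest count.

    exon_counts: iterable of (gene_id, exon_id, count) tuples
                 (hypothetical layout, e.g. parsed from per-exon count output).
    Returns a dict mapping gene_id -> (exon_id, count).
    On ties, the exon seen first is kept.
    """
    best = {}
    for gene_id, exon_id, count in exon_counts:
        if gene_id not in best or count > best[gene_id][1]:
            best[gene_id] = (exon_id, count)
    return best


# Toy values, invented purely for illustration.
toy = [
    ("geneA", "exon1", 12),
    ("geneA", "exon2", 40),
    ("geneA", "exon3", 25),
    ("geneB", "exon1", 7),
]
result = best_exon_counts(toy)
print(result)
```

With these toy values, geneA is represented by exon2 only; the reads on exon1 and exon3 are discarded.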

There are several reasons why I want to deprecate that function, the main two being:

1) It agrees less well with microarray expression values than the geneModels approach does.
2) RNA-Seq has a clear sequencing bias: the coverage of an exon depends on many factors, both biological and technical, e.g. GC content, the RNA fragmentation protocol, etc. As a result, coverage varies both within an exon and across exons. Selecting a single exon introduces additional uncertainty that is otherwise evened out across the gene's exons. The bias itself should not affect a direct comparison between samples, as it is highly reproducible from one sample to the next.

So, as a best exon approach offers no advantage over a gene model approach, I'd advise you to choose the latter.
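
As a rough illustration of the difference between the two summaries, the sketch below computes both a per-gene total over all exons and the single best-exon count from the same input. This is only a simplified stand-in: it assumes the exons of a gene are already non-overlapping, it does not reproduce how easyRNASeq builds its gene models, and the function name and toy numbers are invented.

```python
def summarize_per_gene(exon_counts):
    """Return {gene_id: (sum_over_exons, best_exon_count)}.

    exon_counts: iterable of (gene_id, exon_id, count) tuples
                 (hypothetical layout; exons assumed non-overlapping).
    """
    per_gene = {}
    for gene_id, _exon_id, count in exon_counts:
        total, best = per_gene.get(gene_id, (0, 0))
        per_gene[gene_id] = (total + count, max(best, count))
    return per_gene


# Toy values, invented purely for illustration.
toy_counts = [
    ("geneA", "exon1", 12),
    ("geneA", "exon2", 40),
    ("geneA", "exon3", 25),
    ("geneB", "exon1", 7),
]
summary = summarize_per_gene(toy_counts)
for gene_id, (total, best) in sorted(summary.items()):
    print(gene_id, "sum over exons:", total, "best exon:", best)
```

The summed value uses every read assigned to the gene, whereas the best-exon value depends entirely on which exon happens to be covered best.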

Best,

Nico

---------------------------------------------------------------
Nicolas Delhomme

Genome Biology Computational Support

European Molecular Biology Laboratory

Tel: +49 6221 387 8310
Email: nicolas.delhomme at embl.de
Meyerhofstrasse 1 - Postfach 10.2209
69102 Heidelberg, Germany
---------------------------------------------------------------





On May 25, 2012, at 11:07 PM, Akula, Nirmala (NIH/NIMH) [C] wrote:

> Dear Nico,
> 
> Thank you very much for your response. I did read the sections that you mentioned but would like to know more details about the BestExon method. Here is what I currently have:
> 
> 1. Map reads with TopHat
> 2. Create a BED file from the BAM file (each read is represented by a single base, its start position, so that no read can fall on two different exons)
> 3. Use coverageBed to get the read counts for each exon
> 4. For gene-level differential expression: Take only one exon/gene that has the maximum number of reads
> 5. Analyze the reads in DESeq
> 
> I would like to compare the above method to the BestExon method in easyRNAseq.
> 
> 
> Best,
> Nirmala
> 
> 
> 
> 
> -----Original Message-----
> From: Nicolas Delhomme [mailto:delhomme at embl.de] 
> Sent: Friday, May 25, 2012 3:03 AM
> To: Akula, Nirmala (NIH/NIMH) [C]
> Cc: bioconductor at r-project.org
> Subject: Re: easyRNAseq question
> 
> Dear Nirmala,
> 
> I've Cc'ed your email to the Bioconductor mailing list, as it might help other users.
> 
> Yes, there is currently a manuscript in review. 
> 
> As I'm not sure where you got your information from about the GeneModel summarization, I would direct you to read the new vignette of the development package: http://bioconductor.org/packages/2.11/bioc/html/easyRNASeq.html, page 10 and section 4.6. If that's what you've done or if the information there is not sufficient, let me know and I'll detail it more. By the way, the BestExon summarization did not really prove useful on the datasets I've been working on. I'm thinking about deprecating it.
> 
> Best,
> 
> Nico
> 
> On May 24, 2012, at 11:46 PM, Akula, Nirmala (NIH/NIMH) [C] wrote:
> 
>> Dear Nicolas,
>> 
>> Is there a publication that is available for easyRNAseq software? Also, you have mentioned that the transcripts are collapsed to genes by BestExon method and GeneModel summarization. Could you give details on these two methods?
>> 
>> Thank you very much.
>> 
>> Best Regards,
>> Nirmala
>> 
> 


