[BioC] DE analysis with reference transcriptome

James W. MacDonald jmacdon at uw.edu
Fri May 23 15:15:16 CEST 2014


Hi Nicole,

Trinity has scripts that will generate counts from the RSEM results that 
you can then use as inputs for either edgeR or DESeq(2).

http://trinityrnaseq.sourceforge.net/analysis/diff_expression_analysis.html
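
As a rough illustration of what such a counts-generation step does, here is a minimal Python sketch. It is not Trinity's script; it only assumes that each sample has a tab-separated RSEM results table with a `gene_id` column and an `expected_count` column. The sample names, feature IDs, and values are made up, and rounding the expected counts to integers is an assumption about how the matrix would be prepared for a count-based tool, not something stated in this thread.

```python
import csv
import io


def read_expected_counts(handle, id_column="gene_id"):
    """Return {feature_id: expected_count} from one RSEM-style results table."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {row[id_column]: float(row["expected_count"]) for row in reader}


def build_count_matrix(samples):
    """Merge per-sample tables into one matrix of rounded counts.

    samples maps a sample name to an open text handle.
    Returns (sample_names, {feature_id: [count per sample]}).
    Features missing from a sample are given a count of 0.
    """
    names = sorted(samples)
    per_sample = {name: read_expected_counts(samples[name]) for name in names}
    features = sorted(set().union(*(d.keys() for d in per_sample.values())))
    matrix = {
        feature: [int(round(per_sample[name].get(feature, 0.0))) for name in names]
        for feature in features
    }
    return names, matrix


# Hypothetical toy tables standing in for two per-sample results files.
CONTROL_1 = (
    "gene_id\tlength\texpected_count\n"
    "comp1_c0\t1200\t10.37\n"
    "comp2_c0\t800\t0.00\n"
)
TREATED_1 = (
    "gene_id\tlength\texpected_count\n"
    "comp1_c0\t1200\t25.81\n"
    "comp2_c0\t800\t3.20\n"
)

names, matrix = build_count_matrix(
    {"control_1": io.StringIO(CONTROL_1), "treated_1": io.StringIO(TREATED_1)}
)

# Write the matrix as a tab-separated table: one row per feature, one column per sample.
out = io.StringIO()
writer = csv.writer(out, delimiter="\t", lineterminator="\n")
writer.writerow(["feature"] + names)
for feature, counts in matrix.items():
    writer.writerow([feature] + counts)
print(out.getvalue())
```

A table of this shape (features in rows, samples in columns) is the kind of input edgeR or DESeq(2) would then read; in practice the Trinity scripts linked above produce it for you.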

Best,

Jim


On 5/22/2014 8:06 PM, Nicole Ertl wrote:
> Dear Bioconductor users,
>
>
>
> I'm working on a novel organism (no genome, only a reference transcriptome I prepared with Trinity) and I have to do differential gene expression analysis using RNA-Seq data produced with the Illumina TruSeq (non-directional) kit. Most of my experiments have 2 conditions (control and treatment); one experiment has 4 conditions (control and 3 treatments). I have 6 biological replicates per control/treatment.
>
>
>
> I've seen quite a few publications (BMC, PLOS ONE and others) that aligned their reads to a reference transcriptome and then used RSEM or eXpress (+ sometimes FPKM/RPKM) to produce the count table, which they then analysed with DESeq. Most don't go into any detail, so it's hard to follow what was done. I've seen the "Count-based differential expression analysis of RNA sequencing data using R and Bioconductor" publication online; it mentions that when there is no genome, a reference transcriptome can be built, reads aligned to it and counted, and then the standard pipeline for differential analysis used.
>
> The documentation for DESeq (and DESeq2) says to use raw counts, and not (rounded) normalised values or counts of covered base pairs. I had a look at the RSEM and eXpress documentation, and both seem to do some kind of estimation because of the isoforms inherent in a transcriptome? The RSEM website mentions that "popular differential expression (DE) analysis tools such as edgeR and DESeq do not take variance due to read mapping uncertainty into consideration. Because read mapping ambiguity is prevalent among isoforms and de novo assembled transcripts, these tools are not ideal for DE detection in such conditions." They suggest using EBSeq, but I found at most a handful of papers on Google Scholar that actually used RSEM-EBSeq.
>
> I'm new to all this and it's getting quite confusing. Could you please help? What would I have to do with my data and/or my reference transcriptome to be able to use e.g. the RSEM - DESeq (maybe DESeq2) pipeline? Is there a pipeline that you could recommend in my situation?
> Thank you so much for your time.
> Kind Regards,
> Nicole
> University of the Sunshine Coast, Locked Bag 4, Maroochydore DC, Queensland, 4558 Australia.

-- 
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099


