[BioC] RNAseq expression threshold using DESeq2 normalised counts

QAMRA Aditi (GIS) qamraa99 at gis.a-star.edu.sg
Sat Jul 19 18:41:36 CEST 2014


Sorry, I posted the wrong link when referring to the paper that uses the zFPKM transformation. Here is the correct one:

http://www.biomedcentral.com/1471-2164/14/778
________________________________________
From: Aditi [guest] [guest at bioconductor.org]
Sent: Saturday, July 19, 2014 11:51 PM
To: bioconductor at r-project.org; QAMRA Aditi (GIS)
Cc: DESeq2 Maintainer
Subject: RNAseq expression threshold using DESeq2 normalised counts

Hi Mike,

This question is similar to one posted on Biostars a few months ago (https://www.biostars.org/p/94680/), which you came across.

I want to determine whether a gene is expressed or not using RNA-seq data. There has been quite a lot of discussion on this, with papers defining a range of FPKM values (generally generated using Cufflinks) as a cutoff for calling a gene expressed.

Can we instead use normalised counts from DESeq2, look at their distribution, and determine a suitable cutoff? Better still, if one has negative controls such as spike-ins in the RNA protocol, could those be used to set the cutoff? (Unfortunately, I don't have spike-in control data.)
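
To make the normalised-count idea concrete, here is a minimal base-R sketch. Everything in it is an assumption for illustration: the count matrix is simulated, `estimate_size_factors` is a hypothetical helper that mimics the median-of-ratios size factors DESeq2 computes (with DESeq2 itself one would call `estimateSizeFactors(dds)` and then `counts(dds, normalized = TRUE)`), and the cutoff of 10 is an arbitrary placeholder, not a recommended value.

```r
# Toy example in base R only; the counts are simulated, not real data.
set.seed(1)
n_genes <- 1000
gene_mu <- rep(c(1, 200), each = n_genes / 2)  # half near-silent, half expressed
depth   <- c(0.5, 1, 1.5, 2)                   # made-up sequencing depth per sample
counts  <- matrix(
  rnbinom(n_genes * 4,
          mu   = rep(gene_mu, times = 4) * rep(depth, each = n_genes),
          size = 5),
  nrow = n_genes
)

# Hypothetical helper: median-of-ratios size factors.
# Genes with a zero count in any sample are left out of the reference.
estimate_size_factors <- function(counts) {
  log_counts    <- log(counts)
  log_geo_means <- rowMeans(log_counts)        # -Inf if any count is zero
  usable        <- is.finite(log_geo_means)
  apply(log_counts, 2, function(sample_col) {
    exp(median(sample_col[usable] - log_geo_means[usable]))
  })
}

size_factors <- estimate_size_factors(counts)
norm_counts  <- sweep(counts, 2, size_factors, "/")  # divide each column by its size factor

# Summarise each gene, look at the distribution, then apply a cutoff.
mean_norm <- rowMeans(norm_counts)
print(summary(log2(mean_norm + 1)))

cutoff    <- 10                 # placeholder; would be chosen from the distribution
expressed <- mean_norm >= cutoff
print(table(expressed))
```

In practice the cutoff would come from inspecting the distribution (for example with `hist(log2(mean_norm + 1))`) rather than being fixed in advance.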

Or do you think one should extract FPKM values and then perhaps apply a zFPKM transformation (http://www.ploscompbiol.org/article/info%3Adoi%2F10.1371%2Fjournal.pcbi.1000598), as most people are suggesting?
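
For comparison, a rough base-R sketch of a zFPKM-style transform follows. It rests on assumptions: the FPKM values are simulated, `zfpkm_transform` is a hypothetical helper name, and the recipe (centre on the mode of the log2(FPKM) density, estimate the spread from the right half of the distribution only) and the cutoff of -3 reflect my understanding of the published approach and should be checked against the paper.

```r
# Toy FPKM values: a low "noise" component plus an expressed bulk (simulated).
set.seed(2)
fpkm <- c(2^rnorm(300, mean = -4, sd = 2),
          2^rnorm(700, mean = 3,  sd = 1.5))

# Hypothetical helper: z-score log2(FPKM) against a Gaussian fitted to the
# right half of the distribution.
zfpkm_transform <- function(fpkm) {
  log2_fpkm <- log2(fpkm)                    # zero FPKM becomes -Inf
  finite    <- is.finite(log2_fpkm)
  dens      <- density(log2_fpkm[finite])    # kernel density estimate
  mu        <- dens$x[which.max(dens$y)]     # mode of the density
  upper_mean <- mean(log2_fpkm[finite & log2_fpkm > mu])
  # For a half-normal, E[X - mu | X > mu] = sigma * sqrt(2 / pi)
  sigma     <- (upper_mean - mu) * sqrt(pi / 2)
  (log2_fpkm - mu) / sigma
}

z <- zfpkm_transform(fpkm)
expressed_z <- z >= -3          # assumed cutoff; verify against the paper
print(table(expressed_z))
```

The appeal of this approach is that the cutoff is expressed relative to the bulk of the distribution in each sample, so it does not depend on an absolute FPKM value.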

I look forward to your opinion and suggestions,

Thanks!
Aditi





 -- output of sessionInfo():

R version 3.1.0 (2014-04-10)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods
[8] base

other attached packages:
 [1] DESeq2_1.4.5            RcppArmadillo_0.4.300.0 Rcpp_0.11.1
 [4] EDASeq_1.10.0           aroma.light_2.0.0       matrixStats_0.8.14
 [7] ShortRead_1.22.0        GenomicAlignments_1.0.1 BSgenome_1.32.0
[10] Rsamtools_1.16.0        GenomicRanges_1.16.3    GenomeInfoDb_1.0.2
[13] Biostrings_2.32.0       XVector_0.4.0           IRanges_1.22.7
[16] BiocParallel_0.6.1      Biobase_2.24.0          BiocGenerics_0.10.0

loaded via a namespace (and not attached):
 [1] annotate_1.42.0      AnnotationDbi_1.26.0 BatchJobs_1.2
 [4] BBmisc_1.6           bitops_1.0-6         brew_1.0-6
 [7] codetools_0.2-8      DBI_0.2-7            DESeq_1.16.0
[10] digest_0.6.4         fail_1.2             foreach_1.4.2
[13] genefilter_1.46.1    geneplotter_1.42.0   grid_3.1.0
[16] hwriter_1.3          iterators_1.0.7      lattice_0.20-29
[19] latticeExtra_0.6-26  locfit_1.5-9.1       plyr_1.8.1
[22] RColorBrewer_1.0-5   R.methodsS3_1.6.1    R.oo_1.18.0
[25] RSQLite_0.11.4       sendmailR_1.1-2      splines_3.1.0
[28] stats4_3.1.0         stringr_0.6.2        survival_2.37-7
[31] tools_3.1.0          XML_3.98-1.1         xtable_1.7-3
[34] zlibbioc_1.10.0

--
Sent via the guest posting facility at bioconductor.org.
