[BioC] Getting DESeq results on just one gene

Simon Anders anders at embl.de
Sat Feb 2 18:15:04 CET 2013


Hi Fong

1. Yes, you can subset to just a single gene. You would run 
estimateDispersions on the whole data set, with method="blind", because 
you cannot divide your samples into conditions yet (the copy-number 
genotype changes from gene to gene). For the test, just subset to a 
single gene with something like:

   cds1 <- cds[ mygene, ]
   conditions(cds1) <- copy_number_genotype_for_mygene
   nbinomTest( cds1, "A", "B" )

I haven't tested this. If subsetting to a _single_ row does not work, 
ask again, because the extra "drop=FALSE" options are often missing 
just where you need them.
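
To illustrate the "drop=FALSE" issue: this is easiest to see on a plain 
matrix. The sketch below is base R only, with made-up numbers. It shows 
the general kind of problem meant here and says nothing about how DESeq 
handles single-row subsets internally.

```r
# Plain matrix standing in for a count table (made-up numbers)
counts <- matrix(1:6, nrow = 3,
                 dimnames = list(c("g1", "g2", "g3"), c("s1", "s2")))

# Default subsetting drops the row dimension: the result is a named vector
one_dropped <- counts["g2", ]

# With drop = FALSE the result stays a 1 x 2 matrix
one_kept <- counts["g2", , drop = FALSE]

is.matrix(one_dropped)  # FALSE
dim(one_kept)           # 1 2
```

Code that expects a matrix (for example, one that calls nrow() on its 
input) will fail on the first form and work on the second.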

2. As you know, DESeq is optimized for working with small sample 
numbers. Once you have many samples, a non-parametric, permutation-based 
test often gives better results. You didn't say how many samples you 
have, but given that you won't have much power otherwise, I guess it 
will be rather more than a dozen. In that case, I would tend to use a 
permutation-based test rather than DESeq.
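
As a rough sketch of what such a permutation test could look like for a 
single gene: the function name perm_test_one_gene, the choice of test 
statistic (difference in mean log2 expression) and the simulated data 
are all assumptions made for illustration. None of this is part of 
DESeq, and it is not necessarily the procedure used in the paper 
mentioned below.

```r
# Hypothetical helper (not part of DESeq): two-sided permutation test for one gene.
# expr:  numeric vector of normalized expression values, one per sample
# group: labels of the same length, with exactly two distinct values
perm_test_one_gene <- function(expr, group, n_perm = 10000) {
  group <- factor(group)
  stopifnot(length(expr) == length(group), nlevels(group) == 2)

  # Test statistic: difference in mean log2 expression between the two groups
  stat <- function(g) {
    m <- tapply(log2(expr + 1), g, mean)
    unname(m[2] - m[1])
  }

  observed <- stat(group)

  # Null distribution: recompute the statistic with shuffled labels
  permuted <- replicate(n_perm, stat(sample(group)))

  # The add-one correction keeps the p value strictly above zero
  p_value <- (sum(abs(permuted) >= abs(observed)) + 1) / (n_perm + 1)
  list(statistic = observed, p.value = p_value)
}

set.seed(1)
# Simulated data for illustration only: 20 samples per genotype
expr  <- c(rnbinom(20, mu = 100, size = 5), rnbinom(20, mu = 200, size = 5))
group <- rep(c("A", "B"), each = 20)
res   <- perm_test_one_gene(expr, group, n_perm = 2000)
```

The smallest p value this can report is 1/(n_perm + 1), so n_perm has to 
be large if the result is to survive multiple-testing correction over 
many genes.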

In fact, I did a very similar analysis a while ago: We determined copy 
numbers of genes in the HapMap subjects and then wanted to know whether 
or not there is dosage compensation, i.e., whether subjects with a 
duplicate copy of a given gene also have twice the expression. Maybe 
have a look at our paper:

"Relating CNVs to transcriptome data at fine resolution: Assessment of 
the effect of variant size, type, and overlap with functional regions"
Andreas Schlattl, Simon Anders, Sebastian M. Waszak, Wolfgang Huber, Jan 
O. Korbel.
Genome Research 21 (2011) 2004-2013; doi: 10.1101/gr.122614.111

   Simon


