[BioC] Getting DESEq results on just one gene
Simon Anders
anders at embl.de
Sat Feb 2 18:15:04 CET 2013
Hi Fong
1. Yes, you can subset to just a single gene. You would run
estimateDispersions on the whole data set, with method="blind", because
you cannot divide your samples into conditions yet (the copy-number
genotype changes from gene to gene). For the test, just subset to a
single gene with something like:
cds1 <- cds[ mygene, ]
conditions(cds1) <- copy_number_genotype_for_mygene
nbinomTest( cds1, "A", "B" )
I haven't tested this. If subsetting to a _single_ row does not work,
ask again, because the "drop=FALSE" options needed for that case are
often missing just when you need them.
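To illustrate the "drop=FALSE" issue mentioned above, here is a minimal sketch using a plain base-R matrix rather than a CountDataSet. The toy counts and the gene and sample names are made up for illustration; whether the same problem shows up inside DESeq itself is exactly the untested part.

```r
# Toy count matrix: 3 genes x 4 samples (made-up numbers).
counts <- matrix(1:12, nrow = 3,
                 dimnames = list(c("geneA", "geneB", "geneC"),
                                 paste0("s", 1:4)))

# Default subsetting to one row silently drops the matrix structure.
one_row_dropped <- counts["geneB", ]

# With drop = FALSE the result stays a 1 x 4 matrix.
one_row_kept <- counts["geneB", , drop = FALSE]

is.null(dim(one_row_dropped))  # the dimensions are gone
dim(one_row_kept)              # the dimensions are preserved
```

Code that expects a matrix (for example, code calling nrow() or rowSums()) can fail on the first form, which is why a single-row subset is the case worth checking.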
2. As you know, DESeq is optimized for working with small sample
numbers. Once you have many samples, a non-parametric, permutation-based
test often gives better results. You didn't say how many samples you
have, but given that you won't have much power otherwise, I guess it
will be rather more than a dozen. In that case, I would tend to use a
permutation-based test rather than DESeq.
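As a rough sketch of what such a permutation-based test could look like for one gene, the following base-R code permutes the genotype labels and compares the difference in group means. The helper name perm_test, the simulated expression values, and the choice of test statistic are all assumptions made for illustration; this is not the procedure used in DESeq or in any particular paper, and real count data would first need to be normalised (for example by size factors) and probably log-transformed.

```r
# Hypothetical helper: two-sided permutation test on the difference in
# group means for a single gene.
perm_test <- function(x, group, n_perm = 10000) {
  group <- factor(group)
  stopifnot(nlevels(group) == 2, length(x) == length(group))
  lv <- levels(group)
  # Test statistic: mean of first group minus mean of second group.
  stat <- function(g) mean(x[g == lv[1]]) - mean(x[g == lv[2]])
  observed <- stat(group)
  # Shuffle the labels to get the null distribution of the statistic.
  permuted <- replicate(n_perm, stat(sample(group)))
  # Add 1 to numerator and denominator so the p value is never exactly 0.
  (sum(abs(permuted) >= abs(observed)) + 1) / (n_perm + 1)
}

set.seed(1)
# Made-up, already normalised log-scale expression for 30 samples.
expr <- c(rnorm(15, mean = 5), rnorm(15, mean = 7))
cn   <- rep(c("A", "B"), each = 15)  # copy-number genotype per sample

p_value <- perm_test(expr, cn, n_perm = 2000)
p_value
```

The smallest p value this can return is 1 / (n_perm + 1), so n_perm needs to be large enough for whatever multiple-testing threshold you intend to apply.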
In fact, I did a very similar analysis a while ago: We determined copy
numbers of genes in the HapMap subjects and then wanted to know whether
there is dosage compensation or not, i.e., whether or not subjects with
a duplicated copy of a given gene also show twice the expression. Maybe
have a look at our paper:
"Relating CNVs to transcriptome data at fine resolution: Assessment of
the effect of variant size, type, and overlap with functional regions"
Andreas Schlattl, Simon Anders, Sebastian M. Waszak, Wolfgang Huber, Jan
O. Korbel.
Genome Research 21 (2011) 2004-2013; doi: 10.1101/gr.122614.111
Simon