[BioC] DESeq for sample clustering

Wolfgang Huber whuber at embl.de
Thu Nov 10 23:57:59 CET 2011

Dear Linn

each sample's data corresponds to a column in the data matrix of a
CountDataSet, and it seems that your question boils down to how to
1. subset columns of a matrix
2. compute an average vector from sets of columns of a matrix.

For 1., you can do something like

s1 = pasillaGenes[, pasillaGenes$type=="single-read"]
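
The same kind of column subsetting can be tried on a plain matrix without loading any Bioconductor package. This is only a minimal sketch: the matrix 'counts' and the annotation vector 'type' are made-up stand-ins for the count data and the sample annotation, not objects from the pasilla package.

```r
# Toy stand-in for the count matrix: 4 genes x 4 samples (made-up values).
counts <- matrix(1:16, nrow = 4,
                 dimnames = list(paste0("gene", 1:4), paste0("s", 1:4)))

# Hypothetical sample annotation, one entry per column of 'counts'.
type <- c("single-read", "paired-end", "single-read", "paired-end")

# A logical vector in the column position keeps the matching samples;
# drop = FALSE keeps the result a matrix even if only one column matches.
single <- counts[, type == "single-read", drop = FALSE]
colnames(single)
```

The CountDataSet case above works along the same lines, with the annotation taken from the object's phenotype data instead of a free-standing vector.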

For 2., see the 'ave' function in the 'stats' package, or more pedestrian:

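# column indices of the samples, grouped by condition
sp = with(pData(pasillaGenes),
        split(seq(along=condition), condition))
# per-gene mean over the columns of each condition
# (assumes 'vsd' is a numeric matrix; use exprs(vsd) if it is an ExpressionSet)
mn = do.call(cbind,
   lapply(sp, function(i)
      rowMeans(vsd[, i, drop=FALSE])))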

where 'vsd' is the data after variance stabilising transformation as 
described in the vignette.
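
To connect this back to the dendrogram in the question, here is a minimal, self-contained sketch in base R of averaging replicates per condition and then clustering the averages. The matrix 'vsd' is filled with random numbers and the factor 'condition' is invented for illustration; with real data they would come from the variance stabilising transformation and from the phenotype data. The choice of dist() and hclust() with their default settings is also just an assumption, not something prescribed by DESeq.

```r
set.seed(1)

# Toy stand-in for variance-stabilised data: 50 genes x 6 samples.
vsd <- matrix(rnorm(300), nrow = 50,
              dimnames = list(paste0("gene", 1:50), paste0("s", 1:6)))

# Hypothetical condition labels, two replicates per condition.
condition <- factor(c("A", "A", "B", "B", "C", "C"))

# Column indices grouped by condition.
sp <- split(seq_along(condition), condition)

# One column of per-gene means for each condition.
mn <- do.call(cbind,
              lapply(sp, function(i) rowMeans(vsd[, i, drop = FALSE])))

# dist() computes distances between rows, so transpose to compare conditions.
d  <- dist(t(mn))
hc <- hclust(d)

# The dendrogram now has one leaf per condition rather than one per sample.
hc$labels
```

Calling plot(hc), or passing 'mn' instead of the full matrix to a heatmap function, then gives one node per condition.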

	Best wishes

Nov/10/11 1:45 PM, Linn Fagerberg [guest] scripsit::
> I have used the functions described in the DESeq package information
> for clustering and heatmap visualization of RNA-seq data with great
> results. However I am a bit confused whether I may be able to use the
> conds argument for my count dataset. When I have replicate samples I
> would like to get only the ones specified in the conds vector as the
> nodes in the dendrogram of the heatmap. Is this possible to do using
> methods in the DESeq package or do I have to calculate average values
> for the replicates manually before I obtain the distances?


Wolfgang Huber
