[BioC] variation between cells compared to samples

Simon Anders anders at ebi.ac.uk
Thu Feb 25 10:50:50 CET 2010


Hi Pete

On Wed, 24 Feb 2010 07:19:55 -0800, Pete Shepard <peter.shepard at gmail.com>
wrote:
> The distance matrix works nicely however, I am having trouble plotting the
> two scv curves to one panel, any suggestions. I can give to cds "cds <-
> newCountDataSet( countsTable, conds )" a set of count data based on one set
> of conditions conds <- c("Old", "Old", "New", "New" )
> obtain the variance " cds <- estimateVarianceFunctions( cds )" and plot
> them
> scvPlot(cds)
> 
> I can then change the conds <- c("Stem", "Stem", "Neuron", "Neuron" ) and
> repeat for this the same steps as above. But, I am having trouble plotting
> the two sets of variances against each other, on the same graph?

If you are only interested in the raw variance (i.e., the solid lines in
the SCV plot, which show the variance from sample differences, without the
shot noise coming from the counting), you can easily make the plot without
using scvPlot and then customize it to your liking. Just use 'rawVarFunc' to
get raw variance estimates for a condition, as in this example:

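    library( DESeq )
    cds <- makeExampleCountDataSet( )
    cds <- estimateSizeFactors( cds )
    cds <- estimateVarianceFunctions( cds )
    # grid of mean counts from 1 to 1000, evenly spaced on the log scale
    xg <- 10^seq( 0, 3, length.out=100 )
    # raw SCV = raw variance / mean^2, here for condition "A"
    plot( xg, rawVarFunc( cds, "A" )( xg ) / xg^2, log="x", type='l',
       ylim=c(0,.5) )
    # add condition "B" to the same panel; lines() reuses the existing log axis
    lines( xg, rawVarFunc( cds, "B" )( xg ) / xg^2, col="red" )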
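
To overlay curves from two different groupings of the same samples, the
same plot/lines pattern should work with one curve per grouping. Below is a
rough base-R sketch of the idea that does not need DESeq at all. Note the
assumptions: the counts are simulated, the helper 'raw_scv_curve' is a
made-up name, the raw variance is approximated as sample variance minus the
mean (the Poisson shot noise, assuming equal size factors), and the
smoothing uses lowess rather than whatever fit DESeq uses internally. The
column pairings c(1, 2) and c(1, 3) are only placeholders for two groupings.

```r
# Simulated data: 2000 genes, 4 samples, negative binomial counts with
# variance = mu + 0.1 * mu^2 (so the true raw SCV is 0.1 everywhere).
set.seed(1)
n_genes <- 2000
mu <- 10^runif(n_genes, 0, 3)
counts <- matrix(rnbinom(n_genes * 4, mu = mu, size = 1 / 0.1), ncol = 4)

# Hypothetical helper: smoothed raw SCV curve for the samples in `cols`.
# Raw variance is approximated as (sample variance - mean), which assumes
# that all samples have the same size factor.
raw_scv_curve <- function(counts, cols) {
  x <- counts[, cols, drop = FALSE]
  m <- rowMeans(x)
  v <- apply(x, 1, var)
  keep <- m > 0                       # drop genes with no counts at all
  scv <- (v[keep] - m[keep]) / m[keep]^2
  # iter = 0: plain local fit, no robust down-weighting of large values
  fit <- lowess(log10(m[keep]), scv, f = 0.5, iter = 0)
  data.frame(mean = 10^fit$x, scv = fit$y)
}

a <- raw_scv_curve(counts, c(1, 2))   # e.g. samples paired by one grouping
b <- raw_scv_curve(counts, c(1, 3))   # e.g. samples paired by the other

# Both curves in one panel
plot(a$mean, a$scv, log = "x", type = "l", ylim = c(0, 0.5),
     xlab = "mean count", ylab = "raw SCV")
lines(b$mean, b$scv, col = "red")
legend("topright", legend = c("pairing 1+2", "pairing 1+3"),
       col = c("black", "red"), lty = 1)
```

With real data, the two calls to the helper would be replaced by the raw
variance functions obtained from the two count data sets.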


BTW, when you made the distance matrix, have you estimated the variances
with 'pool=TRUE'? This is crucial, as otherwise you have a bias towards the
sample pairing that you specified for the conditions. (I hope I mention
this fact in the vignette and the help page.)
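
To illustrate what a distance matrix computed without reference to the
condition labels looks like, here is a minimal base-R sketch. It is not the
DESeq procedure: the counts are simulated, the size factors are computed by
hand as the median of ratios to the per-gene geometric mean, and log2(x + 1)
is swapped in as a crude stand-in for the variance stabilising
transformation. The point is only that nothing in it depends on how the
samples were paired.

```r
# Simulated counts for four experiments (placeholder data)
set.seed(2)
n_genes <- 1000
mu <- 10^runif(n_genes, 0, 3)
counts <- matrix(rnbinom(n_genes * 4, mu = mu, size = 10), ncol = 4,
                 dimnames = list(NULL, paste0("exp", 1:4)))

# Size factors: median ratio of each sample to the per-gene geometric mean,
# using only genes with non-zero counts in every sample
log_geo_mean <- rowMeans(log(counts))
ok <- is.finite(log_geo_mean)
size_factors <- apply(counts[ok, ], 2, function(col) {
  exp(median(log(col) - log_geo_mean[ok]))
})

# Normalise, then transform; log2(x + 1) is an assumption made here in
# place of a proper variance stabilising transformation
normalised <- sweep(counts, 2, size_factors, "/")
transformed <- log2(normalised + 1)

# 4x4 Euclidean distance matrix between experiments, with a dendrogram
d <- dist(t(transformed))
hc <- hclust(d)
plot(hc)
print(round(as.matrix(d), 1))
```

cmdscale( d ) could be used instead of hclust if a multidimensional scaling
plot is preferred.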

Cheers
   Simon

> 
> P
> 
> On Fri, Feb 19, 2010 at 2:35 PM, Wolfgang Huber <whuber at embl.de> wrote:
> 
>> Hi Pete
>>
>> 1. how do the two SCV curves look like if you plot them into one panel?
>> 2. as Steve suggests, you could compute the 4x4 distance matrix between all
>> pairs of experiments, and perhaps visualise the distances with
>> multidimensional scaling or dendrogram / hierarchical clustering. For this,
>> I'd use the variance stabilising transformation as described in Section 7
>> ("Sample Clustering") of the vignette, or the man page of the
>> "getVarianceStabilizedData" function.
>>
>>             Best wishes
>>               Wolfgang
>>
>> On Feb 19, 2010, at 7:23 PM, Pete Shepard wrote:
>>
>> Hi All,
>>
>> I am comparing four RNAseq experiments, exp # 1 and 2 are done using
>> protocol A and experiment 3 and 4 are done using protocol B. Experiments 1
>> and 3 are done using stem cells and experiment 2 and 4 are done using neural
>> cells. I would like to see if there is more variation between the two types
>> of protocols compared to the two types of cells. I have used the DESeq
>> package to plot the squared coefficient of variation against the base mean
>> but I am wondering if there is a single metric I can use to compare the
>> variations?


