[BioC] [DIFFBIND] batch effects and blocking factors

Giuseppe Gallone giuseppe.gallone at dpag.ox.ac.uk
Tue Jun 24 11:34:07 CEST 2014


Hi again

Would anyone be willing to help with the issue below?

Best wishes
Giuseppe

On 18/06/14 20:39, Giuseppe Gallone wrote:
> Hi
>
> I have a group of samples for which I'd like to ascertain if
> differential binding is detectable based on a "condition" binary
> variable (stored in DBA_CONDITION).
>
> However, these samples have been processed in 4 batches (each batch has
> at least 3 samples).  I would like to run a multifactorial analysis to
> regress out the batch effect first, and then analyse any remaining
> variance across the DBA_CONDITION contrast of interest.
>
> I understand it is possible to run such an analysis using blocking
> factors in dba.contrast. Let's say I store the 4 batch labels in
> DBA_TISSUE. The following:
>
> data = dba.contrast(data, categories=DBA_CONDITION, block=DBA_TISSUE)
>
> returns the following warning messages:
>
> Warning messages:
> 1: Blocking factor invalid for all contrasts:
> 2: No blocking values are present in both groups
>
> and data will not contain blocking factor information.
>
> Am I wrong in thinking that multiple contrasts can be used for the
> "block" argument? If I use only one contrast via mask (for example
> BATCH_1 VS !BATCH_1) this works correctly:
>
> data = dba.contrast(data, categories=DBA_CONDITION,
> block=data$masks$BATCH_1)
>
> however it will only block variance due to this particular contrast,
> not all of them.
>
> One workaround, I suppose, is to run a differential analysis on all the
> contrasts one wishes to block, and identify the one which produces the
> highest number of variant sites:
>
> data = dba.contrast(data, categories=DBA_TISSUE)
> data = dba.analyze(data)
> ...
> #pick the contrast with the highest variance, eg BATCH_4, then do:
>
> data = dba.contrast(data, categories=DBA_CONDITION,
> block=data$masks$BATCH_4)
>
> However I was still wondering if there is a way to model all the
> variance due to the batch effects at once and then look at the residual
> variance for the real analysis.
>
> Thanks!
> Giuseppe
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives:
> http://news.gmane.org/gmane.science.biology.informatics.conductor
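
In case it helps anyone reproduce the question: the second warning ("No blocking values are present in both groups") suggests one thing worth ruling out, namely whether each batch actually contains samples from both conditions. Below is a minimal base R sketch of that check, followed by the additive batch + condition design matrix that "modelling all the batch variance at once" would correspond to. It does not call DiffBind at all, and the sample sheet (names, batch labels, condition labels) is entirely hypothetical.

```r
# Hypothetical sample sheet: 12 samples, 4 batches, one binary condition.
# All names and values here are invented for illustration only.
samples <- data.frame(
  SampleID  = paste0("S", 1:12),
  Batch     = factor(rep(paste0("BATCH_", 1:4), each = 3)),
  Condition = factor(c("A", "A", "B",  "A", "B", "B",
                       "A", "A", "B",  "A", "B", "B"))
)

# Cross-tabulate batch against condition: rows are batches, columns conditions.
counts <- table(samples$Batch, samples$Condition)
print(counts)

# Batches that have at least one sample in every condition group.
# A batch missing from one group cannot be separated from the condition effect.
shared <- rownames(counts)[rowSums(counts > 0) == ncol(counts)]
print(shared)

# Additive design: every batch level enters the model together,
# and the condition coefficient is estimated after accounting for them.
design <- model.matrix(~ Batch + Condition, data = samples)
print(colnames(design))

# The condition effect is only estimable if the design has full column rank.
full_rank <- qr(design)$rank == ncol(design)
print(full_rank)
```

If a batch turns out to hold samples from only one condition, that batch is confounded with the contrast and no blocking scheme can separate the two. Whether this is what triggers the dba.contrast warning here is an assumption on my part, not something the warning text confirms.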


