[BioC] normalization and batch correction across multiple projects
adaikalavan.ramasamy at gmail.com
Tue Aug 26 17:17:05 CEST 2014
I had no response to this email that I sent last week. If anyone has any
input, I would greatly appreciate it. Thank you.
On Mon, Aug 18, 2014 at 1:11 PM, Adaikalavan Ramasamy <
adaikalavan.ramasamy at gmail.com> wrote:
> Dear all,
> I would like to appeal to the collective wisdom in this group on how best
> to solve this problem of normalization and batch correction.
> We are a service unit for an academic institute and we run several
> projects simultaneously. We use Illumina HT12-v4 microarrays which can
> take up to 12 different samples per chip. As we QC the data from one
> project, RNA from failed samples can be re-run on chips belonging to
> another project (rather than running partial chips, to avoid wastage).
> Sometimes we also include samples from other projects. Here is a simple
> example:
> Chip No   ScanDate     Contents
> 1         1st July     *12 samples from project A*
> 2         1st July     *8 samples from project A* + 4 from project B
> 3         1st August   12 samples from project B
> 4         1st August   *1 sample from project A* + 5 from project B + 6 from project C
> What is the best way to prepare the final data for *project A*? One
> option is to do the following:
> 1. Pool chips 1, 2 and 4 together.
> 2. Remove failed samples.
> 3. Remove samples from other projects.
> 4. Normalize using neqc() from limma.
> 5. Correct for scan date using ComBat() from sva.
> The other option we considered is to omit step 3 (i.e. use the other
> projects' samples for normalization and ComBat) and subset to project A
> at the end.
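For concreteness, the two options can be sketched in R roughly as follows. This is a minimal sketch under stated assumptions, not a tested pipeline: the intensities are simulated, the sample-sheet columns (chip, scandate, project, failed) and the helper prepare_project() are hypothetical names, and it assumes a recent sva in which ComBat() accepts mean.only (relevant because the August batch holds a single project A sample when other projects are removed first). Passing project as a covariate to ComBat when several projects are kept is likewise an assumption, not something stated in the steps.

```r
library(limma)
library(sva)

set.seed(1)

# Hypothetical sample sheet for chips 1, 2 and 4 (layout from the table).
targets <- data.frame(
  chip     = rep(c(1, 2, 4), each = 12),
  scandate = rep(c("Jul01", "Jul01", "Aug01"), each = 12),
  project  = c(rep("A", 20), rep("B", 4), "A", rep("B", 5), rep("C", 6)),
  failed   = FALSE,
  stringsAsFactors = FALSE
)
rownames(targets) <- sprintf("S%02d", seq_len(nrow(targets)))

# Simulated raw intensities: 500 regular probes plus 50 negative controls.
n_reg <- 500
n_neg <- 50
n_arr <- nrow(targets)
E <- rbind(
  matrix(rexp(n_reg * n_arr, rate = 1 / 500) + 100, n_reg, n_arr),
  matrix(rnorm(n_neg * n_arr, mean = 100, sd = 10), n_neg, n_arr)
)
dimnames(E) <- list(sprintf("P%03d", seq_len(nrow(E))), rownames(targets))
status <- rep(c("regular", "negative"), c(n_reg, n_neg))
raw <- new("EListRaw",
           list(E = E, genes = data.frame(Status = status), targets = targets))

# Hypothetical helper: subset_first = TRUE is option 1, FALSE is option 2.
prepare_project <- function(raw, project = "A", subset_first = TRUE) {
  raw <- raw[, !raw$targets$failed]                        # step 2
  if (subset_first) {
    raw <- raw[, raw$targets$project == project]           # step 3
  }
  y <- neqc(raw)                                           # step 4
  batch <- factor(y$targets$scandate)
  # Assumption: protect project differences when several projects remain.
  mod <- NULL
  if (length(unique(y$targets$project)) > 1) {
    mod <- model.matrix(~ project, data = y$targets)
  }
  adj <- ComBat(dat = y$E, batch = batch, mod = mod,       # step 5
                mean.only = any(table(batch) < 2))
  adj[, y$targets$project == project, drop = FALSE]        # subset at the end
}

opt1 <- prepare_project(raw, "A", subset_first = TRUE)
opt2 <- prepare_project(raw, "A", subset_first = FALSE)
dim(opt1)
dim(opt2)
```

With real data, raw would come from read.ilmn() rather than being simulated, and comparing opt1 with opt2 on the project A samples (e.g. by correlation or a PCA plot) is one way to see how much the choice matters.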
> I feel this second option allows better estimation of batch effects
> (especially for chip 4, which holds only one project A sample). However,
> sometimes projects A and B can be quite different (e.g. samples derived
> from different tissues), which might distort the normalization, especially
> if we want to compare project A to B directly. We also considered nec()
> followed by normalizeBetweenArrays() with "Tquantile", but I felt it was
> too complicated. Anything else to try?
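The idea behind the nec() alternative, background-correct first and then quantile-normalize within each project rather than across all arrays, can be sketched as below. Note that this does not call method = "Tquantile": the limma documentation lists that method among the choices for two-colour data, so the sketch swaps in a hand-rolled per-group loop over normalizeQuantiles() instead. The data are simulated and the layout (8 project A and 4 project B samples, as on chip 2) is only illustrative.

```r
library(limma)

set.seed(2)

# Hypothetical layout of one chip: 8 samples from project A, 4 from project B.
project <- rep(c("A", "B"), times = c(8, 4))
status  <- rep(c("regular", "negative"), c(300, 30))

# Simulated raw intensities: 300 regular probes plus 30 negative controls.
E <- rbind(
  matrix(rexp(300 * 12, rate = 1 / 500) + 100, 300, 12),
  matrix(rnorm(30 * 12, mean = 100, sd = 10), 30, 12)
)
colnames(E) <- paste0(project, "_", seq_along(project))
raw <- new("EListRaw",
           list(E = E,
                genes = data.frame(Status = status),
                targets = data.frame(project = project)))

# Background correction using the negative control probes (no normalization).
bg  <- nec(raw)
reg <- bg$E[bg$genes$Status == "regular", ]   # keep regular probes only

# Quantile-normalize separately within each project, then log2-transform.
norm <- reg
for (p in unique(project)) {
  cols <- project == p
  norm[, cols] <- normalizeQuantiles(reg[, cols])
}
norm <- log2(norm)
```

Within each project the arrays now share one intensity distribution, while differences between projects are left untouched, which is the trade-off to weigh if projects A and B are to be compared directly.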
> Thank you.
> Adaikalavan Ramasamy
> Senior Leadership Fellow in Bioinformatics
> Head of the Transcriptomics Core Facility
> Email: adaikalavan.ramasamy at ndm.ox.ac.uk
> Office: 01865 287 710
> Mob: 07906 308 465