[BioC] edgeR GLM to adjust for batch effect

Ryan Basom rbasom at fhcrc.org
Thu Mar 27 22:51:12 CET 2014


Thanks for this advice. I have a follow-up question, though. The edgeR
User's Guide says of adjusting for batch effects: "In this type of
analysis, the treatments are compared only within each batch. The
analysis is corrected for baseline differences between the batches."
If some batches lack samples for one of the treatments being compared,
how does the model compensate? I know this design isn't ideal, but I'd
like to get a better sense of what's going on in this scenario.

Thanks,
Ryan
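
One way to get a feel for the question above is to look at the design
matrix itself. The sketch below uses base R only and is not edgeR's
negative binomial GLM: it uses the ordinary least-squares analogue with
equal weight for every sample, which is an assumption made purely for
illustration. It rebuilds the sample layout from the table in the
original post, checks that the additive model ~ 0 + group + batch is
full rank, and compares the variance factor of the pos - nc contrast
under three layouts. The helper name pos_vs_nc_var is hypothetical.

```r
# Sample counts per group (rows) and batch (columns), from the original post.
n_samples <- matrix(c(3, 5, 9, 0,
                      5, 4, 7, 0,
                      0, 0, 5, 8),
                    nrow = 3, byrow = TRUE,
                    dimnames = list(group = c("pos", "neg", "nc"),
                                    batch = c("1", "2", "3", "4")))

# Expand the table to one row per sample (empty cells contribute no rows).
cells <- as.data.frame(as.table(n_samples), responseName = "n")
samples <- cells[rep(seq_len(nrow(cells)), cells$n), c("group", "batch")]

# Additive design: one column per group plus batch offsets relative to batch 1.
design <- model.matrix(~ 0 + group + batch, data = samples)
full_rank <- qr(design)$rank == ncol(design)

# Hypothetical helper: variance factor c' (X'X)^-1 c of the pos - nc contrast
# in an equal-weight linear model (an illustrative stand-in for the GLM).
pos_vs_nc_var <- function(samples) {
  samples <- droplevels(samples)
  # A batch term can only be fitted when at least two batches are present.
  form <- if (nlevels(samples$batch) > 1) {
    ~ 0 + group + batch
  } else {
    ~ 0 + group
  }
  X <- model.matrix(form, data = samples)
  contrast <- setNames(numeric(ncol(X)), colnames(X))
  contrast[c("grouppos", "groupnc")] <- c(1, -1)
  drop(t(contrast) %*% solve(crossprod(X)) %*% contrast)
}

v_all     <- pos_vs_nc_var(samples)
v_no_b4   <- pos_vs_nc_var(samples[samples$batch != "4", ])
v_b3_only <- pos_vs_nc_var(samples[samples$batch == "3", ])

print(full_rank)
print(c(all = v_all, without_batch4 = v_no_b4, batch3_only = v_b3_only))
```

Under these assumptions, dropping the batch 4 samples leaves the
contrast variance unchanged: batch 4 contains only nc, so its samples
are absorbed by the batch 4 coefficient. Batches 1 and 2 help only
indirectly, through the pos - neg difference combined with neg - nc in
batch 3, so batch 3 carries most of the information, consistent with
the reply quoted below. In edgeR itself the batch 4 samples would
presumably still inform the dispersion estimates.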


On 03/26/2014 04:36 PM, Ryan C. Thompson wrote:
> You don't necessarily need every condition in every batch for the 
> comparison to be effective, but having only one batch in common is not 
> good. If I understand correctly, batch 3 would be the dominant 
> contributor to the estimates of fold changes in the comparisons that 
> you care about, since any other change would be mostly absorbed into 
> the batch effects. I think the first step you should take is to fit 
> the full model with conditions and batch effect and find out whether 
> the batch effects appear to be significant enough to warrant inclusion 
> in the model, and if not, then drop them.
>
> -Ryan
>
> On Wed 26 Mar 2014 03:47:42 PM PDT, Ryan Basom [guest] wrote:
>>
>>
>> Hi,
>>
>> I'd like to use a GLM in edgeR to adjust for a batch effect, though 
>> only one of my four batches has samples from both groups in the 
>> comparisons that I'd like to conduct (pos-nc & neg-nc):
>>
>>       batch
>>       1  2  3  4
>> pos   3  5  9  0
>> neg   5  4  7  0
>> nc    0  0  5  8
>>
>> I suspect that using a GLM in edgeR to adjust for batch will only
>> work properly if both groups in a given comparison are represented
>> in every batch, but I would like to know if that is not the case.
>> I see a batch effect using PVCA on just the pos and neg samples,
>> and would like to adjust for it somehow. Please advise.
>>
>> Thanks,
>> Ryan
>>
>> -- output of sessionInfo():
>>
>> R version 3.0.3 (2014-03-06)
>> Platform: x86_64-pc-linux-gnu (64-bit)
>>
>> locale:
>> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
>> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 LC_PAPER=en_US.UTF-8 LC_NAME=C
>> [9] LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>>
>> attached base packages:
>> [1] splines parallel stats graphics grDevices utils datasets methods base
>>
>> other attached packages:
>> [1] pvca_1.2.0 beadChipCoreTools_0.49 beadAnno_1.0 lumi_2.14.1
>> [5] Biobase_2.22.0 BiocGenerics_0.8.0 genefilter_1.44.0 arrayQualityMetrics_3.18.0
>> [9] edgeR_3.4.2 limma_3.18.12
>>
>> -- 
>> Sent via the guest posting facility at bioconductor.org.


