[BioC] GWASTools: quasi-/perfect linear separation
Danica [guest]
guest at bioconductor.org
Thu Sep 4 12:56:32 CEST 2014
This is not really a question but more of a warning to other users.
I performed a regression analysis using the assocTestRegression function under three different models (dominant, recessive, additive). My data set contains ~3 million markers, filtered so that only SNPs with MAF >= 10% are included. Please note that this filter was applied to cases and controls combined as one data set (i.e. I did not filter cases and controls separately).
When I examined the association results under the recessive model, I noticed very large beta estimates (8-9). Looking at the genotype counts, I realised that this happens because some SNPs show perfect or quasi-perfect linear separation. For example, the AA genotype has a count of 0 in cases and a count of 170 in controls, so the maximum-likelihood estimate is not finite and the reported estimate is inflated.
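To illustrate the effect, here is a minimal sketch in base R. It uses glm() as a stand-in for the per-SNP logistic model, not assocTestRegression itself, and the sample sizes are made up; only the 0 versus 170 split for AA comes from the counts described above. The exact estimate depends on the fitting routine and its convergence settings, so the numbers will not necessarily match what GWASTools reports.

```r
# Toy data: hypothetical sample sizes, with AA absent from cases and
# present in 170 controls.
# Recessive coding (assumed here): 1 = AA homozygote, 0 = any other genotype.
status    <- c(rep(1, 200), rep(0, 400))   # 200 cases, then 400 controls
recessive <- c(rep(0, 200),                # no AA among the cases
               rep(1, 170), rep(0, 230))   # 170 AA among the controls

# Cross-tabulate genotype class by case status; an empty cell signals
# (quasi-)separation.
counts <- table(recessive = recessive, case = status)
print(counts)
has_zero_cell <- any(counts == 0)

# Ordinary logistic regression; warnings are suppressed only to keep the
# example quiet, since glm() may or may not warn in this situation.
fit  <- suppressWarnings(glm(status ~ recessive, family = binomial()))
beta <- unname(coef(fit)["recessive"])
se   <- unname(sqrt(diag(vcov(fit)))["recessive"])

# The estimate is large in magnitude and its standard error is far larger
# still, which is the signature of a non-finite maximum-likelihood estimate.
print(c(beta = beta, se = se))
```

The point of the sketch is that the fit can complete and return a number that looks like an ordinary coefficient, so checking the genotype-by-status table is the more reliable diagnostic.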
I was surprised to find that the function neither issues a warning in this situation nor drops the SNPs where it occurs.
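Until such a check exists, one possible workaround is to screen the genotype counts before or after running the association test. The helper below is hypothetical and not part of GWASTools. It assumes a plain matrix with SNPs in rows and samples in columns, genotypes coded as the number of A alleles (0, 1, 2), and case status coded 1 = case, 0 = control.

```r
# Hypothetical helper (not a GWASTools function): flag SNPs for which some
# observed genotype class is entirely missing from cases or from controls.
flag_separation <- function(geno, status,
                            model = c("additive", "dominant", "recessive")) {
  model <- match.arg(model)
  # Collapse allele counts into the classes compared under each model.
  collapse <- switch(model,
    additive  = function(g) g,
    dominant  = function(g) as.integer(g >= 1),
    recessive = function(g) as.integer(g == 2))
  apply(geno, 1, function(g) {
    keep <- !is.na(g)                      # ignore missing genotypes
    x    <- collapse(g[keep])
    tab  <- table(factor(x), factor(status[keep], levels = c(0, 1)))
    any(tab == 0)                          # empty cell in an observed class
  })
}

# Small made-up example: 3 cases followed by 3 controls.
status <- c(1, 1, 1, 0, 0, 0)
geno <- rbind(
  snp1 = c(0, 1, 1, 2, 2, 1),   # AA (coded 2) seen only in controls
  snp2 = c(2, 1, 0, 2, 0, 1))   # every class seen in both groups
flags <- flag_separation(geno, status, "recessive")
print(flags)
```

Flagged SNPs can then be inspected by hand or excluded. A SNP with only one observed genotype class under the chosen coding is not flagged by this check, so it would need a separate filter.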
Regards,
Danica
-- output of sessionInfo():
--
Sent via the guest posting facility at bioconductor.org.