[BioC] GWASTools: quasi-/perfect linear separation
Stephanie M. Gogarten
sdmorris at u.washington.edu
Wed Sep 10 20:20:53 CEST 2014
Hi Danica,
assocTestRegression will return an error code for SNPs that are
monomorphic in either cases or controls, but it seems that you have
found a case that we did not test for.
I consulted with Matt Conomos, who wrote this function, and he said the
following:
Since AA has a count of 0 in cases in the example given, and an error
was not returned, I would assume that both AB and BB are non-zero in
cases, but it would be nice to confirm this. Also, it would be nice to
know which allele is the minor allele (the function returns this), since
a recessive model is being fit. If the A allele is the minor allele,
then the recessive model collapses the AB and BB classes, and this could
lead to the separability issue. I may need to add in a check for this
when fitting dominant or recessive models.
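
To make the scenario above concrete, here is a minimal sketch in plain Python. It is not the GWASTools implementation. Only the AA counts (0 cases, 170 controls) come from the report. The AB and BB counts, the helper names `is_monomorphic`, `collapse` and `empty_cells`, and the assumption that A is the minor allele are all hypothetical. It shows how a SNP can pass a per-group monomorphic check and still end up with an empty cell once the genotype classes are collapsed for a recessive model.

```python
# Genotype counts per group. Only the AA row is taken from the report;
# the AB and BB rows are made-up values for illustration.
counts = {
    "AA": {"case": 0, "control": 170},
    "AB": {"case": 120, "control": 400},
    "BB": {"case": 80, "control": 430},
}

GROUPS = ("case", "control")


def is_monomorphic(counts, group):
    """Return True if only one allele is observed in the given group."""
    n_a = 2 * counts["AA"][group] + counts["AB"][group]
    n_b = 2 * counts["BB"][group] + counts["AB"][group]
    return n_a == 0 or n_b == 0


def collapse(counts, minor, model):
    """Collapse the three genotype classes into two for a dominant or
    recessive coding of the minor allele (hypothetical helper)."""
    major = "B" if minor == "A" else "A"
    hom_minor, het, hom_major = minor * 2, "AB", major * 2
    if model == "recessive":
        exposed, reference = [hom_minor], [het, hom_major]
    elif model == "dominant":
        exposed, reference = [hom_minor, het], [hom_major]
    else:
        raise ValueError("model must be 'dominant' or 'recessive'")

    def total(genotypes, group):
        return sum(counts[g][group] for g in genotypes)

    return {
        "exposed": {g: total(exposed, g) for g in GROUPS},
        "reference": {g: total(reference, g) for g in GROUPS},
    }


def empty_cells(table):
    """List the (class, group) cells of a collapsed table with count 0."""
    return [(row, grp) for row, cells in table.items()
            for grp, n in cells.items() if n == 0]


# Assumption: A is the minor allele.
recessive = collapse(counts, minor="A", model="recessive")
dominant = collapse(counts, minor="A", model="dominant")
```

With these counts neither group is monomorphic, so a monomorphic check alone would not flag the SNP. The recessive table still has no cases in the AA class, which is the situation in which the fitted coefficient can become very large. A check along the lines of `empty_cells` on the collapsed table would catch it.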
Could you please provide the full output of assocTestRegression for the
SNPs where you see this problem? Also, include the output of
sessionInfo() so we know which version of GWASTools you are using.
Stephanie
On 9/4/14, 3:56 AM, Danica [guest] wrote:
> This is not really a question but more of a warning to other users.
>
> I performed a regression analysis using the assocTestRegression function under three different models (dominant, recessive, additive). My data set contains ~3 million markers, which have been filtered so that only SNPs with MAF >= 10% are included. Please note that this filter was applied to cases and controls together as one data set (i.e. I did not filter cases and controls separately).
>
> When I examined the results of the association test under the recessive model, I noticed very large beta estimates (8-9). Looking at the genotype counts, I realised that this was because some SNPs show perfect linear separation. In other words, the AA genotype has a count of 0 in cases and a count of 170 in controls, which leads to inflated estimates.
>
> I was surprised to find that the function neither throws a warning for this nor drops the analysis for SNPs where this occurs.
>
> Regards,
> Danica
>
>
>
> -- output of sessionInfo():
>
>
>
> --
> Sent via the guest posting facility at bioconductor.org.
>