[BioC] Interpretation of genotype calls from Oligo crlmm algorithm

jeremy wilson jeremy.wilson88 at gmail.com
Mon Apr 5 22:07:20 CEST 2010


Dear Bioconductors,

I have genotyped 8 samples from the Affymetrix 6.0 platform using the
crlmm algorithm from oligo, and I am having difficulty interpreting the
calls. I am trying to match the gender in the phenodata with the crlmm
SNP calls on the X and Y chromosomes. I would appreciate any comments
and help.

Given that a SNP has an A and a B allele, is A always the major allele
and B always the minor allele? Or are the alleles assigned
alphabetically, so that the nucleotide (A,T,G,C) that comes first in
the alphabet becomes the A allele and the other becomes the B allele?

Another question: how do I differentiate between an AA genotype and
just A (no copy from the other parent)? Are they both coded as AA and
given a call of 1 (or, for the B allele, BB and a call of 3)?

Here is my code:

library(oligo)
library(pd.genomewidesnp.6)   ## annotation package for the array

## outDir is defined earlier in my session
crlmmCalls <- readSummaries("calls", outDir)
conn <- db(pd.genomewidesnp.6)
sql <- "select man_fsetid,dbsnp_rs_id,chrom from featureSet where chrom='X' "
X_snps <- dbGetQuery(conn, sql)
## Get crlmmCalls for X SNPs for the 8 samples
tmp <- rownames(crlmmCalls) %in% X_snps$man_fsetid
tmp <- crlmmCalls[tmp,]
## X SNPs called 3 in sample 6 and 1 in sample 7 (the 2 male samples)
tmp[tmp[,6] == 3 & tmp[,7] == 1,]

This returns thousands of SNPs, which seems odd.


As I understand it, a call of 1 means AA, 2 means AB, and 3 means BB.
How can the 2 men have homozygous AA and BB calls for the X chromosome
SNPs? Shouldn't most of the SNPs (other than those in the homologous X
and Y regions) be heterozygous?

What might be the problem?
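
To make the coding I am assuming concrete (1 = AA, 2 = AB, 3 = BB),
here is a minimal sketch on a made-up call matrix. The SNP names,
sample names and values are hypothetical and are not my data. It counts
each call per sample and computes the fraction of AB calls, which is
the kind of per-sample summary I want to compare against the gender in
the phenodata.

```r
## Toy stand-in for the crlmm call matrix: rows are SNPs, columns are
## samples. All names and values here are invented for illustration.
calls <- matrix(
  c(1L, 3L, 2L,
    3L, 3L, 1L,
    1L, 2L, 2L,
    3L, 1L, 2L),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("SNP_", 1:4), c("male1", "male2", "female1"))
)

## Assumed coding: 1 = AA, 2 = AB, 3 = BB.
## Count each genotype per sample (rows: genotype, columns: sample).
counts <- rbind(
  AA = colSums(calls == 1L),
  AB = colSums(calls == 2L),
  BB = colSums(calls == 3L)
)

## Fraction of heterozygous (AB) calls per sample.
het_frac <- colMeans(calls == 2L)

print(counts)
print(het_frac)
```

With the real data, `calls` would be replaced by the X-chromosome
subset of crlmmCalls (the second `tmp` in my code above).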


> sessionInfo()
R version 2.10.1 (2009-12-14)
x86_64-pc-linux-gnu

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] pd.genomewidesnp.6_0.4.2 RSQLite_0.8-3            DBI_0.2-5
[4] oligo_1.10.3             preprocessCore_1.8.0     oligoClasses_1.8.0
[7] Biobase_2.6.1

loaded via a namespace (and not attached):
[1] affxparser_1.18.0  affyio_1.14.0      Biostrings_2.14.12 IRanges_1.4.13
[5] splines_2.10.1     tools_2.10.1


Thank you


