[BioC] Strand information for dbSNP packages

Alex Gutteridge alexg at ruggedtextile.com
Tue Feb 28 12:03:11 CET 2012


I notice that the GRanges returned by the dbSNP packages have strand '*'. 
Does anyone know how safe I am in assuming that the variant alleles also 
given by the package actually correspond to the '+' strand?

I ask this in the context of trying to use predictCoding in the 
VariantAnnotation package to find coding SNPs. For SNPs in genes on the 
'-' strand I have found that I have to complement the alleles given by 
dbSNP to get the correct result. I just want to make sure that assuming 
the alleles are from the '+' strand is a reasonable assumption in the 
vast majority (>99%) of cases.
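
To make the complementing step concrete, here is a minimal sketch. It is 
written in Python rather than R purely for illustration, and it is not the 
code I actually ran. The function name is hypothetical. It assumes each 
allele is a single-letter base or IUPAC ambiguity code reported on the '+' 
strand, which is exactly the assumption I am asking about.

```python
# Complement table covering plain bases and IUPAC ambiguity codes.
# Assumption: alleles are reported relative to the '+' strand.
_COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def allele_on_gene_strand(allele, gene_strand):
    """Return the allele as read on the strand of the gene.

    Hypothetical helper: '+' strand genes get the allele unchanged,
    '-' strand genes get its reverse complement (for a single base
    this is simply the complement).
    """
    if gene_strand not in ("+", "-"):
        raise ValueError("gene_strand must be '+' or '-'")
    allele = allele.upper()
    if gene_strand == "-":
        return allele.translate(_COMPLEMENT)[::-1]
    return allele


if __name__ == "__main__":
    # A SNP with ambiguity code R (A/G) in a gene on the '-' strand
    print(allele_on_gene_strand("R", "-"))
```

If the '+' strand assumption is wrong for a given SNP, this step would 
silently produce the wrong allele, hence the question.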

I realise from my reading of the SNPlocs.Hsapiens.dbSNP.20110815 manual 
that some SNPs will be incorrect anyway (it mentions ~0.1% of SNPs not 
mapping to the reference at all). That level of failure is acceptable, 
but anything higher would be a worry.

-- 
Alex Gutteridge


