[BioC] Translating AB/BB/AA into a SNP with Illumina data

Stephanie M. Gogarten sdmorris at u.washington.edu
Mon Jul 16 18:19:20 CEST 2012


Hi Lavinia,

The GWASTools package was designed to work with this type of data.

You can download annotation for Illumina arrays from their website: 
https://icom.illumina.com/.  They now require that you register with 
their site to download files.  Once you have logged in, click 
"Downloads" in the menu on the left and then "Genotyping/LOH/CNV" in the 
menu on the right, and look for the Human Omni1 Quad link.  The file 
that you want is called HumanOmni1-Quad_v1-0_H_csv.zip, and looks like this:

IlmnID,Name,IlmnStrand,SNP,AddressA_ID,AlleleA_ProbeSeq,AddressB_ID,AlleleB_ProbeSeq,GenomeBuild,Chr,MapInfo,Ploidy,Species,Source,SourceVersion,SourceStrand,SourceSeq,TopGenomicSeq,BeadSetID,Exp_Clusters,Intensity_Only,RefStrand
200006-0_T_R_1853021091,200006,TOP,[A/G],0060702346,AGACTGTGGATGAATAATGCTGGTGAGTGTCTGGCCCTCGGGGAGGCCCA,,,37.1,9,139926402,diploid,Homo sapiens,ILLUMINA,0,BOT,ACATGCCCCACTCAGCGCCACCCCCGTCCTCCCCTCCCAGGTTGCCTAGCTGTCCCCAGC[T/C]TGGGCCTCCCCGAGGGCCAGACACTCACCAGCATTATTCATCCACAGTCTCCCAGGATCA,TGATCCTGGGAGACTGTGGATGAATAATGCTGGTGAGTGTCTGGCCCTCGGGGAGGCCCA[A/G]GCTGGGGACAGCTAGGCAACCTGGGAGGGGAGGACGGGGGTGGCGCTGAGTGGGGCATGT,163,3,0,-

The "SNP" column gives the A/B allele designation for each SNP (format 
[A/B]), and the "IlmnStrand" column tells you whether those alleles are 
reported on the TOP or BOT strand.  In the row above, SNP 200006 is 
[A/G] on the TOP strand, so an AB call corresponds to an A/G genotype 
on that strand.  (See here for a useful article on how to convert 
between different strand designations:
http://www.sciencedirect.com/science/article/pii/S0168952512000704)
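
As a rough illustration of the lookup described above, here is a minimal 
Python sketch (standard library only, not GWASTools code).  It assumes 
the annotation has been reduced to the Name, IlmnStrand and SNP columns 
and that the column header is the first line of the input; the function 
names load_allele_map and translate_call are made up for this example.

```python
import csv
import io

# Three columns excerpted from the annotation row for SNP 200006;
# the real CSV has many more columns.
SAMPLE_ANNOTATION = """Name,IlmnStrand,SNP
200006,TOP,[A/G]
"""

def load_allele_map(handle):
    """Map SNP name -> (allele A, allele B, Illumina strand)."""
    alleles = {}
    for row in csv.DictReader(handle):
        # The SNP column is formatted as [A/B], e.g. "[A/G]".
        allele_a, allele_b = row["SNP"].strip("[]").split("/")
        alleles[row["Name"]] = (allele_a, allele_b, row["IlmnStrand"])
    return alleles

def translate_call(call, allele_a, allele_b):
    """Turn an AA/AB/BB call into nucleotides; return None for anything else."""
    if call not in ("AA", "AB", "BB"):
        return None  # e.g. a missing or no-call value (assumed handling)
    lookup = {"A": allele_a, "B": allele_b}
    return "/".join(lookup[letter] for letter in call)

allele_map = load_allele_map(io.StringIO(SAMPLE_ANNOTATION))
allele_a, allele_b, strand = allele_map["200006"]
genotype = translate_call("AB", allele_a, allele_b)
print(genotype, strand)
```

The nucleotides returned are on the strand named in IlmnStrand, so any 
comparison with data reported on another strand still needs the 
conversion described in the article linked above.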

Stephanie Gogarten
Research Scientist, Biostatistics
University of Washington


On 7/16/12 3:00 AM, bioconductor-request at r-project.org wrote:
> Message: 3
> Date: Mon, 16 Jul 2012 13:59:33 +1000
> From: "Lavinia Gordon"<lavinia.gordon at mcri.edu.au>
> To:<bioconductor at r-project.org>
> Subject: [BioC] Translating AB/BB/AA into a SNP with Illumina data
> Message-ID:<87223629775F2049917889888F597633FD720F at murmx.mcri.edu.au>
> Content-Type: text/plain;	charset="us-ascii"
>
> Dear all,
>
> I am working with Illumina Human Omni1 Quad data.  I only have access to
> processed data, e.g:
> ID_REF	VALUE	Score	Theta	R	B Allele Freq	Log R Ratio
> 200006	AB	0.8273118	0.4800678	2.651576	0.5337635	0.1516016
>
> I would like to know what the SNP is at this position and wondered if
> there are any components within the Bioconductor packages that can deal
> with this data, taking into account the TOP/BTM strand approach that
> Illumina uses.  I have previously had great success with crlmm, but that
> was working from the raw IDAT files.
>
> With thanks for your time,
>
> Lavinia Gordon
> Senior Research Officer
> Quantitative Sciences Core, Bioinformatics
>
> Murdoch Childrens Research Institute
> The Royal Children's Hospital
> Flemington Road Parkville Victoria 3052 Australia
> T 03 8341 6221
> www.mcri.edu.au


