[BioC] Translating AB/BB/AA into a SNP with Illumina data

Lavinia Gordon lavinia.gordon at mcri.edu.au
Tue Jul 17 02:09:27 CEST 2012


Hi Stephanie

Thank you so much for the useful link and article.
With the annotation data I have something like this:
    Name VALUE IlmnStrand   SNP GenomeBuild Chr   MapInfo
1 200006    AB        TOP [A/G]          36   9 139046223

So I take AB/TOP/[A/G] to mean that the Illumina alleles are reported on
the reverse complement of the source sequence. The reference allele is T,
so the SNP is actually C/T (confirmed by checking the location on the
UCSC Genome Browser, where it is annotated as dbSNP rs7469569).
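
The strand reasoning above can be sketched in a few lines of Python. This
is only an illustration, not something GWASTools provides: the function
name complement_alleles is made up here, and it assumes the SNP column
always has the form [X/Y] with single-nucleotide alleles.

```python
# Hypothetical helper, not part of GWASTools or any Illumina software.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def complement_alleles(snp):
    """Return the (allele A, allele B) pair of an Illumina SNP string such
    as '[A/G]' as it reads on the opposite strand."""
    allele_a, allele_b = snp.strip("[]").split("/")
    return COMPLEMENT[allele_a], COMPLEMENT[allele_b]


print(complement_alleles("[A/G]"))  # ('T', 'C')
```

For SNP 200006 this maps allele A to T and allele B to C, which agrees
with the C/T alleles seen at that position on the UCSC Genome Browser.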

But as far as I can see, the GWASTools package has no tools to do this
for me, i.e. to use the VALUE/IlmnStrand/SNP info to determine the SNP
and, ideally, the GenomeBuild + chrom data to confirm the dbSNP info.
Is that right?

With thanks for your time,

Lavinia Gordon
Senior Research Officer
Quantitative Sciences Core, Bioinformatics

Murdoch Childrens Research Institute
The Royal Children's Hospital
Flemington Road Parkville Victoria 3052 Australia 
T 03 8341 6221
www.mcri.edu.au


-----Original Message-----
From: Stephanie M. Gogarten [mailto:sdmorris at u.washington.edu] 
Sent: Tuesday, 17 July 2012 2:19 AM
To: Lavinia Gordon
Cc: bioconductor at r-project.org
Subject: Re: Translating AB/BB/AA into a SNP with Illumina data

Hi Lavinia,

The GWASTools package was designed to work with this type of data.

You can download annotation for Illumina arrays from their website: 
https://icom.illumina.com/.  They now require that you register with
their site to download files.  Once you have logged in, click
"Downloads" in the menu on the left and then "Genotyping/LOH/CNV" in the
menu on the right, and look for the Human Omni1 Quad link.  The file
that you want is called HumanOmni1-Quad_v1-0_H_csv.zip, and looks like
this:

IlmnID,Name,IlmnStrand,SNP,AddressA_ID,AlleleA_ProbeSeq,AddressB_ID,AlleleB_ProbeSeq,GenomeBuild,Chr,MapInfo,Ploidy,Species,Source,SourceVersion,SourceStrand,SourceSeq,TopGenomicSeq,BeadSetID,Exp_Clusters,Intensity_Only,RefStrand
200006-0_T_R_1853021091,200006,TOP,[A/G],0060702346,AGACTGTGGATGAATAATGCTGGTGAGTGTCTGGCCCTCGGGGAGGCCCA,,,37.1,9,139926402,diploid,Homo sapiens,ILLUMINA,0,BOT,ACATGCCCCACTCAGCGCCACCCCCGTCCTCCCCTCCCAGGTTGCCTAGCTGTCCCCAGC[T/C]TGGGCCTCCCCGAGGGCCAGACACTCACCAGCATTATTCATCCACAGTCTCCCAGGATCA,TGATCCTGGGAGACTGTGGATGAATAATGCTGGTGAGTGTCTGGCCCTCGGGGAGGCCCA[A/G]GCTGGGGACAGCTAGGCAACCTGGGAGGGGAGGACGGGGGTGGCGCTGAGTGGGGCATGT,163,3,0,-
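
As a rough sketch of how this file could be joined to processed genotype
data, the Python below indexes manifest rows by the Name column so that an
ID_REF such as 200006 can be looked up. The function name index_manifest is
hypothetical, only six of the columns from the row above are reproduced,
and it assumes the file begins with the header line as shown (any extra
lines in the real download would need to be skipped first).

```python
import csv
import io

# Trimmed copy of the header and row quoted above; only six of the
# columns are kept to keep the example short.
MANIFEST = """Name,IlmnStrand,SNP,Chr,MapInfo,SourceStrand
200006,TOP,[A/G],9,139926402,BOT
"""


def index_manifest(handle):
    """Read a manifest-style CSV and return a dict of rows keyed by Name."""
    return {row["Name"]: row for row in csv.DictReader(handle)}


annotation = index_manifest(io.StringIO(MANIFEST))
record = annotation["200006"]
print(record["IlmnStrand"], record["SNP"], record["Chr"], record["MapInfo"])
# TOP [A/G] 9 139926402
```

With a real file, open(path, newline="") would replace the io.StringIO
wrapper.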

The "SNP" column tells you the A/B allele designation for a particular
SNP (format [A/B]) and the "IlmnStrand" column tells you whether that
SNP is on the TOP or BOT strand.  (See here for a useful article on how
to convert between different strand designations:
http://www.sciencedirect.com/science/article/pii/S0168952512000704)
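
As a rough illustration of how the SNP column maps AA/AB/BB calls to
nucleotides, here is a small Python sketch. The function name
translate_call is hypothetical, and it assumes calls are two-letter strings
made of A and B; anything else (for example a no-call code) is returned as
None. The nucleotides are on whichever strand IlmnStrand names.

```python
def translate_call(call, snp):
    """Translate an Illumina A/B genotype call into nucleotides.

    call: a genotype such as 'AA', 'AB' or 'BB'
    snp:  the manifest SNP column, e.g. '[A/G]' (allele A first, then B)
    """
    # Map the letters 'A' and 'B' onto the two alleles listed in the SNP column.
    allele = dict(zip("AB", snp.strip("[]").split("/")))
    if len(call) != 2 or any(letter not in allele for letter in call):
        return None  # not a recognised A/B call
    return "".join(allele[letter] for letter in call)


for call in ("AA", "AB", "BB"):
    print(call, translate_call(call, "[A/G]"))
```

For SNP 200006, with SNP = [A/G] on the TOP strand, a call of AB
translates to AG.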

Stephanie Gogarten
Research Scientist, Biostatistics
University of Washington


On 7/16/12 3:00 AM, bioconductor-request at r-project.org wrote:
> Message: 3
> Date: Mon, 16 Jul 2012 13:59:33 +1000
> From: "Lavinia Gordon"<lavinia.gordon at mcri.edu.au>
> To:<bioconductor at r-project.org>
> Subject: [BioC] Translating AB/BB/AA into a SNP with Illumina data 
> Message-ID:<87223629775F2049917889888F597633FD720F at murmx.mcri.edu.au>
> Content-Type: text/plain;	charset="us-ascii"
>
> Dear all,
>
> I am working with Illumina Human Omni1 Quad data.  I only have access 
> to processed data, e.g:
> ID_REF  VALUE  Score      Theta      R         B Allele Freq  Log R Ratio
> 200006  AB     0.8273118  0.4800678  2.651576  0.5337635      0.1516016
>
> I would like to know what the SNP is at this position and wondered if 
> there are any components within the Bioconductor packages that can 
> deal with this data, taking into account the TOP/BOT strand approach 
> that Illumina uses.  I have previously had great success with crlmm, 
> but that was working from the raw IDAT files.
>
> With thanks for your time,
>
> Lavinia Gordon
> Senior Research Officer
> Quantitative Sciences Core, Bioinformatics
>
> Murdoch Childrens Research Institute
> The Royal Children's Hospital
> Flemington Road Parkville Victoria 3052 Australia T 03 8341 6221 
> www.mcri.edu.au

