[BioC] Translating AB/BB/AA into a SNP with Illumina data

Stephanie M. Gogarten sdmorris at u.washington.edu
Tue Jul 17 18:00:03 CEST 2012


Hi Lavinia,

Sorry my response was not clearer. I wasn't sure whether you wanted
information on that particular SNP only or on how to work with that data
format in general.  The functions in GWASTools operate on genotype data
in A/B format and on BAlleleFreq/LogRRatio data, but they don't help you
set up the SNP annotation, which is why I directed you to the Illumina
file.  To query dbSNP from within R, you might try biomaRt.  You can also
look at rtracklayer, which interacts with the UCSC genome browser.
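
To illustrate the lookup itself, here is a minimal sketch (in Python, not part of GWASTools) of turning an A/B call into nucleotides using the manifest's "SNP" column. The function names and the `complement` flag are hypothetical. The sketch assumes a single-nucleotide SNP written as [A/B], and it leaves the decision of whether to complement to you: for 200006, the check against the genome browser quoted below suggests the forward-strand alleles are the complement of the TOP alleles.

```python
# Hypothetical helpers for illustration only; they are not GWASTools functions.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def parse_snp(snp):
    """Split a manifest SNP string such as '[A/G]' into (allele_a, allele_b)."""
    allele_a, allele_b = snp.strip("[]").split("/")
    return allele_a, allele_b


def translate_call(call, snp, complement=False):
    """Translate an A/B genotype call (e.g. 'AB') into nucleotides.

    complement=True reports the alleles on the opposite strand. Whether that
    is needed for a given SNP is an assumption the caller must supply.
    """
    allele_a, allele_b = parse_snp(snp)
    if complement:
        allele_a, allele_b = COMPLEMENT[allele_a], COMPLEMENT[allele_b]
    lookup = {"A": allele_a, "B": allele_b}
    if any(symbol not in lookup for symbol in call):
        raise ValueError(f"unexpected genotype call: {call!r}")
    return "".join(lookup[symbol] for symbol in call)


# SNP 200006 is listed as [A/G] on the TOP strand in the manifest.
print(translate_call("AB", "[A/G]"))                   # AG
print(translate_call("AB", "[A/G]", complement=True))  # TC
```

Calls other than combinations of A and B (for example no-calls) raise an error here rather than being guessed at.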

Stephanie

On 7/16/12 5:09 PM, Lavinia Gordon wrote:
> Hi Stephanie
>
> Thank you so much for the useful link and article.
> With the annotation data I have something like this:
>      Name VALUE IlmnStrand   SNP GenomeBuild Chr   MapInfo
> 1 200006    AB        TOP [A/G]          36   9 139046223
>
> So I understand that AB/TOP/[A/G] means the Illumina alleles are the
> reverse complement of the source sequence, so the reference allele is T
> and the SNP is actually C/T (confirmed by checking the location on the
> UCSC Genome Browser, where it is annotated as dbSNP rs7469569).
>
> But as far as I can see in the GWASTools package, there are no tools to
> do this for me, i.e. using the VALUE/IlmnStrand/SNP info to determine
> the SNP, and ideally the GenomeBuild + chrom data to confirm the dbSNP
> info?
>
> With thanks for your time,
>
> Lavinia Gordon
> Senior Research Officer
> Quantitative Sciences Core, Bioinformatics
>
> Murdoch Childrens Research Institute
> The Royal Children's Hospital
> Flemington Road Parkville Victoria 3052 Australia
> T 03 8341 6221
> www.mcri.edu.au
>
>
> -----Original Message-----
> From: Stephanie M. Gogarten [mailto:sdmorris at u.washington.edu]
> Sent: Tuesday, 17 July 2012 2:19 AM
> To: Lavinia Gordon
> Cc: bioconductor at r-project.org
> Subject: Re: Translating AB/BB/AA into a SNP with Illumina data
>
> Hi Lavinia,
>
> The GWASTools package was designed to work with this type of data.
>
> You can download annotation for Illumina arrays from their website:
> https://icom.illumina.com/.  They now require that you register with
> their site to download files.  Once you have logged in, click
> "Downloads" in the menu on the left and then "Genotyping/LOH/CNV" in the
> menu on the right, and look for the Human Omni1 Quad link.  The file
> that you want is called HumanOmni1-Quad_v1-0_H_csv.zip, and looks like
> this:
>
> IlmnID,Name,IlmnStrand,SNP,AddressA_ID,AlleleA_ProbeSeq,AddressB_ID,AlleleB_ProbeSeq,GenomeBuild,Chr,MapInfo,Ploidy,Species,Source,SourceVersion,SourceStrand,SourceSeq,TopGenomicSeq,BeadSetID,Exp_Clusters,Intensity_Only,RefStrand
> 200006-0_T_R_1853021091,200006,TOP,[A/G],0060702346,AGACTGTGGATGAATAATGCTGGTGAGTGTCTGGCCCTCGGGGAGGCCCA,,,37.1,9,139926402,diploid,Homo sapiens,ILLUMINA,0,BOT,ACATGCCCCACTCAGCGCCACCCCCGTCCTCCCCTCCCAGGTTGCCTAGCTGTCCCCAGC[T/C]TGGGCCTCCCCGAGGGCCAGACACTCACCAGCATTATTCATCCACAGTCTCCCAGGATCA,TGATCCTGGGAGACTGTGGATGAATAATGCTGGTGAGTGTCTGGCCCTCGGGGAGGCCCA[A/G]GCTGGGGACAGCTAGGCAACCTGGGAGGGGAGGACGGGGGTGGCGCTGAGTGGGGCATGT,163,3,0,-
>
> The "SNP" column tells you the A/B allele designation for a particular
> SNP (format [A/B]) and the "IlmnStrand" column tells you whether that
> SNP is on the TOP or BOT strand.  (See here for a useful article on how
> to convert between different strand designations:
> http://www.sciencedirect.com/science/article/pii/S0168952512000704)
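
As a rough illustration of the strand designations in the row above, the sketch below (Python, hypothetical function names) does two things. It labels a SNP as TOP or BOT under the assumption, taken from Illumina's published convention, that unambiguous SNPs containing A (A/C, A/G) are TOP and those containing T (T/C, T/G) are BOT; A/T and C/G SNPs need the flanking-sequence walk and are deliberately not handled. It also reverse-complements a bracketed sequence, which for this row is how SourceSeq (BOT, [T/C]) relates to TopGenomicSeq (TOP, [A/G]).

```python
import re

# Translation table for complementing plain nucleotide strings.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse-complement a plain nucleotide string."""
    return seq.translate(COMPLEMENT)[::-1]


def flip_bracketed(seq):
    """Reverse-complement 'LEFT[X/Y]RIGHT', keeping the allele order X, Y."""
    match = re.fullmatch(r"([ACGT]*)\[([ACGT])/([ACGT])\]([ACGT]*)", seq)
    if match is None:
        raise ValueError(f"not a single-nucleotide bracketed sequence: {seq!r}")
    left, x, y, right = match.groups()
    return (
        f"{reverse_complement(right)}"
        f"[{x.translate(COMPLEMENT)}/{y.translate(COMPLEMENT)}]"
        f"{reverse_complement(left)}"
    )


def strand_label(snp):
    """Label an unambiguous SNP as 'TOP' or 'BOT'; return None otherwise.

    Assumption: A/C and A/G are TOP, T/C and T/G are BOT. A/T and C/G
    require sequence walking, which this sketch does not implement.
    """
    alleles = set(snp.strip("[]").split("/"))
    if alleles in ({"A", "C"}, {"A", "G"}):
        return "TOP"
    if alleles in ({"T", "C"}, {"T", "G"}):
        return "BOT"
    return None


# The three bases on either side of the SNP in SourceSeq for 200006.
print(flip_bracketed("AGC[T/C]TGG"))  # CCA[A/G]GCT
print(strand_label("[T/C]"), strand_label("[A/G]"))  # BOT TOP
```

The short flanks in the example are the bases immediately around the SNP in the row above; the same function can be applied to the full SourceSeq string.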
>
> Stephanie Gogarten
> Research Scientist, Biostatistics
> University of Washington
>
>
> On 7/16/12 3:00 AM, bioconductor-request at r-project.org wrote:
>> Message: 3
>> Date: Mon, 16 Jul 2012 13:59:33 +1000
>> From: "Lavinia Gordon"<lavinia.gordon at mcri.edu.au>
>> To:<bioconductor at r-project.org>
>> Subject: [BioC] Translating AB/BB/AA into a SNP with Illumina data
>> Message-ID:<87223629775F2049917889888F597633FD720F at murmx.mcri.edu.au>
>> Content-Type: text/plain;	charset="us-ascii"
>>
>> Dear all,
>>
>> I am working with Illumina Human Omni1 Quad data.  I only have access
>> to processed data, e.g:
>> ID_REF  VALUE  Score      Theta      R         B Allele Freq  Log R Ratio
>> 200006  AB     0.8273118  0.4800678  2.651576  0.5337635      0.1516016
>>
>> I would like to know what the SNP is at this position and wondered if
>> there are any components within the Bioconductor packages that can
>> deal with this data, taking into account the TOP/BOT strand approach
>> that Illumina uses.  I have previously had great success with crlmm,
>> but that was working from the raw IDAT files.
>>
>> With thanks for your time,
>>
>> Lavinia Gordon
>> Senior Research Officer
>> Quantitative Sciences Core, Bioinformatics
>>
>> Murdoch Childrens Research Institute
>> The Royal Children's Hospital
>> Flemington Road Parkville Victoria 3052 Australia T 03 8341 6221
>> www.mcri.edu.au


