[BioC] SNP6 data to VCF

Benilton Carvalho beniltoncarvalho at gmail.com
Mon Feb 13 21:30:23 CET 2012


Allow me to suggest using, at least for now, the crlmm package to
call the genotypes on SNP 6.0 (and on SNP 5.0, in case you also have
data on that platform). Its implementation has significant
improvements over our initial crlmm implementation (present in oligo).
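
A minimal sketch of what I mean, not a tested pipeline. It assumes
crlmm, oligoClasses and the genomewidesnp6Crlmm annotation package are
installed; call_snp6 and cel_dir are names made up for illustration.

```r
# Sketch only: wraps the genotype-calling step in a function so nothing
# runs until you point it at a directory of SNP 6.0 CEL files.
# `call_snp6` and `cel_dir` are hypothetical names.
call_snp6 <- function(cel_dir) {
  if (!requireNamespace("crlmm", quietly = TRUE))
    stop("the crlmm package is required")

  # list.celfiles() comes from oligoClasses (a dependency of crlmm)
  cel_files <- oligoClasses::list.celfiles(cel_dir, full.names = TRUE)
  stopifnot(length(cel_files) > 0)

  # Genotype calling; the result holds SNP x sample matrices
  res <- crlmm::crlmm(cel_files)

  list(
    calls = oligoClasses::calls(res),  # coded 1 = AA, 2 = AB, 3 = BB
    confs = oligoClasses::confs(res)   # confidence of each call
  )
}
```

You would then use something like res <- call_snp6("path/to/celfiles")
and look at res$calls[1:5, ].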

benilton

On 13 February 2012 19:36, Sean Davis <sdavis2 at mail.nih.gov> wrote:
> On Mon, Feb 13, 2012 at 2:28 PM, Vincent Carey
> <stvjc at channing.harvard.edu> wrote:
>>
>>
>> On Mon, Feb 13, 2012 at 2:13 PM, Sean Davis <sdavis2 at mail.nih.gov> wrote:
>>>
>>> Hi, all.
>>>
>>> I'm a little rusty on my oligo array software tools.  I'm interested
>>> in taking Affymetrix SNP6 data to VCF format.  To do that, I am going
>>> to need to:
>>>
>>> 1.  Call SNPs
>>> 2.  Determine strand and reference allele for each SNP on the array
>>> 3.  Assign the correct alleles to each SNP for each sample
>>
>>
>> For 2 and 3, the pd.genomewidesnp.6 package has the metadata:
>>
>> > con = pd.genomewidesnp.6@getdb()
>> > dbListTables(con)
>>  [1] "featureSet"        "featureSetCNV"     "fragmentLength"
>>  [4] "fragmentLengthCNV" "pmfeature"         "pmfeatureCNV"
>>  [7] "sequence"          "sequenceCNV"       "sqlite_stat1"
>> [10] "table_info"
>>
>> > ss = dbGetQuery(con, "select * from featureSet limit 5")
>> > ss
>>   fsetid    man_fsetid affy_snp_id dbsnp_rs_id chrom physical_pos strand
>> 1      1 SNP_A-2131660          NA   rs2887286     1      1156131      0
>> 2      2 SNP_A-1967418          NA   rs1496555     1      2234251      0
>> 3      3 SNP_A-1969580          NA  rs41477744     1      2329564      0
>> 4      4 SNP_A-4263484          NA   rs3890745     1      2553624      0
>> 5      5 SNP_A-1978185          NA  rs10492936     1      2936870      1
>>   cytoband allele_a allele_b
>> 1   p36.33        C        T
>> 2   p36.33        A        G
>> 3   p36.32        A        G
>> 4   p36.32        C        T
>> 5   p36.32        C        T
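
Once you have calls, the allele_a and allele_b columns above can be
used to translate them into allele pairs. A rough sketch, assuming the
crlmm coding of 1 = AA, 2 = AB, 3 = BB. The decode_calls function and
the example calls matrix are made up for illustration. It does not
handle strand flipping or choosing the reference allele, which you
would still need to do before writing VCF.

```r
# Annotation for the five SNPs shown above (copied from the featureSet query).
fs <- data.frame(
  man_fsetid = c("SNP_A-2131660", "SNP_A-1967418", "SNP_A-1969580",
                 "SNP_A-4263484", "SNP_A-1978185"),
  allele_a = c("C", "A", "A", "C", "C"),
  allele_b = c("T", "G", "G", "T", "T"),
  stringsAsFactors = FALSE
)

# Hypothetical calls matrix (SNPs x samples), coded 1 = AA, 2 = AB, 3 = BB.
calls <- matrix(c(1L, 2L, 3L, 2L, 1L,
                  3L, 3L, 1L, 2L, 2L),
                ncol = 2,
                dimnames = list(fs$man_fsetid, c("sample1", "sample2")))

# Translate integer calls into two-letter genotypes using allele_a/allele_b.
# Hypothetical helper, not part of crlmm or oligo.
decode_calls <- function(calls, fs) {
  # Match SNPs by identifier rather than relying on row order
  idx <- match(rownames(calls), fs$man_fsetid)
  stopifnot(!anyNA(idx))
  a <- fs$allele_a[idx]
  b <- fs$allele_b[idx]

  # One row per SNP; columns correspond to calls 1 (AA), 2 (AB), 3 (BB)
  lookup <- cbind(paste0(a, a), paste0(a, b), paste0(b, b))

  # Pick, for every cell, the lookup entry at (SNP row, call code)
  out <- lookup[cbind(as.vector(row(calls)), as.vector(calls))]
  matrix(out, nrow = nrow(calls), dimnames = dimnames(calls))
}

genotypes <- decode_calls(calls, fs)
```

With real data, calls would come from crlmm and fs from the full
featureSet table rather than the first five rows.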
>
> Told you I was rusty.  Thanks, Vince.
>
> Sean
>
>
>>>
>>> 4.  Write out the VCF file with the correct genotypes (on the positive
>>> strand, reference allele correctly specified)
>>>
>>> What is the best way to do steps 1-3?  I'll deal with step 4 since I
>>> don't think that has been implemented directly.
>>>
>>> Thanks,
>>> Sean
>>>
>>> _______________________________________________
>>> Bioconductor mailing list
>>> Bioconductor at r-project.org
>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>> Search the archives:
>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>
>>


