[BioC] problems with pd.genomewidesnp.6

MacDonald, James jmacdon at med.umich.edu
Wed Dec 21 14:56:44 CET 2011


Hi Sebastian,

On 12/20/11 6:01 PM, Sebastian Thieme wrote:
> Hi all,
>
> I have some problems with the pd.genomewidesnp.6 package and I hope
> someone can help me. The info from
> get(objects("package:pd.genomewidesnp.6")) is
>
> #Class........: AffySNPCNVPDInfo
> #Manufacturer.: Affymetrix
> #Genome Build.: HG19
> #Chip Geometry: 2572 rows x  2680 columns
>
> I want to match the man_fsetid of each probe to one gene, so I look
> in the gene_assoc field and take the gene with the minimum distance to
> the respective probe as the corresponding gene. My commands for getting
> the raw information are:
>
> snp.f <- dbGetQuery(con6, "select * from featureSet")
> snp.f <- snp.f[, c("fsetid","man_fsetid","chrom","physical_pos","strand","cytoband","gene_assoc")]
>
> cn.f <- dbGetQuery(con6, "select * from featureSetCNV")
> cn.f <- cn.f[, c("fsetid","man_fsetid","chrom","chrom_start","strand","cytoband","gene_assoc")]
> # rbind() needs identical column names in both data frames
> names(cn.f)[names(cn.f) == "chrom_start"] <- "physical_pos"
>
> snp6.f <- rbind(snp.f, cn.f)
>
> and then process the gene_assoc field. The problem is that gene_assoc
> lists genes that are not on the same chromosome as the respective
> probes, e.g.
>
> fsetid man_fsetid chrom physical_pos strand cytoband
> 650443  CN_618877    12     93793083      -      q22
>                         gene_assoc
> ENST00000358888 // upstream // 315610 // Hs.112553 // RPL41 // 6171
> // ribosomal protein L41 /// ENST00000318066 // downstream // 8981 //
> Hs.524630 // UBE2N // 7334 // ubiquitin-conjugating enzyme E2N (UBC13
> homolog, yeast) /// NR_002212 // exon // 0 // --- // NUDT4P1 // 440672
> // nudix (nucleoside diphosphate linked moiety X)-type motif 4
> pseudogene 1 /// NM_199040 // CDS // 0 // Hs.506325 // NUDT4 // 11163
> // nudix (nucleoside diphosphate linked moiety X)-type motif 4
> /// NM_019094 // CDS // 0 // Hs.506325 // NUDT4 // 11163 // nudix
> (nucleoside diphosphate linked moiety X)-type motif 4
>
> The gene NUDT4P1 is annotated on chromosome 1, not 12, and this is
> only one case. Another example is

In what build is that true? UCSC claims that NUDT4 and NUDT4P1 are 
overlapping, on chr12 (hg19).

Anyway, the larger point here is what a SNP is and how SNPs are
localized. Essentially, a SNP is a single base that has been found to
vary with a certain frequency in a population. SNPs are localized by
their flanking sequence, which means that when a gene has a pseudogene
(which may or may not be on the same chromosome), the same flanking
sequence occurs at both loci and you cannot reliably say where the SNP
is really located.

Since DNA chips work by binding to the SNP and its flanking sequence, 
you cannot say whether you have measured the gene, the pseudogene, or 
some combination thereof.
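
As a toy illustration of the flanking-sequence point (the sequences and
locus names below are invented for this sketch and are not real genomic
data), the same flank can match at more than one locus, so the flank
alone does not identify a unique location:

```r
# Toy data only: a made-up flanking sequence and three made-up loci.
flank <- "ACGTTGCA"
loci <- c(
  gene_chr12      = "TTTACGTTGCAGGG",
  pseudogene_chr1 = "CCACGTTGCATT",
  other           = "GGGGCCCCAAAA"
)

# Exact substring search; real probe hybridization is of course fuzzier.
hits <- vapply(loci, function(s) grepl(flank, s, fixed = TRUE), logical(1))

# Names of all loci where the flank occurs
names(hits)[hits]
```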

Listing all possibilities for the SNP location is therefore not a 
'problem', it just reflects our lack of precision.
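
If it helps, here is a minimal base R sketch of splitting one gene_assoc
string into one row per candidate instead of collapsing it to a single
gene. The helper name parse_gene_assoc and the column names are my own
labels, not part of the package, and the sketch assumes each entry has
exactly seven "//"-separated fields, as in the example quoted above:

```r
# Hypothetical helper: one row per "///"-separated gene_assoc entry.
parse_gene_assoc <- function(x) {
  entries <- strsplit(x, "///", fixed = TRUE)[[1]]
  fields  <- lapply(strsplit(entries, "//", fixed = TRUE), trimws)
  # Assumption: seven fields per entry; stop early if that does not hold.
  stopifnot(all(lengths(fields) == 7))
  out <- as.data.frame(do.call(rbind, fields), stringsAsFactors = FALSE)
  names(out) <- c("accession", "relation", "distance", "unigene",
                  "symbol", "entrez_id", "description")
  out$distance <- as.integer(out$distance)
  out
}

# Four of the entries quoted above for CN_618877
gene_assoc <- paste(c(
  "ENST00000358888 // upstream // 315610 // Hs.112553 // RPL41 // 6171 // ribosomal protein L41",
  "ENST00000318066 // downstream // 8981 // Hs.524630 // UBE2N // 7334 // ubiquitin-conjugating enzyme E2N (UBC13 homolog, yeast)",
  "NR_002212 // exon // 0 // --- // NUDT4P1 // 440672 // nudix (nucleoside diphosphate linked moiety X)-type motif 4 pseudogene 1",
  "NM_199040 // CDS // 0 // Hs.506325 // NUDT4 // 11163 // nudix (nucleoside diphosphate linked moiety X)-type motif 4"
), collapse = " /// ")

cand <- parse_gene_assoc(gene_assoc)

# Keep every candidate at the minimum distance, not just the first one.
nearest <- cand[cand$distance == min(cand$distance), ]
unique(nearest$symbol)
```

Here both NUDT4P1 and NUDT4 sit at distance 0, so a "closest gene" rule
cannot choose between them; keeping all tied candidates makes that
ambiguity visible rather than hiding it.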

Best,

Jim


> fsetid    man_fsetid chrom physical_pos strand cytoband
> 186938 SNP_A-4227519    12     31784081      -   p11.21
>
>                                                       gene_assoc
> ENST00000294419 // upstream // 14576 // Hs.10862 // AK3L1 // 205 //
> adenylate kinase 3-like 1 /// ENST00000412352 // upstream // 16012 //
> Hs.585084 // C12orf72 // 254013 // chromosome 12 open reading frame 72
> /// NM_013410 // upstream // 14564 // Hs.10862 // AK3L1 // 205 //
> adenylate kinase 3-like 1 /// NM_001135864 // upstream // 16012 //
> Hs.585084 // C12orf72 // 254013 // chromosome 12 open reading frame 72
>
> AK3L1 is annotated on chromosome 9, not 12. The corresponding Ensembl
> ID (ENST00000294419) is mapped to AK4-201, which is annotated on
> chromosome 1. These are only two examples; there are a lot more. Can
> someone help?
>
>
> best regards
>
> Basti
>

-- 
James W. MacDonald, M.S.
Biostatistician
Douglas Lab
University of Michigan
Department of Human Genetics
5912 Buhl
1241 E. Catherine St.
Ann Arbor MI 48109-5618
734-615-7826
