[BioC] problems with pd.genomewidesnp.6

Sebastian Thieme thieme at mi.fu-berlin.de
Wed Dec 21 00:01:04 CET 2011


Hi all,

I have some problems with the pd.genomewidesnp.6 package and I hope
someone can help me. The package info returned by
get(objects("package:pd.genomewidesnp.6")) is:

#Class........: AffySNPCNVPDInfo
#Manufacturer.: Affymetrix
#Genome Build.: HG19
#Chip Geometry: 2572 rows x  2680 columns

I want to match the man_fsetid of each probe to one gene. To do so, I
look at the gene_assoc field and take the gene with the minimum
distance to the respective probe as the corresponding gene. My
commands for getting the raw information are:

# con6: DBI connection to the package's SQLite database (its creation is not shown here)
snp.f <- dbGetQuery(con6, "select * from featureSet")
snp.f <- snp.f[,c("fsetid","man_fsetid","chrom","physical_pos","strand","cytoband","gene_assoc")]

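cn.f <- dbGetQuery(con6, "select * from featureSetCNV")
cn.f <- cn.f[,c("fsetid","man_fsetid","chrom","chrom_start","strand","cytoband","gene_assoc")]
# rename chrom_start to match snp.f; rbind() on data frames needs identical column names
names(cn.f)[names(cn.f) == "chrom_start"] <- "physical_pos"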

snp6.f <- rbind(snp.f,cn.f)

and then process the gene_assoc field. The problem is that gene_assoc
lists genes which are not on the same chromosome as the respective
probes, e.g.:

fsetid man_fsetid chrom physical_pos strand cytoband
650443  CN_618877    12     93793083      -      q22
                       gene_assoc
ENST00000358888 // upstream // 315610 // Hs.112553 // RPL41 // 6171
//ribosomal protein L41 /// ENST00000318066 // downstream // 8981 //
Hs.524630 // UBE2N // 7334 // ubiquitin-conjugating enzyme E2N (UBC13
homolog, yeast) /// NR_002212 // exon // 0 // --- // NUDT4P1 // 440672
// nudix (nucleoside diphosphate linked moiety X)-type motif 4
pseudogene 1 /// NM_199040 // CDS // 0 // Hs.506325 // NUDT4 // 11163
// nudix (nucleoside diphosphate linked moiety X)-type motif 4
///NM_019094 // CDS // 0 // Hs.506325 // NUDT4 // 11163 // nudix
(nucleoside diphosphate linked moiety X)-type motif 4

The gene NUDT4P1 is annotated on chromosome 1, not 12, and this is
only one case. Another example is:

fsetid    man_fsetid chrom physical_pos strand cytoband
186938 SNP_A-4227519    12     31784081      -   p11.21

                                                     gene_assoc
ENST00000294419 // upstream // 14576 // Hs.10862 // AK3L1 // 205 //
adenylate kinase 3-like 1 /// ENST00000412352 // upstream // 16012 //
Hs.585084 // C12orf72 // 254013 // chromosome 12 open reading frame 72
/// NM_013410 // upstream // 14564 // Hs.10862 // AK3L1 // 205 //
adenylate kinase 3-like 1 /// NM_001135864 // upstream // 16012 //
Hs.585084 // C12orf72 // 254013 // chromosome 12 open reading frame 72

AK3L1 is annotated on chromosome 9, not 12. The corresponding Ensembl
ID (ENST00000294419) is mapped to AK4-201, which is annotated on
chromosome 1. These are only two examples; there are a lot more. Can
someone help?
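
For reference, the gene_assoc processing described above can be
sketched roughly as follows. This is only a minimal sketch, not the
exact code used: nearest_gene is a hypothetical helper (it is not part
of pd.genomewidesnp.6), and it assumes that records are separated by
"///" and the fields within a record by "//", with the distance in the
third field and the gene symbol in the fifth, as in the examples shown.

```r
# Hypothetical helper: return the symbol(s) of the closest gene(s)
# listed in a single gene_assoc string.
nearest_gene <- function(gene_assoc) {
  # Split into records ("///"), then each record into trimmed fields ("//").
  records <- strsplit(gene_assoc, "///", fixed = TRUE)[[1]]
  fields <- lapply(strsplit(records, "//", fixed = TRUE), trimws)
  # Drop empty or incomplete records (e.g. a bare "---" or NA).
  fields <- fields[lengths(fields) >= 5]
  if (length(fields) == 0) return(NA_character_)
  # Assumed layout: field 3 = distance in bp, field 5 = gene symbol.
  dist <- suppressWarnings(as.numeric(vapply(fields, `[`, character(1), 3)))
  symbol <- vapply(fields, `[`, character(1), 5)
  if (all(is.na(dist))) return(NA_character_)
  # Several transcripts can tie at the minimum distance, so return the
  # unique symbols found at that distance.
  unique(symbol[which(dist == min(dist, na.rm = TRUE))])
}

# Shortened version of the SNP_A-4227519 entry from above.
x <- paste("NM_013410 // upstream // 14564 // Hs.10862 // AK3L1 // 205 //",
           "adenylate kinase 3-like 1 /// NM_001135864 // upstream // 16012 //",
           "Hs.585084 // C12orf72 // 254013 // chromosome 12 open reading frame 72")
nearest_gene(x)
```

Note that ties are returned as a vector rather than resolved, so a
probe lying inside two overlapping genes (distance 0 for both, as with
NUDT4P1 and NUDT4 in the first example) yields both symbols.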


best regards

Basti


