[BioC] matchPDict with mismatches allowed appears to drop names

Ian Henry henry at mpi-cbg.de
Tue Aug 2 11:24:57 CEST 2011


I have a question regarding the inheritance of the names attribute  
when using matchPDict.

If I use matchPDict as follows:

#Get transcript information
 > hg19txdb <- makeTranscriptDbFromUCSC(genome = "hg19", tablename =  
 > hg19_tx <- extractTranscriptsFromGenome(Hsapiens, hg19txdb)

#Create DNAStringSet with names associated with each probe
 > probeset <- DNAStringSet(probelist$sequence)
 > names(probeset)<-probelist$probenames

#Create PDict object and match against human transcript 14
#(I know it should match)
 > ps_pdict<-PDict(probeset)
 > txmatches <- matchPDict(ps_pdict, hg19_tx[[14]])

This compares the probes in ps_pdict to transcript 14 in hg19_tx and gives:

     start end width           names
[1]   749 773    25  HW:6
[2]   569 593    25 HW:16
[3]   804 828    25 HW:26
[4]   757 781    25 HW:36

which works :)

However, if I search allowing for mismatches then the names appear to  
be lost:

 > ps_pdict1<-PDict(probeset, max.mismatch=1)
 > txmatches1 <- matchPDict(ps_pdict1, hg19_tx[[14]], max.mismatch=1,  
 > unlist(txmatches1)

IRanges of length 4
     start end width
[1]   749 773    25
[2]   569 593    25
[3]   804 828    25
[4]   757 781    25

matchPDict returns an MIndex object: txmatches holds the exact matches,  
and txmatches1 the matches with up to 1 mismatch.
 > names(txmatches)                #gives character vector containing probe names
 > names(txmatches1)              #returns NULL

So it appears the names are not inherited.  I tried to add them  
manually to my MIndex object, but I get the error:
attempt to modify the names of a ByPos_MIndex instance

Therefore I'm not sure how to keep my probe names associated with the  
transcript matches, which is important for inexact matching.

Any help would be greatly appreciated,




R version 2.13.0 beta (2011-03-31 r55221)
Platform: i386-apple-darwin9.8.0/i386 (32-bit)

locale:
[1] C/UTF-8/C/C/C/C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] plyr_1.5.2                          
[3] BSgenome_1.19.5                    Biostrings_2.19.17
[5] GenomicFeatures_1.3.15             GenomicRanges_1.3.31
[7] IRanges_1.9.28

loaded via a namespace (and not attached):
[1] Biobase_2.11.10     DBI_0.2-5           RCurl_1.5-0
[4] RSQLite_0.9-4       XML_3.2-0           biomaRt_2.7.1
[7] rtracklayer_1.11.12 tools_2.13.0
