[BioC] DNAStringSet - remove multiple entries with same name
deepti anand
anand.deepti at outlook.com
Wed Sep 3 19:30:24 CEST 2014
Hi All,
I am trying to extract promoter sequences for a few Entrez IDs. The problem is that some genes have multiple transcripts, so I get multiple promoter sequences for the same gene. Can I filter out the redundant promoter sequences?
Here is my code:
> ids.ok = c("67665", "13198", "110196", "15368")
> ## Obtain coordinates of transcript #####
> grl <- transcriptsBy(TxDb.Mmusculus.UCSC.mm10.knownGene, by="gene")[ids.ok]
> promoter.seqs <- getPromoterSeq(grl, Mmusculus, upstream=1500, downstream=0)
> promoter.seqs <- unlist(promoter.seqs)
> promoter.seqs
  A DNAStringSet instance of length 8
    width seq                                                                                                              names
[1]  1500 CTGCTGTAAAGTTACATTCCTGCCTAGAAATTTATATCGATTCTGCCGTCAGAA...GGAGGGAAGCGCCGGGCTGTGTCACGTGACGGGTGCGCCGGGCGTTGGCTCCTC 67665.67665
[2]  1500 CTGCTGTAAAGTTACATTCCTGCCTAGAAATTTATATCGATTCTGCCGTCAGAA...GGAGGGAAGCGCCGGGCTGTGTCACGTGACGGGTGCGCCGGGCGTTGGCTCCTC 67665.67665
[3]  1500 CTGCTGTAAAGTTACATTCCTGCCTAGAAATTTATATCGATTCTGCCGTCAGAA...GGAGGGAAGCGCCGGGCTGTGTCACGTGACGGGTGCGCCGGGCGTTGGCTCCTC 67665.67665
[4]  1500 CTGCTGTAAAGTTACATTCCTGCCTAGAAATTTATATCGATTCTGCCGTCAGAA...GGAGGGAAGCGCCGGGCTGTGTCACGTGACGGGTGCGCCGGGCGTTGGCTCCTC 67665.67665
[5]  1500 CAGCCCTAAAAGATGAAAGTCGCGACTTGCCCTGCCCCGCCCCAAAGGCTTCCC...CCCCCCCCCAGGAGGGGCCGGACAGCATAAAGGATACTCGCTCTCCGCTCTTGA 13198.13198
[6]  1500 CACGTCGGCCTGCCTATCAGGGAGTCTACTGCCTTTTCCCTCAGTATGAGATAA...CCGTGGCATGCCGGGAGTCGTAGTTTTATATTTATGTTCTGCCTCCTGAGCCTG 110196.110196
[7]  1500 CACGTCGGCCTGCCTATCAGGGAGTCTACTGCCTTTTCCCTCAGTATGAGATAA...CCGTGGCATGCCGGGAGTCGTAGTTTTATATTTATGTTCTGCCTCCTGAGCCTG 110196.110196
[8]  1500 GTTAGTATTTAATATTTAAAGCTTGCTTCTAACTTGGCCCAAAATGTTGGAGTT...TGGGCGGCCACCACGTGACCCGCGTACTTAAAGGGCTGGCGCGGGCAGCTGCTC 15368.15368
In the example above, there are four sequences for the same gene, '67665.67665'. How can I remove the redundant entries?
I would appreciate any help.
Dips
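
For reference, one possible way to drop the repeated entries is sketched below. It is a minimal illustration, not a confirmed answer from the list. To stay runnable without any Bioconductor packages, it uses a named character vector as a stand-in for the DNAStringSet, with each sequence truncated to the first six bases shown in the post. The variable names `by.name`, `key`, and `by.name.and.seq` are hypothetical. The same `names()` and logical `[` subsetting idiom is expected to carry over to a real DNAStringSet, but that is an assumption to verify on the actual object.

```r
# Stand-in for the DNAStringSet from the post: a named character vector.
# Sequences are truncated to six bases purely for illustration.
promoter.seqs <- c(
  "67665.67665"   = "CTGCTG",
  "67665.67665"   = "CTGCTG",
  "67665.67665"   = "CTGCTG",
  "67665.67665"   = "CTGCTG",
  "13198.13198"   = "CAGCCC",
  "110196.110196" = "CACGTC",
  "110196.110196" = "CACGTC",
  "15368.15368"   = "GTTAGT"
)

# Option 1: keep only the first entry for each name.
# duplicated() flags every repeat after the first occurrence.
by.name <- promoter.seqs[!duplicated(names(promoter.seqs))]

# Option 2: drop an entry only when both its name and its sequence repeat,
# so distinct promoters of the same gene would be kept.
key <- paste(names(promoter.seqs), as.character(promoter.seqs), sep = "|")
by.name.and.seq <- promoter.seqs[!duplicated(key)]

print(names(by.name))
print(length(by.name.and.seq))
```

Option 1 keeps one promoter per gene regardless of sequence, which would silently discard alternative promoters if two transcripts of a gene start at different positions. Option 2 is the more conservative choice in that case, since it removes only exact name-and-sequence repeats.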