[BioC] extract introns

Yating Cheng yating.cheng at charite.de
Tue Nov 15 23:57:31 CET 2011

Hi Steve,

I am working with Apis mellifera.

Actually I am also trying to use the alignment function to get the intron
sequence.which is much easier for me to understand.  But there are some
problem with that also.


# set working directory

# load Biostrings library

# load data

# create object for storage of extracted introns

# loop over all genes
for(i in 1:length(genebody[,1])){

	tmp.id<-as.vector(genebody[i,1])											# get gene id
	tmp.subject<-as.vector(genebody[i,2])										# get gene sequence
	tmp.exons<-exons[which(exons[,1]==tmp.id),]									# get exons of the
selected genes

	tmp.pattern<-as.vector(tmp.exons[,3])										# define exons as patterns
for alignment
subject=tmp.subject,type="local")			# align all exons pairwise to gene
	tmp.start<-start(subject(tmp.align))										# vector of all alignment
	tmp.end<-end(subject(tmp.align))											# vector of all alignment ends

	for(ex in 1:(length(tmp.end)-1)){											# extract introns





# define useful names for columns

# write output
write.table(introns,"Introns.txt", sep="\t", quote=FALSE, row.names=FALSE)

The problem is the unvalid arguments in the substr....

Best Wishes


> HI Yating,
> On Tue, Nov 15, 2011 at 4:48 PM, Yating Cheng <yating.cheng at charite.de>
> wrote:
>> Hi, steve
>> Thanks for your answer. About converting the names of chromosome, it is
>> kind of difficult.
>> I need the intron sequences for every gene, but the chromosome names of
>> some genes I got from biomart are the same.
> Sorry, I'm confused.
> What organism are you working with?
> Can you construct a TranscriptDb for it using the GenomicFeatures
> package (you'll have to read through the GenomicFeatures vignette).
>> I am not sure about how can I get specific genes from BS genome, and
>> match
>> them to gene-IDs.
> BSgenome packages do not have gene specific information, they only
> contain sequence information for different organisms.
> The fact that you can use a BSgenome.* package for your organism of
> interest suggests that you should be able to build a TranscriptDb for
> it, so that's good news ...
> -steve
> --
> Steve Lianoglou
> Graduate Student: Computational Systems Biology
>  | Memorial Sloan-Kettering Cancer Center
>  | Weill Medical College of Cornell University
> Contact Info: http://cbio.mskcc.org/~lianos/contact

More information about the Bioconductor mailing list