[BioC] Aligning short reads to a single gene

Wei Shi shi at wehi.EDU.AU
Fri Feb 8 11:09:41 CET 2013


An alternative way will be to use an aligner to align your reads to your reporter gene. You'll need to build an index first using the sequence of your reporter gene and then perform the alignment. But the alignment should be very fast because your 'reference sequence' is very short. The alignment will look after sequencing errors in your reads as well.

Wei

On Feb 8, 2013, at 7:30 PM, Ugo Borello wrote:

> Dear all,
> a sanity check of my RNA-Seq experiment is driving me crazy.
> 
> I want to verify that the GFP reporter gene is expressed in my control
> samples. This reporter gene, of course, is not present in the mouse
> reference genome but only in my transgenic mice.
> Therefore, I thought to align the short reads of my RNA-Seq fastq files to
> the GFP gene sequence to verify its mRNA presence in my samples.
> 
> 
> I came out with this naïve solution:
> 
> library(ShortRead)
> 
> ## read fastq file
> reads<- readFastq('filename.fastq')
> 
> ## get the sequences
> seq<- sread(reads)
> 
> ## keep only the sequences without 'N'
> alf <- alphabetFrequency( sread(reads))
> clean<- seq[alf[,"N"] == 0]
> 
> ## make a dictionary of those sequences
> pdict0<- PDict(clean)
> 
> ## match the dictionary of my sequences to the GFP gene (~800 bp)
> myseq<-DNAString("ATG...TAA")   ## here the GFP gene sequence
> mysearch <- matchPDict(pdict0, myseq, max.mismatch=0)   ## exact match
> count_index <- sum(countIndex(mysearch))
> count_index   ## the total number of alignments
> 
> 
> 
> Any suggestion for a nicer and more efficient solution?
> 
> Thank you in advance for your help
> 
> Ugo
> 
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor

______________________________________________________________________
The information in this email is confidential and intend...{{dropped:6}}



More information about the Bioconductor mailing list