[BioC] function for translation of ORFs

Robert Gentleman rgentlem at fhcrc.org
Tue Nov 11 22:06:46 CET 2008

I don't think that there is a specific one, but if your ORF is a character
string called y, say, then using a few bits from the Biostrings package, but
mainly pure R, you can do

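 library(Biostrings)  # provides the RNA_GENETIC_CODE lookup vector
 a1 <- toupper(y)     # codon names in RNA_GENETIC_CODE are upper case
 # split into codons: starts 1, 4, 7, ... and ends 3, 6, 9, ...
 a2 <- substring(a1, seq(1, nchar(a1) - 2, by=3), seq(3, nchar(a1), by=3))
 # look up each codon and paste the amino acids into one string
 aa <- paste(RNA_GENETIC_CODE[a2], collapse="")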

If your sequence is DNA rather than RNA, you can use dna2rna to "transcribe" it
first. There is also a transcribe function, but be careful: you need to know the
orientation of the original sequence. Usually it is reported as if already
transcribed (so reverse complemented); if it is not, there are functions in
Biostrings to do that.
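
To make the orientation point concrete, here is a rough base R sketch. The
names dna_to_rna and reverse_complement are made up for illustration; they are
not the Biostrings functions, and the sketch assumes plain A/C/G/T input with
no ambiguity codes.

```r
# Hypothetical helpers in base R only (NOT the Biostrings API).
# Assumes plain A/C/G/T input without ambiguity codes.

# Replace T with U, i.e. write a DNA coding strand as RNA.
dna_to_rna <- function(dna) chartr("T", "U", toupper(dna))

# Complement each base, then reverse the order of the bases.
reverse_complement <- function(dna) {
  comp <- chartr("ACGT", "TGCA", toupper(dna))
  vapply(strsplit(comp, ""),
         function(ch) paste(rev(ch), collapse = ""),
         character(1))
}

# If the sequence was reported on the opposite strand, reverse
# complement it first and only then convert it to RNA.
template <- "TTAGGCCAT"
rna <- dna_to_rna(reverse_complement(template))
```

Whether the reverse complement step is needed at all depends on how your
sequence was reported, so check that before translating.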

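Note that the codon lookup is vectorized, so it should be reasonably fast. If
you have lots of sequences, put them all in one character vector and apply the
steps above to each element (with sapply, say), since paste(..., collapse="")
would otherwise merge everything into a single string.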
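
For several sequences at once, something along these lines should work. It is
only a sketch: translate_orf is a made-up name, and the codon table is rebuilt
in base R as a stand-in for RNA_GENETIC_CODE so that the example runs without
Biostrings (standard genetic code assumed, trailing incomplete codons dropped,
unrecognised codons shown as "X").

```r
# Base R stand-in for Biostrings' RNA_GENETIC_CODE (standard genetic code).
# expand.grid varies its first column fastest, so the third base cycles
# fastest, matching the usual UCAG ordering of the amino acid string.
bases <- c("U", "C", "A", "G")
g <- expand.grid(third = bases, second = bases, first = bases,
                 stringsAsFactors = FALSE)
aa_letters <- strsplit(
  "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG", "")[[1]]
genetic_code <- setNames(aa_letters, paste0(g$first, g$second, g$third))

# translate_orf is a hypothetical helper, not a Biostrings function.
translate_orf <- function(y) {
  y <- chartr("T", "U", toupper(y))        # accept DNA or RNA letters
  starts <- 3 * seq_len(nchar(y) %/% 3) - 2  # 1, 4, 7, ...; empty if < 3 bases
  codons <- substring(y, starts, starts + 2)
  aa <- unname(genetic_code[codons])
  aa[is.na(aa)] <- "X"                     # codon not in the table
  paste(aa, collapse = "")
}

# One protein string per input sequence.
orfs <- c("AUGGCCUAA", "atgtttggg", "AUGNNNUGA")
proteins <- vapply(orfs, translate_orf, character(1), USE.NAMES = FALSE)
```

Each element of proteins corresponds to one element of orfs, with "*" marking
a stop codon.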

 best wishes

Ana Conesa wrote:
> Dear list,
> Can someone indicate a R function for translating an open reading frame
> into a protein sequence?
> Thanks
> Ana

Robert Gentleman, PhD
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M2-B876
PO Box 19024
Seattle, Washington 98109-1024
rgentlem at fhcrc.org
