[BioC] kyte and doolittle hydropathy values/plot - sub + sliding window mean

Matthew Hannah Hannah at mpimp-golm.mpg.de
Thu Dec 2 16:25:36 CET 2004


Hi,

The Kyte and Doolittle algorithm calculates the hydropathy of proteins.
I've found web-based versions, but they all return a graph, not values.
I was wondering if there is an R/BioC implementation of it, or something
similar.

Alternatively, I'm trying to do something similar myself but have got
stuck, with no obvious help in the archives.

My protein sequences will be read in as FASTA strings and converted to a
character vector:
x <- "MSETNKNAFQ"
strsplit(x,"")

I have the scores for the 20 amino acids (letters in column 2 of a
table), and the scores from -4.5 to 4.5 in another column. I want to
replace the letters with the corresponding score.
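
One way to do this kind of letter-to-score replacement without sub or gsub is to index a named numeric vector by the letters. The sketch below is only an illustration: the name `kd` is made up here, and the values are the standard published Kyte-Doolittle scale typed in by hand rather than read from the `scores` table.

```r
# Kyte-Doolittle hydropathy values, one per standard amino acid.
# 'kd' is an illustrative name, not an object from the original post.
kd <- c(A =  1.8, R = -4.5, N = -3.5, D = -3.5, C =  2.5,
        Q = -3.5, E = -3.5, G = -0.4, H = -3.2, I =  4.5,
        L =  3.8, K = -3.9, M =  1.9, F =  2.8, P = -1.6,
        S = -0.8, T = -0.7, W = -0.9, Y = -1.3, V =  4.2)

x <- "MSETNKNAFQ"
aa <- strsplit(x, "")[[1]]   # [[1]] pulls the character vector out of the list
hydro <- unname(kd[aa])      # numeric vector; letters not in 'kd' become NA
```

If the letters and scores already sit in a table, something like `setNames(scores[, 3], as.character(scores[, 2]))` could probably build the same lookup, assuming column 2 holds the letters and column 3 holds numeric scores.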

I've tried using sub and gsub, but can't work out how to replace them
all at once. But doing them individually
score.assign <- function(x) {
  x <- gsub(scores[1,2], scores[1,3], x)
  x <- gsub(scores[2,2], scores[2,3], x)
  ...
  x
}

returns this
"c(\"4.2\", \"-0.4\", \"-4.5\", \"4.5\")"
which I can't work out how to convert to a usable vector.
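
A string of this shape looks like what happens when gsub is handed the list returned by strsplit rather than the character vector inside it: gsub coerces its input with as.character, and a list element holding a vector is deparsed into a single `c(...)` string. The sketch below, which assumes that is the cause, shows the coercion and the conversion of number-like strings to numeric.

```r
x <- "MSETNKNAFQ"
parts <- strsplit(x, "")            # a list holding one character vector
collapsed <- as.character(parts)    # a single deparsed string starting with "c("
aa <- parts[[1]]                    # the character vector itself, length 10

# Once the values are a plain character vector of number-like strings,
# as.numeric() converts them directly.
vals <- as.numeric(c("4.2", "-0.4", "-4.5", "4.5"))
```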

Once I have my numeric vector I want to calculate a sliding mean
(ideally with different window sizes) over AAs 1:12, 2:13, etc.
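
A sliding mean like this can be sketched in base R with embed(), which lays out every run of `w` consecutive values as one row of a matrix, so that rowMeans() gives one mean per window. The function name `sliding_mean`, the example vector `hydro` and the window sizes are all made up for illustration.

```r
# Mean of every window of w consecutive values: 1:w, 2:(w+1), ...
# 'sliding_mean' is an illustrative helper, not an existing R function.
sliding_mean <- function(v, w) {
  stopifnot(is.numeric(v), w >= 1, w <= length(v))
  rowMeans(embed(v, w))   # one row per window, so length(v) - w + 1 means
}

# Example scores (assumed values, one per residue of a short peptide).
hydro <- c(1.9, -0.8, -3.5, -0.7, -3.5, -3.9, -3.5, 1.8, 2.8, -3.5)

# Try several window sizes at once; each list element is one result.
windows <- c(3, 5, 7)
means <- lapply(windows, function(w) sliding_mean(hydro, w))
names(means) <- paste0("w", windows)
```

Each result is shorter than the input by `w - 1`, since only complete windows are averaged; padding the ends with NA would be a separate design choice.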

Finally, it would be best if I could import a large number of sequences
from FASTA format to analyse at once. I could not see any obvious way of
handling sequence data easily in BioC. Have I just missed something?
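
For reference, a FASTA file can be parsed with base R alone. The sketch below is a minimal reader under simple assumptions: header lines start with `>`, sequences may wrap over several lines, and the whole file fits in memory. The name `read_fasta` is hypothetical, not a function from R or Bioconductor.

```r
# Read a FASTA file into a named character vector, one string per record.
# 'read_fasta' is a hypothetical helper written for this illustration.
read_fasta <- function(path) {
  lines <- readLines(path)
  lines <- lines[nzchar(lines)]                 # drop empty lines
  is_header <- substr(lines, 1, 1) == ">"
  record <- cumsum(is_header)                   # record number for each line
  # Group sequence lines by record; keep records that have no sequence lines.
  groups <- split(lines[!is_header],
                  factor(record[!is_header], levels = seq_len(sum(is_header))))
  seqs <- vapply(groups, paste, character(1), collapse = "")
  names(seqs) <- sub("^>", "", lines[is_header])
  toupper(seqs)
}
```

Each element of the result can then be split into single letters with strsplit and scored one sequence at a time, for example inside lapply.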

Thanks a lot,

Matt


