[R] Need help with cluster analysis of amino acid sequences

Peter Hornbeck hornbeck1 at adelphia.net
Sun Sep 7 19:28:02 CEST 2003

I am just starting to use R and would like to use its clustering functions
to analyze amino acid sequences for similarities.  The input will be lists
of short sequences, each 8-15 amino acids long.

Let me give you a feel for the sort of data I am interested in.

Amino acids can be classified by a number of different parameters: e.g.,
charge and hydrophobicity.  Each of these properties could be given a
numerical value: charge (perhaps as either 0 or 1), and hydrophobicity
(perhaps as a continuum from 0 to 1).  The point of the analysis is to
cluster those sequences that have similar properties at different positions
along the sequence.
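
As a rough illustration of the kind of analysis described above, a minimal
sketch in base R might look like the following.  Everything specific in it
is an assumption made for illustration: the property values are placeholders
rather than a published scale, the sequences are made up, and hierarchical
clustering with hclust() on Euclidean distances is only one of several
possible choices.

```r
# Hypothetical property table: these numbers are illustrative placeholders,
# not values taken from any published charge or hydrophobicity scale.
props <- data.frame(
  aa             = c("A", "D", "E", "K", "L", "R", "S", "V"),
  charge         = c(0,   1,   1,   1,   0,   1,   0,   0),
  hydrophobicity = c(0.6, 0.1, 0.1, 0.2, 0.9, 0.0, 0.4, 0.8),
  stringsAsFactors = FALSE
)

# Made-up example sequences, all the same length so that positions line up.
seqs <- c("ALVLSAVL", "ALVLSAVV", "DKERDKRE", "DKERDKRK")

# Encode one sequence as a numeric vector: the charge at every position,
# followed by the hydrophobicity at every position.
encode <- function(s) {
  residues <- strsplit(s, "")[[1]]
  idx <- match(residues, props$aa)
  if (any(is.na(idx))) stop("unknown residue in sequence: ", s)
  c(props$charge[idx], props$hydrophobicity[idx])
}

# One row per sequence, one column per (property, position) pair.
features <- t(sapply(seqs, encode))

# Euclidean distances between sequences, then average-linkage clustering.
d  <- dist(features)
hc <- hclust(d, method = "average")

# Cut the tree into two groups and show which sequence landed where.
groups <- cutree(hc, k = 2)
print(groups)
```

Calling plot(hc) draws the dendrogram.  This sketch assumes equal-length
sequences; sequences of different lengths (8-15 residues) would first need
to be aligned or padded so that positions correspond, which is not handled
here.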

My question is: is there a user group of biologists who may be able to
provide tips about how to proceed, or who have already developed
algorithms that could be applied or adapted to the sort of analysis I need?

Or does anyone have suggestions of other on-line resources that might be
helpful?

Thanks, Peter

Peter Hornbeck
Magnolia, MA
(978) 5264867

