[R] More than on loop??

jim holtman jholtman at gmail.com
Sat Jan 30 22:36:45 CET 2010


One quick comment after looking at the graphs you provided: why aren't
all 8 columns the same height, given that each column should have the
same number of amino acids in it?  For the cleaved case it is 114,
and even after normalizing, the column sums should all be the same: 100.
Are the graphs really correct?

On Sat, Jan 30, 2010 at 3:38 PM, che <fadialnaji at live.com> wrote:
>
> Here are the written instructions as I managed to get them from my professor;
> the graphs and data are attached:
>
> The graph below shows an example of the expected outcome of this course
> work. You may produce a better one. The graph for analysing the motifs of a
> set of peptides is designed this way
>
> • the graph is composed of columns of coloured rectangles
>
> • each column corresponds to a residue from “N4” to “C4”. The eight residues
> are denoted by “N4”, “N3”, “N2”, “N1”, “C1”, “C2”, “C3”, “C4”. “N4” means the
> 4th flanking residue of a cleavage site on the N-terminal side and “C3” means
> the 3rd flanking residue of a cleavage site on the C-terminal side. The
> cleavage occurs between “N1” and “C1”.
>
> • there are 20 rectangles in each column, corresponding to the 20 amino acids.
> The rectangle of an amino acid is taller if that amino acid occurs more
> frequently at the residue, for instance the rectangle of “S” in the first
> column for the cleaved peptides.
>
> • the letter of an amino acid is printed within its rectangle. Its font size
> depends on the frequency of the amino acid at the residue.
>
> In your package, you need to have the following functions
> 1. set a colour map using the following or your own design
> • colmap<-c("#FFFFFF", "#FFFFCC", "#FFFF99", "#FFFF66", "#FFFF33",
> "#FFFF00", "#FFCCFF", "#FFCCCC", "#FFCC99", "#FFCC66", "#FFCC33",
> "#FFCC00", "#FF99FF", "#FF99CC", "#FF9999", "#FF9966", "#FF9933",
> "#FF9900", "#FF33FF", "#FF33CC")
> 2. define a set of amino acids using string or other format if you want
> • amino.acid<-"ACDEFGHIKLMNPQRSTVWY"
>
> 3. read in the given peptide data (“hiv.dat”) using
> read.table("../data/hiv.dat",header=TRUE)
> • The data I sent to you should not be saved in the same directory where you
> save your R code!
> • The data is composed of two parts, cleaved (denoted by “cleaved”) and
> non-cleaved (denoted by “noncleaved”). The first five lines of the data are
> shown below
> Peptide Label
> TQIMFETF cleaved
> GQVNYEEF cleaved
> KVFGRCEL noncleaved
> VFGRCELA noncleaved
> • to access the ith peptide, you can use X$Peptide[i]
> • to access the ith label, you can use X$Label[i]
>
> 4. detect the number of cleaved peptides and the number of non-cleaved
> peptides using
> • nrow(X)
>
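A minimal sketch of steps 3 and 4, assuming the four sample rows quoted above stand in for hiv.dat (the real file is not reproduced here). Note that nrow(X) on its own only gives the total number of peptides; the per-class counts below are one possible way to split it, and the names n.cleaved and n.noncleaved are mine, not from the assignment.

```r
# Stand-in for hiv.dat: the sample rows from the quoted excerpt,
# supplied inline so the sketch runs without the real file.
sample.text <- "Peptide Label
TQIMFETF cleaved
GQVNYEEF cleaved
KVFGRCEL noncleaved
VFGRCELA noncleaved"

# stringsAsFactors = FALSE keeps Peptide as character, so nchar() and
# substr() can be applied to it directly.
X <- read.table(text = sample.text, header = TRUE, stringsAsFactors = FALSE)

X$Peptide[1]                 # the first peptide
X$Label[3]                   # the third label

nrow(X)                      # total number of peptides
n.cleaved    <- sum(X$Label == "cleaved")
n.noncleaved <- sum(X$Label == "noncleaved")
mer <- nchar(X$Peptide[1])   # number of residues per peptide
```

With the real file, the text = argument would be replaced by the path given in the instructions.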
> 5. define two matrices with initialised entries, one for positive peptides and
> one for negative peptides
> • matrix(0,AA,mer), where AA is the number of amino acids and mer is the number
> of residues detected from the data using the nchar function
> • both matrices have the same size, the number of rows being equal to the
> number of amino acids and the number of columns being equal to the number of
> residues in peptides
> • name the columns of these two matrices using
> – c("N4","N3","N2","N1","C1","C2","C3","C4")
>
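Step 5 could be set up along these lines. This is only a sketch: mer is hard-coded to 8 here, whereas the instructions derive it with nchar from the data, and new.count.matrix is a helper name I made up for illustration.

```r
amino.acid <- "ACDEFGHIKLMNPQRSTVWY"
AA  <- nchar(amino.acid)     # 20 amino acids
mer <- 8                     # assumed here; normally nchar() of one peptide

residues <- c("N4", "N3", "N2", "N1", "C1", "C2", "C3", "C4")

# Hypothetical helper: one zero-filled matrix, rows = amino acids,
# columns = residues, with both dimensions named.
new.count.matrix <- function() {
  matrix(0, AA, mer,
         dimnames = list(strsplit(amino.acid, "")[[1]], residues))
}

cleaved.freq    <- new.count.matrix()
noncleaved.freq <- new.count.matrix()
dim(cleaved.freq)
```

Naming the rows as well as the columns is optional, but it makes entries such as cleaved.freq["S", "N4"] easy to inspect.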
> 6. use one three-loop structure to detect the frequency of amino acids in
> cleaved peptides and one three-loop structure to detect the frequency of amino
> acids in non-cleaved peptides. They should not be mixed in one three-loop
> structure. The best way to handle this is to use a function. The three-loop
> structure is illustrated below
> for(i in 1:num) # scan all peptides; num is the number of peptides
> {
>   for(j in 1:mer) # scan all residues in a peptide
>   {
>     for(k in 1:AA) # scan the 20 amino acids
>     {
>       # actions
>     }
>   }
> }
>
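One way the three-loop structure might be wrapped in a function, so that it can be called once for the cleaved peptides and once for the non-cleaved ones. The function name count.residues and the choice of "actions" (compare one character of the peptide with one amino acid and increment a counter) are my assumptions, not part of the instructions.

```r
amino.acid <- "ACDEFGHIKLMNPQRSTVWY"

# Hypothetical helper: returns an AA x mer matrix of raw counts.
count.residues <- function(peptides, amino.acid) {
  peptides <- as.character(peptides)     # guard against factors
  num <- length(peptides)
  mer <- nchar(peptides[1])
  AA  <- nchar(amino.acid)
  freq <- matrix(0, AA, mer,
                 dimnames = list(strsplit(amino.acid, "")[[1]], NULL))
  for (i in seq_len(num)) {              # all peptides
    for (j in seq_len(mer)) {            # all residues in a peptide
      for (k in seq_len(AA)) {           # all 20 amino acids
        if (substr(peptides[i], j, j) == substr(amino.acid, k, k)) {
          freq[k, j] <- freq[k, j] + 1
        }
      }
    }
  }
  freq
}

cleaved <- c("TQIMFETF", "GQVNYEEF")     # sample rows from the data excerpt
cleaved.freq <- count.residues(cleaved, amino.acid)
colSums(cleaved.freq)                    # one count per peptide in each column
```

Because every peptide contributes exactly one amino acid to each residue, every column of the raw count matrix should sum to the number of peptides.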
> 7. make sure that each frequency matrix is converted to a percentage, i.e. each
> entry in the matrix is divided by the number of cleaved or non-cleaved peptides
> and multiplied by 100. This converted frequency is called the normalised
> frequency.
>
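A small check of the normalisation in step 7, using a made-up 3 x 2 count matrix rather than the real data. If the counts are right, every column of the normalised matrix should total 100.

```r
# Toy count matrix (hypothetical): 3 amino acids x 2 residues,
# as if counted from 4 peptides. matrix() fills column by column.
counts <- matrix(c(1, 3, 0,
                   2, 1, 1), nrow = 3,
                 dimnames = list(c("A", "C", "D"), c("N1", "C1")))
n.peptides <- 4

# Divide by the number of peptides and multiply by 100
norm.freq <- counts / n.peptides * 100

norm.freq
colSums(norm.freq)    # each column should total 100
```

A column that does not total 100 here would point to a problem in the counting step rather than in the plotting.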
> 8. detect the maximum height of the normalised frequency of each residue in
> cleaved or non-cleaved peptides using
> height<-rep(0,mer)
> for(j in 1:mer)
> height[j]<-sum(round(X.frequency[,j]))
> max.height<-max(height)
> • Note that the height of each column in a graph (see the graph on 3)
> corresponds to the sum of the 20 frequencies of the 20 amino acids for a
> residue.
>
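One thing worth noting about step 8: the entries are rounded before they are summed, so the heights can drift slightly away from 100 even when the unrounded column sums are exactly 100. The sketch below uses made-up frequencies to show the effect; whether it accounts for the differences in the attached graphs I cannot tell from here.

```r
# Hypothetical normalised frequencies for two residues;
# each column sums to 100 before rounding.
X.frequency <- cbind(N1 = c(33.4, 33.3, 33.3),
                     C1 = c(10.6, 20.6, 68.8))
colSums(X.frequency)

mer <- ncol(X.frequency)
height <- rep(0, mer)
for (j in 1:mer)
  height[j] <- sum(round(X.frequency[, j]))   # round first, then sum
max.height <- max(height)

height        # 99 and 101 for these values
```

Rounding errors of this kind stay small (at most a few units per column), so large differences in column height would need another explanation.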
> 9. draw a blank plot using the maximum height
> • plot(c(0,10*mer),c(0,max.height),col="white", ...)
> • in this blank plot, you can add graphics as discussed below
>
> 10. determine the x coordinate, but it is recommended to use i*10 as the
> x-coordinate, where i indexes the residues. The x-coordinate represents columns
> in the graph shown in 3. If there are 8 residues in peptides, there are 8
> columns.
>
> 11. determine the y coordinate, which is cumulative (see next item below). The
> y-coordinate represents rows in the graph shown in 3. There are always 20 rows
> for 20 amino acids. Note that the rows cannot be aligned because the frequency
> of an amino acid in a residue varies.
>
> 12. draw a rectangle based on the frequency of each residue and each amino acid
> • rect(x,y,x+10,y+round(X.frequency[k,j]),col=colmap[k]), where k indicates an
> amino acid and j indicates a residue
> • after drawing this rectangle, the y-coordinate “y” should be increased by
> round(X.frequency[k,j])
> • after one column is drawn for one residue, the x-coordinate “x” should be
> increased by 10
>
> 13. plot a text at the corresponding position using
> • text((x+5),(y+round(X.frequency[k,j])/2),substr(amino.acid,k,k))
> 14. place two drawings in one plot using the par function
> http://n4.nabble.com/file/n1457645/cleaved.jpg cleaved.jpg
> http://n4.nabble.com/file/n1457645/noncleaved.jpg noncleaved.jpg
> http://n4.nabble.com/file/n1457645/hiv.dat hiv.dat
>
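Steps 9 to 14 could be put together roughly as follows. This is a sketch under several assumptions: draw.motif and to.freq are helper names I invented, x starts at 0 so that the columns fill the 0 to 10*mer range, the cex scaling for the letters is arbitrary, and only the four sample peptides from the excerpt are used. The pdf(NULL) line is there only so the sketch runs without a display; drop it (and dev.off()) to draw on screen.

```r
amino.acid <- "ACDEFGHIKLMNPQRSTVWY"
colmap <- c("#FFFFFF", "#FFFFCC", "#FFFF99", "#FFFF66", "#FFFF33",
            "#FFFF00", "#FFCCFF", "#FFCCCC", "#FFCC99", "#FFCC66",
            "#FFCC33", "#FFCC00", "#FF99FF", "#FF99CC", "#FF9999",
            "#FF9966", "#FF9933", "#FF9900", "#FF33FF", "#FF33CC")

# Hypothetical helper: normalised AA x mer frequency matrix (percent).
to.freq <- function(peptides) {
  aa    <- strsplit(amino.acid, "")[[1]]
  chars <- do.call(rbind, strsplit(peptides, ""))   # num x mer characters
  counts <- sapply(seq_len(ncol(chars)), function(j)
    tabulate(match(chars[, j], aa), nbins = length(aa)))
  counts / length(peptides) * 100
}

# Hypothetical helper: draws one stacked column per residue.
draw.motif <- function(X.frequency, main = "") {
  AA  <- nrow(X.frequency)
  mer <- ncol(X.frequency)
  max.height <- max(colSums(round(X.frequency)))
  plot(c(0, 10 * mer), c(0, max.height), col = "white",
       xlab = "residue", ylab = "normalised frequency", main = main)
  x <- 0
  for (j in 1:mer) {
    y <- 0                                # restart the stack in each column
    for (k in 1:AA) {
      h <- round(X.frequency[k, j])
      if (h > 0) {
        rect(x, y, x + 10, y + h, col = colmap[k])
        # cex scaling is an arbitrary choice for illustration
        text(x + 5, y + h / 2, substr(amino.acid, k, k), cex = 0.5 + h / 100)
      }
      y <- y + h                          # cumulative y-coordinate
    }
    x <- x + 10                           # move to the next column
  }
  invisible(max.height)
}

freq.cleaved    <- to.freq(c("TQIMFETF", "GQVNYEEF"))
freq.noncleaved <- to.freq(c("KVFGRCEL", "VFGRCELA"))

pdf(NULL)                                 # null device; remove to see the plot
par(mfrow = c(1, 2))                      # two drawings side by side
h1 <- draw.motif(freq.cleaved, "cleaved")
h2 <- draw.motif(freq.noncleaved, "noncleaved")
dev.off()
```

With this toy input every column reaches the same height, which is what I would expect from the real data as well, apart from small rounding differences.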



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?


