[R] calculate the frequencies of the Amino Acids

David Winsemius dwinsemius at comcast.net
Sun Jan 3 06:52:07 CET 2010


On Jan 3, 2010, at 12:28 AM, che wrote:

>
> Thanks very much, the code is working perfectly. I hope you can help
> me do the same thing with a loop structure, though, so I can check
> whether I am on the right track. I want to loop over each sequence in
> the file sequence.txt (attached) and get the frequency of each amino
> acid. I wrote the following code so far, but then I got confused and
> stopped, especially since I am a complete beginner in R.
> http://n4.nabble.com/file/n997581/sequence.txt sequence.txt :
> x<-read.table("sequence.txt",header=FALSE)
> AA<-c('A','C','D','E','F','G','H','I','K','L','M','N','P','Q','R','S','T','V','W','Y')
>
> test<-nchar(as.character(x$V1[i]))
> frequency<-function(X)
> {
> y<-rep(0,20)

As I pointed out earlier, a single length-20 vector like this cannot
hold the tabulation for more than one sequence. You probably need a
matrix with 20 columns (one per amino acid) and one row per sequence.

> for(j in 1:test){
> for(i in 1:nrow(x)){
> 	res<-which(AA==substr(x$V1[i],j,j))
> 	y[res]=y[res]+1
... and here you will need to index y[ , ] with both the proper row  
and column.
> 	}
> 	}
> return(y)
> }
> So how do I fix this code? How do I set up the “i” and the “j” so
> that the indexing works? Sorry for bothering you.
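
For what it's worth, here is one way the two loops could be arranged. It is a sketch and not the only answer. The function name `aa_counts` and the two test sequences are my own inventions. The outer loop walks over the sequences, so `nchar()` is always called on a sequence that `i` already points to, and the counts go into a matrix indexed by both row and column:

```r
AA <- c('A','C','D','E','F','G','H','I','K','L','M','N','P','Q','R','S','T','V','W','Y')

# aa_counts is a hypothetical name; seqs is a vector of sequences
aa_counts <- function(seqs, alphabet = AA) {
  seqs <- as.character(seqs)   # read.table() may have returned a factor
  y <- matrix(0, nrow = length(seqs), ncol = length(alphabet),
              dimnames = list(NULL, alphabet))
  for (i in seq_along(seqs)) {             # outer loop: one sequence per row
    for (j in seq_len(nchar(seqs[i]))) {   # inner loop: each position in sequence i
      res <- which(alphabet == substr(seqs[i], j, j))
      # res is empty for a letter not in alphabet, so nothing is counted
      y[i, res] <- y[i, res] + 1
    }
  }
  y
}

# Made-up sequences for illustration; with the real file this would be
# aa_counts(x$V1) after x <- read.table("sequence.txt", header = FALSE)
counts <- aa_counts(c("ACDA", "GGHK"))
counts[, c("A", "G")]
```

If overall totals are wanted instead of per-sequence counts, `colSums(counts)` should collapse the matrix back to a single length-20 vector.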

-- 
David.

>
>
> che wrote:
>>
>> Could someone please help me sort this out? I am trying to write R
>> code that calculates the frequencies of the amino acids in 9
>> different sequences. I want the code to read the sequences from an
>> external text file, and I used the following to do so:
>> x<-read.table("sequence.txt",header=FALSE)
>>
>> Then I defined a vector of the 20 amino acids as follows:
>> AA<-c('A','C','D','E','F','G','H','I','K','L','M','N','P','Q','R','S','T','V','W','Y')
>> I am using the following code to calculate the frequencies:
>>
>> frequency<-function(X)
>> {
>> y<-rep(0,20)
>> for(j in 1:nchar(as.character(x$V1[i]))){
>> for(i in 1:9){
>>
>> 	res<-which(AA==substr(x$V1[i],j,j))
>> 	y[res]=y[res]+1
>> 	}
>> 	}
>> return(y)
>> }
>>
>> But this code does not work: it reads only one sequence. I don't
>> know why the loop over "i", which is supposed to read the nine rows
>> of sequence.txt, is not working. The sequence.txt file is attached to
>> this message.
>>
>> cheers
>> http://n4.nabble.com/file/n997072/sequence.txt sequence.txt
>>
>

David Winsemius, MD
Heritage Laboratories
West Hartford, CT


