[R] Looping through multiple sub elements of a list to compare to multiple components of a vector

debra ragland ragland.debra at yahoo.com
Wed Dec 2 16:39:50 CET 2015


I think I am making this problem harder than it has to be and so I keep getting stuck on what might be a trivial problem. 
I have used the seqinr package to load a protein sequence alignment containing 15 protein sequences:
    > library(seqinr)
    > x = read.alignment("proteins.fasta", format="fasta", forceToLower=FALSE)
This automatically loads in a list of 4 elements including the sequences and other information.
I store the sequences in a new object:
    > mylist = x$seq
which returns a character vector of 15 strings.
I have found that if I split the long character strings into individual characters it is easy to use lapply to loop over this list. So I use strsplit;
    > list.2 = strsplit(mylist, split = NULL)
From this list I can determine which proteins have changes at certain positions by using:
    > lapply(list.2, "[", 10) == "L"
This returns a logical TRUE/FALSE vector indicating which elements of the list do or do not have the letter L at position 10.
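
For illustration, the single-position check can be reproduced on toy data. This is only a sketch: the three short strings below are made-up stand-ins for x$seq, and vapply is swapped in for lapply so that the result is guaranteed to be a plain logical vector.

```r
# Toy stand-in for x$seq: three short, aligned "sequences" (hypothetical data)
mylist <- c("PQITLWQRPL", "PQITLWQRPI", "PQITLWKRPL")

# Split each string into a vector of single characters
list.2 <- strsplit(mylist, split = NULL)

# vapply checks that every result is a single logical value
has_L_at_10 <- vapply(list.2, function(s) s[10] == "L", logical(1))
has_L_at_10
```

Only the second toy sequence lacks an L at position 10, so it is the one flagged FALSE.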
Because each of the protein sequences contains 99 amino acids, I want to automate this process so that I do not have to compare positions one by one. Most of the changes occur between positions 10 and 95. I have a standard character vector that I wish to use for comparison when looping through the list.
Should I combine everything (the standard "letter"/amino-acid vector and the list of protein sequences) into one list, or is it better to leave them separate for this comparison? I'm not sure what form the output should take, as I need to use it for another statistical test. Would a list of logical vectors be the most suitable output to return?
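
One possible shape for the comparison, sketched on toy data: the standard vector is kept separate from the list, and the result is a logical matrix rather than a list of logical vectors. The names standard, positions and differs, and the short sequences themselves, are assumptions made for this example; with the real alignment, positions would be 10:95.

```r
# Hypothetical toy data: 'standard' is the reference sequence, split into residues
standard <- strsplit("PQITLWQRPL", split = NULL)[[1]]
mylist   <- c(seq1 = "PQITLWQRPL", seq2 = "PQITLWQRPI", seq3 = "PQVTLWKRPL")
list.2   <- strsplit(mylist, split = NULL)

positions <- 3:10   # would be 10:95 for the 99-residue alignment

# One row per sequence, one column per position; TRUE marks a difference
# from the standard at that position
differs <- t(sapply(list.2, function(s) s[positions] != standard[positions]))
colnames(differs) <- positions

differs
rowSums(differs)   # number of changed positions per sequence
colSums(differs)   # number of sequences changed at each position
```

A matrix like this can be summarised by row or by column, or converted with as.data.frame, which may be easier to pass to a later test than a list. Whether it is the right input depends on the test in question.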