[R] Parsing protein sequences

S Peri biocperi at yahoo.com
Thu Jul 8 15:56:01 CEST 2004


Dear All,
  I have two files of peptide sequences obtained from
tryptic and semi-tryptic digestion followed by mass
spec analysis. Each file has two columns: peptide
sequence and protein name.
 
File 2 has both tryptic and semi-tryptic peptides, and
File 1 has only semi-tryptic peptides. Is there a way
to separate the semi-tryptic peptides from the
tryptic ones?  There are ~30,000 peptides in each file.

Please let me know whether this can be done in R. I want
to do further statistical analysis in R afterwards.

Thank you in advance.
SP


File 1 (a comma-separated file):
pepseq	 proteinname
FENGAFT	NP_065081.1
SLLEDIR	NP_062571.1
VCCEGMLIQ	NP_064583
NWGLSVYADKPETTK	NP_000598
MLAFDVNDEK	NP_000598


File 2 (a comma-separated file):
SLLEDIR	NP_062571
TYMLAFDVNDEK	NP_000598
ASSLSESSPPK	NP_057441
LSIVVSLGTGR	NP_003551
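
One possible reading of the task is: flag every File 2 row whose peptide and protein also occur in File 1 as semi-tryptic, and treat the remaining rows as tryptic. The sketch below illustrates that reading only. It assumes the sample rows above are typed in as data frames (the real files would presumably be read with read.csv() or read.delim(), depending on the actual delimiter), that File 2 has the same two columns as File 1, and that a trailing version suffix such as ".1" on the accession should be ignored when matching. The names file1, file2 and strip_version are hypothetical.

```r
# Sample rows from the post, entered by hand for illustration.
file1 <- data.frame(
  pepseq      = c("FENGAFT", "SLLEDIR", "VCCEGMLIQ",
                  "NWGLSVYADKPETTK", "MLAFDVNDEK"),
  proteinname = c("NP_065081.1", "NP_062571.1", "NP_064583",
                  "NP_000598", "NP_000598"),
  stringsAsFactors = FALSE
)
file2 <- data.frame(
  pepseq      = c("SLLEDIR", "TYMLAFDVNDEK", "ASSLSESSPPK", "LSIVVSLGTGR"),
  proteinname = c("NP_062571", "NP_000598", "NP_057441", "NP_003551"),
  stringsAsFactors = FALSE
)

# Assumption: "NP_062571.1" and "NP_062571" refer to the same protein,
# so drop a trailing ".<digits>" version suffix before comparing.
strip_version <- function(acc) sub("\\.[0-9]+$", "", acc)

# Build one matching key per row from peptide sequence plus accession.
key1 <- paste(file1$pepseq, strip_version(file1$proteinname))
key2 <- paste(file2$pepseq, strip_version(file2$proteinname))

# Rows of File 2 that also occur in File 1 are taken to be semi-tryptic.
is_semi      <- key2 %in% key1
semi_tryptic <- file2[is_semi, ]
tryptic      <- file2[!is_semi, ]
```

With the sample rows, SLLEDIR is the only File 2 peptide also present in File 1, so it lands in semi_tryptic and the other three rows land in tryptic. Because %in% is vectorised, the same lines should cope with ~30,000 rows per file without an explicit loop.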



