[R] Writing to a file

Petr PIKAL petr.pikal at precheza.cz
Tue Feb 7 08:57:30 CET 2012


Hi
> 
> Honestly, thank you for the prompt response.
> You are right, I will tell you what I want to do
> and not the way, since I don't know much about R.
> 
> 
> I have a txt with Proteins
> 
> "Prot_10035"   "Func_0005874"   "Func_0016787"   "Func_0003774"   "Func_0006898"
> "Func_0005856"   "Func_0005525"   "Func_0005737"   "Func_0003924"   "Func_0005515"
> "Func_0000166"
> "Prot_10036"   "Func_0005739"   "Func_0003735"   "Func_0006412"   "Func_0005763"
> "Func_0005840"
> "Prot_10037"   "Func_0005739"   "Func_0005515"
> "Prot_10039"   "Func_0005576"   "Func_0009615"   "Func_0050832"   "Func_0005615"
> "Func_0006955"   "Func_0042742"   "Func_0031640"   "Func_0006935"
> "Prot_1004"   "Func_0046872"   "Func_0003887"   "Func_0003684"   "Func_0016740"
> "Func_0006281"   "Func_0006260"   "Func_0016779"   "Func_0005634"
> "Prot_10040"   "Func_0005886"   "Func_0046488"   "Func_0016301"   "Func_0007409"
> "Func_0005524"   "Func_0016740"   "Func_0016308"   "Func_0000166"
> 
> which is 8527 lines and 145 columns (not all the proteins have the same
> number of proteins)
            functions?

First of all, you need to read this file into R properly. I would try
readLines() with some further polishing to build a list with one element
per protein, using the protein names as labels. After that, a loop or
lapply() with a regular expression check could populate a data frame with
protein names in the first column and a score in the second. You can then
compare that score with values in another data frame.

However, without a reproducible example you are unlikely to get more
detailed help.
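
That said, here is a rough sketch of the reading step only. It assumes
that every record starts with a "Prot_" token and that its functions may
continue over several physical lines, which is a guess based on the
sample above. The file name proteins.txt is hypothetical; the sample
lines are copied from the quoted data.

```r
# A few lines in the format quoted above; in practice these would come from
# lines <- readLines("proteins.txt")   # hypothetical file name
lines <- c(
  '"Prot_10036"   "Func_0005739"   "Func_0003735"   "Func_0006412"   "Func_0005763"',
  '"Func_0005840"',
  '"Prot_10037"   "Func_0005739"   "Func_0005515"'
)

# Extract every Prot_/Func_ token, dropping quotes and whitespace.
tokens <- unlist(regmatches(lines, gregexpr("(Prot|Func)_[0-9]+", lines)))

# Assumption: a new record starts at each protein name.
is_prot <- grepl("^Prot_", tokens)
record  <- cumsum(is_prot)

# One character vector of functions per protein, named by the protein.
funcs <- split(tokens[!is_prot],
               factor(record[!is_prot], levels = seq_len(sum(is_prot))))
names(funcs) <- tokens[is_prot]

str(funcs)
```

Splitting on the "Prot_" prefix rather than on line breaks means the same
code should cope whether a protein sits on one line or on several.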

Regards
Petr


> What I want is to predict whether those proteins are related to cancer
> or not, depending on whether they have some functions. I found that
> there are 3 functions very often related to cancer, and in case a
> protein has 2/3 or 3/3 of them I want to "label" it (somehow, maybe by
> adding an extra column) as cancer related.
> The names of the proteins are always in the 1st column, but the names
> of the functions can be in any of the next columns.
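
The 2-out-of-3 rule could be sketched as below, assuming the functions
are already held in a list named by protein. The three identifiers in
cancer_funcs are placeholders picked only for illustration, since the
message does not say which three functions are meant; the column names
are likewise made up.

```r
# Functions per protein, as a named list (two records from the message above).
funcs <- list(
  Prot_10036 = c("Func_0005739", "Func_0003735", "Func_0006412",
                 "Func_0005763", "Func_0005840"),
  Prot_10037 = c("Func_0005739", "Func_0005515")
)

# Hypothetical choice of the three cancer-related functions.
cancer_funcs <- c("Func_0005739", "Func_0003735", "Func_0006955")

# Score = how many of the three functions each protein has (0 to 3).
# The position of a function within the record does not matter.
score <- vapply(funcs, function(f) sum(cancer_funcs %in% f), integer(1))

result <- data.frame(
  protein = names(funcs),
  score = unname(score),
  cancer_related = unname(score >= 2),   # 2/3 or 3/3 counts as cancer related
  stringsAsFactors = FALSE
)
print(result)
```

Using %in% on the whole set of functions avoids having to know which
column a function appears in.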
> 
> So what I did is use this loop, but I can't properly write the results
> the way I want, so that I can use them again.
> (I need to know the names of the proteins having the functions, in a
> column, so that as a next step I can compare them with another file
> -test data set- and conclude true positive, false positive, true
> negative, false negative.)
> 
> It can't be as hard as I see it :):)
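
For the writing and comparing part, one possible sketch is below. All
values are invented for illustration: the predicted labels, the "known"
column of the test data set, and the column names are assumptions, and a
temporary file stands in for a real output path.

```r
# Predicted labels (hypothetical values, shaped like a score table).
result <- data.frame(
  protein = c("Prot_10036", "Prot_10037", "Prot_10039", "Prot_1004"),
  cancer_related = c(TRUE, FALSE, TRUE, FALSE),
  stringsAsFactors = FALSE
)

# Write to a tab-separated file so the table can be read back later.
out_file <- tempfile(fileext = ".txt")   # stand-in for a real path
write.table(result, out_file, sep = "\t", row.names = FALSE, quote = FALSE)
pred <- read.table(out_file, header = TRUE, sep = "\t",
                   stringsAsFactors = FALSE)

# Hypothetical test data set with the known status of each protein.
truth <- data.frame(
  protein = c("Prot_10036", "Prot_10037", "Prot_10039", "Prot_1004"),
  known = c(TRUE, FALSE, FALSE, TRUE),
  stringsAsFactors = FALSE
)

# Match on protein name, then cross-tabulate predicted against known.
both <- merge(pred, truth, by = "protein")
confusion <- table(
  predicted = factor(both$cancer_related, levels = c(TRUE, FALSE)),
  known     = factor(both$known, levels = c(TRUE, FALSE))
)
print(confusion)
```

In the resulting table the TRUE/TRUE cell holds the true positives,
TRUE/FALSE the false positives, FALSE/TRUE the false negatives and
FALSE/FALSE the true negatives.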