[R] Any method to speed up this problem?

njhuang86 njhuang86 at yahoo.com
Thu Jun 18 16:28:44 CEST 2009


Hi all,

Suppose I have a vector like this:

 [1] "STAT1"  "STAT1"  "STAT1"  "STAT1"  "GAPDH"  "GAPDH"  "GAPDH"  "ACTB"   "ACTB"
[10] "ACTB"   "DDR1"   "RFC2"   "HSPA6"  "PAX8"   "GUCA1A" "UBE1L"  "THRA"   "PTPN21"
[19] "CCL5"   "CYP2E1" "STAT1"  "THRA"   "PAX8"

I would like to produce a vector of the same length as the one above that
tells me where the duplicates are. Ideally, each gene symbol would be
represented by a specific number, with duplicates of a symbol sharing the
same number. Right now I'm using unique() along with two nested for loops
to do the job, but it's taking far too long. Any suggestions would be
greatly appreciated. Thank you!
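
For reference, here is a minimal sketch of the mapping described above. It
assumes the vector is stored in a variable called genes (a hypothetical name,
not from the original setup). Matching each element against the unique values
gives every distinct symbol the index of its first appearance, so duplicates
share a number and no explicit loops are needed. The second part is one
possible way to flag which positions hold a repeated symbol.

```r
# 'genes' is a hypothetical name for the vector shown above
genes <- c("STAT1", "STAT1", "STAT1", "STAT1", "GAPDH", "GAPDH", "GAPDH",
           "ACTB", "ACTB", "ACTB", "DDR1", "RFC2", "HSPA6", "PAX8",
           "GUCA1A", "UBE1L", "THRA", "PTPN21", "CCL5", "CYP2E1",
           "STAT1", "THRA", "PAX8")

# One integer id per distinct symbol, numbered in order of first appearance;
# every repeat of a symbol receives the same id
ids <- match(genes, unique(genes))

# TRUE for every element whose symbol occurs more than once in the vector
is_dup <- duplicated(genes) | duplicated(genes, fromLast = TRUE)
```

Both match() and duplicated() are vectorised base R functions, so this should
scale much better than nested for loops on long vectors.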



