[R] Getting a list of unique gene names from a list with semi-colons

Sat Jan 7 03:29:52 CET 2012

Sorry, the separator in the strsplit() call below should be a semi-colon, not a comma.
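
With that correction, the one-liner might look like the sketch below. It is only lightly checked against the sample values from the original post. The vector name `genes` is a placeholder for the real data frame column, and the as.character() call is an added assumption to cover the case where the column was read in as a factor, since strsplit() only accepts character input.

```r
# Toy vector standing in for the gene-name column
# (the name `genes` is hypothetical; substitute your own column, e.g. df$col)
genes <- c("FAM81A", "LOC283050;LOC283050;LOC283050;ZMIZ1", "PINK1;PINK1",
           "MRPL12;MRPL12", "C1orf114", "MMS19;UBTD1")

# Split each entry on ";", flatten the resulting list, and drop duplicates.
# as.character() guards against a factor column; fixed = TRUE treats ";"
# as a literal string rather than a regular expression.
unique_genes <- unique(unlist(strsplit(as.character(genes), ";", fixed = TRUE)))

print(unique_genes)
```

unique() keeps the first occurrence of each name, so the result follows the order in which the genes appear in the column.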

Michael Weylandt

On Jan 6, 2012, at 8:17 PM, R. Michael Weylandt <michael.weylandt at gmail.com> wrote:

> I think you can do this with something like this (untested):
> 
> unique(unlist(strsplit(XXX, ",")))
> 
> Michael
> 
> On Jan 6, 2012, at 8:05 PM, Kurinji Pandiyan <kurinji.pandiyan at gmail.com> wrote:
> 
>> Hello,
>> 
>> I have one column in my dataframe that has gene names of interest.
>> Unfortunately, due to the fact that some probes lie between two genes or
>> two transcripts of a gene, it looks something like this -
>> 
>> FAM81A
>> LOC283050;LOC283050;LOC283050;ZMIZ1
>> PINK1;PINK1
>> MRPL12;MRPL12
>> C1orf114
>> MMS19;UBTD1
>> 
>> I would like to know how to get a list with all the names with no
>> semi-colons and removing the replicates. I would like the end result to
>> look like -
>> 
>> FAM81A
>> LOC283050
>> ZMIZ1
>> PINK1
>> MRPL12
>> C1orf114
>> MMS19
>> UBTD1
>> 
>> Thanks a lot for your help!
>> Kurinji
>> 