[R] how to convert a set of strings to a list of unique numeric id?

Gabor Grothendieck ggrothendieck at gmail.com
Mon Jun 21 01:58:24 CEST 2010


Use a variable of class "factor". Its integer codes are the numeric ids
and its levels are the unique strings:

> s <- c("ABCDDDD", "ACCDEDF", "ACCGEDF", "ACCGEGF", "ACCDEDF", "ACCGEGF")
> fs <- factor(s)
> levels(fs)
[1] "ABCDDDD" "ACCDEDF" "ACCGEDF" "ACCGEGF"
> unclass(fs)
[1] 1 2 3 4 2 4
attr(,"levels")
[1] "ABCDDDD" "ACCDEDF" "ACCGEDF" "ACCGEGF"
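
If plain integer ids are wanted without the "levels" attribute, something
along these lines may help. This is only a sketch: the names ids, keys,
ids_first, x, by_sorted and by_first are made up for illustration, and the
match()/unique() variant is an alternative not used in the transcript above.
factor() numbers the strings by their sorted order, while match() against
unique() numbers them by first appearance.

```r
s <- c("ABCDDDD", "ACCDEDF", "ACCGEDF", "ACCGEGF", "ACCDEDF", "ACCGEGF")

# Integer codes only; the "levels" attribute is dropped.
fs  <- factor(s)
ids <- as.integer(fs)

# Keep the levels as a lookup table so the strings can be recovered.
keys <- levels(fs)
stopifnot(identical(keys[ids], s))

# Alternative: number the strings in order of first appearance.
ids_first <- match(s, unique(s))

# The two numberings differ when the input is not already sorted.
x <- c("b", "a", "b")
by_sorted <- as.integer(factor(x))   # "a" is level 1, "b" is level 2
by_first  <- match(x, unique(x))     # "b" is seen first, so it gets id 1
```

For the data above both approaches give the same ids, because the strings
happen to appear first in sorted order.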


On Sun, Jun 20, 2010 at 7:46 PM, G FANG <fanggangsw at gmail.com> wrote:
> Hi,
>
> I have been a MATLAB user and am learning R.
>
> I want to convert a large list of strings to a list of unique numeric
> ids to reduce storage space.
>
> For example,
>
> there is a string list (there are duplicates)
>
> ABCDDDD
> ACCDEDF
> ACCGEDF
> ACCGEGF
> .....
> ACCDEDF
> ACCGEGF
>
> and I want to have a corresponding numeric id list
>
> 1
> 2
> 3
> 4
> ....
> 2
> 4
>
> In MATLAB, the 'unique' function can do this in addition to giving the
> unique set, but in R, 'unique' only gives the unique set.
>
>
> Please advise me on this.
>
> Thanks,
>
> Gang
>