[R] Converting unique strings to unique numbers

Hervé Pagès hpages at fredhutch.org
Fri May 29 20:16:57 CEST 2015


Hi Kate,

I found that matching the character vector to itself is a very
effective way to do this. Each string gets the position of its first
occurrence, so identical strings get identical ids:

   x <- c("a", "bunch", "of", "strings", "whose", "exact", "content",
          "is", "of", "little", "interest")
   ids <- match(x, x)
   ids
   # [1]  1  2  3  4  5  6  7  8  3 10 11
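
Note that the ids above skip 9, because they are positions in x
rather than a running count. If consecutive ids are wanted (as in
your example), one possible variation is to match against unique(x)
instead. A minimal sketch, where ids2 and ids3 are just names picked
for illustration:

```r
x <- c("a", "bunch", "of", "strings", "whose", "exact", "content",
       "is", "of", "little", "interest")

# Match against the distinct values, kept in order of first appearance,
# so the ids run from 1 to length(unique(x)) with no gaps.
ids2 <- match(x, unique(x))
ids2
# [1]  1  2  3  4  5  6  7  8  3  9 10

# The same ids via a factor whose levels are in order of first appearance.
ids3 <- as.integer(factor(x, levels = unique(x)))
identical(ids2, ids3)
# [1] TRUE
```

The factor version is there only because you asked about factors; it
gives the same ids as the match() version.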

By using this trick, many manipulations on character vectors can
be replaced by manipulations on integer vectors, which can be
much more efficient.
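
For the pedigree in your message, the same idea might look like the
sketch below. Several things here are assumptions on my part: the
column names (fam, id, father, mother, sex, status, label) are made
up, the data frame is typed in by hand rather than read from your
file, and "0" is taken to mean "no parent recorded" and is kept as 0.

```r
# Hypothetical reconstruction of the six pedigree rows.
ped <- data.frame(
  fam    = "X0001",
  id     = c("BYX859", "BYX894", "BYX862", "BYX863", "BYX864", "BYX865"),
  father = c("0", "0", "BYX894", "BYX894", "BYX894", "BYX894"),
  mother = c("0", "0", "BYX859", "BYX859", "BYX859", "BYX859"),
  sex    = c(2, 1, 2, 2, 2, 2),
  status = c(1, 1, 2, 2, 2, 2),
  label  = c("BYX859", "BYX894", "BYX862", "BYX863", "BYX864", "BYX865"),
  stringsAsFactors = FALSE
)

# Lookup table: the individual ids in order of first appearance.
keys <- unique(ped$id)

# Recode columns 2 to 4 against the same lookup table, so that a given
# string gets the same number wherever it appears.
ped[2:4] <- lapply(ped[2:4], function(col) {
  ids <- match(col, keys)
  ids[is.na(ids)] <- 0L   # "0" is not in keys; keep it as 0
  ids
})

ped
```

Columns 1 and 7 are left untouched, so the original labels stay
available in the last column.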

Cheers,
H.


On 05/29/2015 09:58 AM, Kate Ignatius wrote:
> I have a pedigree file as so:
>
> X0001 BYX859      0      0  2  1 BYX859
> X0001 BYX894      0      0  1  1 BYX894
> X0001 BYX862 BYX894 BYX859  2  2 BYX862
> X0001 BYX863 BYX894 BYX859  2  2 BYX863
> X0001 BYX864 BYX894 BYX859  2  2 BYX864
> X0001 BYX865 BYX894 BYX859  2  2 BYX865
>
> And I was hoping to change all unique string values to numbers.
>
> That is:
>
> BYX859 = 1
> BYX894 = 2
> BYX862 = 3
> BYX863 = 4
> BYX864 = 5
> BYX865 = 6
>
> But only in columns 2 - 4.  Essentially I would like the data to look like this:
>
> X0001 1 0 0  2  1 BYX859
> X0001 2 0 0  1  1 BYX894
> X0001 3 2 1  2  2 BYX862
> X0001 4 2 1  2  2 BYX863
> X0001 5 2 1  2  2 BYX864
> X0001 6 2 1  2  2 BYX865
>
> Is this possible with factors?
>
> Thanks!
>
> K.
>

-- 
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpages at fredhutch.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319


