[R] Character (1a, 1b) to numeric

Avi Gross @v|gro@@ @end|ng |rom ver|zon@net
Sat Jul 11 18:11:26 CEST 2020


There are many ways to do what is requested, and some are fairly simple and
robust. A simple switch() statement will do if you are willing to write some
code, but for simple vectors or factors consider using a function from a
package.

You could use the recode() or recode_factor() functions in package dplyr or
other similar functions elsewhere and type in the conversions like so:

library("dplyr")

xc <-  c("1", "1a", "1b", "1c", "2", "2a", "2b", "2c")

xn <- c(1, 1.3, 1.5, 1.7, 2, 2.3, 2.5, 2.7)

sample <- rep(xc, each=3)

recode(sample,
       "1" = 1,
       "1a" = 1.3,
       "1b" = 1.5,
       "1c" = 1.7,
       "2" = 2,
       "2a" = 2.3,
       "2b" = 2.5,
       "2c" = 2.7)

That returns:

[1] 1.0 1.0 1.0 1.3 1.3 1.3 1.5 1.5 1.5 1.7 1.7 1.7 2.0 2.0 2.0 2.3 2.3 2.3
2.5 2.5 2.5 2.7 2.7 2.7

Using the original vectors xc and xn directly would be a tad harder, but is
doable, perhaps with some indirection.
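
One way that indirection might look in base R, assuming xc and xn are
parallel vectors as defined above: find each value's position in xc with
match() and use that position to index xn. This is only a sketch, and the
name converted_by_match is illustrative, not from the thread.

```r
xc <- c("1", "1a", "1b", "1c", "2", "2a", "2b", "2c")
xn <- c(1, 1.3, 1.5, 1.7, 2, 2.3, 2.5, 2.7)
sample <- rep(xc, each = 3)

# match() returns, for each element of sample, its position in xc
# (NA if the stage is not present in xc); indexing xn by that
# position yields the corresponding number.
converted_by_match <- xn[match(sample, xc)]
```

Because match() compares whole strings, there is no risk of a partial match,
and any stage missing from xc shows up as NA rather than being silently
mangled.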

As has been noted, you need to be careful to match the entire item from
beginning to end, as matching a substring can produce odd results. If you
add this code to the above, it works, in a silly way, for a more general
case:

library(glue)

converted <- sample
for (i in seq_along(xc)) {
  converted <- sub(glue("^{xc[i]}$"), xn[i], converted)
}

result <- as.numeric(converted)

Returns:

> result
 [1] 1.0 1.0 1.0 1.3 1.3 1.3 1.5 1.5 1.5 1.7 1.7 1.7 2.0 2.0 2.0 2.3 2.3 2.3
2.5 2.5 2.5 2.7 2.7 2.7
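
To see why the ^ and $ anchors matter, here is a small base R sketch that
runs the same loop with and without them. The mapping ("1" to 10, "1a" to
13) is a made-up one chosen only to expose the problem, and the names
codes, values, anchored and unanchored are illustrative.

```r
# Hypothetical mapping, not the one from the thread.
codes  <- c("1", "1a")
values <- c(10, 13)
x <- c("1", "1a", "1")

anchored   <- x
unanchored <- x
for (i in seq_along(codes)) {
  # Without anchors the pattern "1" also matches inside "1a".
  unanchored <- sub(codes[i], values[i], unanchored)
  # With anchors only a whole-string match is replaced.
  anchored <- sub(paste0("^", codes[i], "$"), values[i], anchored)
}

anchored    # "10" "13" "10"
unanchored  # "10" "10a" "10"
```

The unanchored version turns "1a" into "10a" on the first pass, so the later
pattern "1a" never matches and as.numeric() would give NA for that element.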

Not necessarily efficient, but it works. In more complex cases you could use
something like glue::glue() to build the arguments you want to pass to
something like recode(), and so on.
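
As an alternative to pasting the arguments together as text, a named vector
can be spliced into recode() with !!!, which is the pattern the dplyr
documentation shows for lookup vectors. This is a different technique from
the glue idea above, shown as a minimal sketch that assumes dplyr is
installed; lookup and result_spliced are illustrative names.

```r
xc <- c("1", "1a", "1b", "1c", "2", "2a", "2b", "2c")
xn <- c(1, 1.3, 1.5, 1.7, 2, 2.3, 2.5, 2.7)
sample <- rep(xc, each = 3)

# Named numeric vector: names are the old codes, values the new numbers.
lookup <- setNames(xn, xc)

# !!! splices the named vector in as if each pair had been typed
# as a separate "old" = new argument.
result_spliced <- dplyr::recode(sample, !!!lookup)
```

This keeps the mapping in the two original vectors, so adding a stage means
extending xc and xn rather than editing the call.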

I think we have had enough solutions and methods posted but there are likely
many more as there is rarely only one way to do things in R.

-----Original Message-----
From: R-help <r-help-bounces using r-project.org> On Behalf Of Richard O'Keefe
Sent: Saturday, July 11, 2020 3:02 AM
To: Eric Berger <ericjberger using gmail.com>
Cc: Jean-Louis Abitbol <abitbol using sent.com>; R Project Help
<r-help using r-project.org>
Subject: Re: [R] Character (1a, 1b) to numeric

The string index approach works with any mapping from stage names to stage
numbers, not just regular ones.  For example, if we had "1" -> 1, "1a" ->
1.4, "1b" -> 1.6, "2" -> 2, "2a" -> 2.3, "2b" -> 2.7, the 'sub' version would
fail miserably while the string index version would just work.  The 'sub'
version would also not work terribly well if the mapping were "1" -> 1, "a1"
-> 1.3, "b1" -> 1.5, "c1" -> 1.7 and so on. The thing I like about the
indexing approach is that it uses a fundamental operation of the language
very directly.

Anyone using R would do well to *master* what indexing can do for you.


On Sat, 11 Jul 2020 at 17:16, Eric Berger <ericjberger using gmail.com> wrote:

> xn <- as.numeric(sub("c",".7",sub("b",".5",sub("a",".3",xc))))
>
>
> On Sat, Jul 11, 2020 at 5:09 AM Richard O'Keefe <raoknz using gmail.com> wrote:
>
>> This can be done very simply because vectors in R can have named 
>> elements, and can be indexed by strings.
>>
>> > stage <- c("1" = 1, "1a" = 1.3, "1b" = 1.5, "1c" = 1.7,
>> +            "2" = 2, "2a" = 2.3, "2b" = 2.5, "2c" = 2.7,
>> +            "3" = 3, "3a" = 3.3, "3b" = 3.5, "3c" = 3.7)
>>
>> > testdata <- rep(c("1", "1a", "1b", "1c",
>> +                   "2", "2a", "2b", "2c",
>> +                   "3", "3a", "3b", "3c"), times=c(1:6,6:1))
>>
>> > stage[testdata]
>>   1  1a  1a  1b  1b  1b  1c  1c  1c  1c   2   2   2   2   2  2a  2a  2a  2a  2a
>> 1.0 1.3 1.3 1.5 1.5 1.5 1.7 1.7 1.7 1.7 2.0 2.0 2.0 2.0 2.0 2.3 2.3 2.3 2.3 2.3
>>  2a  2b  2b  2b  2b  2b  2b  2c  2c  2c  2c  2c   3   3   3   3  3a  3a  3a  3b
>> 2.3 2.5 2.5 2.5 2.5 2.5 2.5 2.7 2.7 2.7 2.7 2.7 3.0 3.0 3.0 3.0 3.3 3.3 3.3 3.5
>>  3b  3c
>> 3.5 3.7
>>
>> On Sat, 11 Jul 2020 at 05:51, Jean-Louis Abitbol <abitbol using sent.com>
>> wrote:
>>
>> > Dear All
>> >
>> > I have a character vector, representing histology stages, such as for
>> > example:
>> > xc <-  c("1", "1a", "1b", "1c", "2", "2a", "2b", "2c")
>> >
>> > and this goes on to 3, 3a etc in various order for each patient. I do
>> > have of course a pre-established classification available which does
>> > change according to the histology criteria under assessment.
>> >
>> > I would want to convert xc, for plotting reasons, to a numeric vector
>> > such as
>> >
>> > xn <- c(1, 1.3, 1.5, 1.7, 2, 2.3, 2.5, 2.7)
>> >
>> > Unfortunately I have no clue on how to do that.
>> >
>> > Thanks for any help and apologies if I am missing the obvious way to do
>> > it.
>> >
>> > JL
>> > --
>> > Verif30042020
>> >
>> > ______________________________________________
>> > R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see 
>> > https://stat.ethz.ch/mailman/listinfo/r-help
>> > PLEASE do read the posting guide
>> > http://www.R-project.org/posting-guide.html
>> > and provide commented, minimal, self-contained, reproducible code.
>> >
>>
>
