[R] convert factor p000345 to numeric

Marc Schwartz marc_schwartz at comcast.net
Wed Nov 8 23:44:41 CET 2006


On Wed, 2006-11-08 at 23:16 +0100, Marco Boks wrote:
> Dear All,

> I am lost about the following. I have a large data frame (largeset)
> whose first column holds identification numbers stored as a factor:

> largeset$ID
> 
> p000345
> 
> p000356
> 
> p000569
> 
> etc

> in order to use them to merge with another dataframe with numerical
> values (000345, 000356) I want to convert them to numerical.

> >as.numeric(as.character(largeset$ID)) gives NA's

> >as.numeric(strsplit(as.character(largeset[,1]), "p")) also fails:

> Error in as.double.default(strsplit(as.character(largeset[, 1]),
> "p")) : 
>         unimplemented type 'character' in 'asReal'

> Any suggestions would be very appreciated
> 
> Marco


There are two approaches, depending upon whether you need actual numeric
values (which will not retain the leading zeroes) or simply want to strip
the 'p' and keep the leading zeroes in a character vector. Both presume
that the leading 'p' is the only non-numeric character in each ID.

# Presuming that ID is a factor
> ID
[1] p000345 p000356 p000569
Levels: p000345 p000356 p000569


# Retain leading zeroes as a character vector
> sub("p", "", ID)
[1] "000345" "000356" "000569"


# Convert to a numeric vector
> as.numeric(sub("p", "", ID))
[1] 345 356 569
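
For completeness, here is a rough sketch of why the two attempts in the
question fail, assuming ID looks like the factor above (the vector below is
a made-up stand-in for largeset$ID). The first fails because "p000345" is
not a valid number, the second because strsplit() returns a list rather
than a vector. The anchored pattern "^p" at the end is a cautious variant
that only ever removes a 'p' at the start of the string.

```r
# Assumed stand-in for largeset$ID
ID <- factor(c("p000345", "p000356", "p000569"))

# "p000345" is not a valid number, so coercion yields NA (with a warning)
attempt1 <- suppressWarnings(as.numeric(as.character(ID)))

# strsplit() returns a list with one element per input string;
# each element here is c("", "000345"), so pick out the second piece
parts <- strsplit(as.character(ID), "p")
from_split <- as.numeric(sapply(parts, `[`, 2))

# Anchoring the pattern removes only a leading 'p'
from_sub <- as.numeric(sub("^p", "", ID))
```

Here from_split and from_sub should hold the same numeric values, while
attempt1 is all NA.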


See ?sub for more information.
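
Since the goal is a merge, something along these lines may work. This is
only a sketch: the two data frames, the columns x and y, and the helper
column ID_num are invented for illustration, and it assumes the second data
frame stores its IDs as plain numbers.

```r
# Hypothetical stand-ins for the two data frames
largeset <- data.frame(ID = factor(c("p000345", "p000356", "p000569")),
                       x  = c(1.2, 3.4, 5.6))
other    <- data.frame(ID = c(345, 356, 569),
                       y  = c("a", "b", "c"))

# Add a numeric key, leaving the original factor column untouched
largeset$ID_num <- as.numeric(sub("p", "", largeset$ID))

# Match the new numeric key against the numeric ID in the other data frame
merged <- merge(largeset, other, by.x = "ID_num", by.y = "ID")
```

If the other data frame instead holds its IDs as character strings with the
leading zeroes intact, drop the as.numeric() and merge on the character
version.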

HTH,

Marc Schwartz


