[R] How to convert a factor column into a numeric one?

Dennis Murphy djmuser at gmail.com
Sun Jun 5 06:49:59 CEST 2011


Hi:

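Convert each factor to character first and then to numeric, so that you
get the level labels rather than the internal level codes. Try this: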

> dd <- data.frame(a = factor(rep(1:5, each = 4)),
+                  b = factor(rep(rep(1:2, each = 2), 5)),
+                  y = rnorm(20))
> str(dd)
'data.frame':   20 obs. of  3 variables:
 $ a: Factor w/ 5 levels "1","2","3","4",..: 1 1 1 1 2 2 2 2 3 3 ...
 $ b: Factor w/ 2 levels "1","2": 1 1 2 2 1 1 2 2 1 1 ...
 $ y: num  0.6396 1.467 1.8403 -0.0915 0.2711 ...
> de <- within(dd, {
+          a <- as.numeric(as.character(a))
+          b <- as.numeric(as.character(b))
+        } )
> str(de)
'data.frame':   20 obs. of  3 variables:
 $ a: num  1 1 1 1 2 2 2 2 3 3 ...
 $ b: num  1 1 2 2 1 1 2 2 1 1 ...
 $ y: num  0.6396 1.467 1.8403 -0.0915 0.2711 ...
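
The same idea applied to the Time and Temp columns from the question can be
sketched as follows. This is only a sketch under stated assumptions: the
small data frame is a made-up stand-in that reuses the factor levels shown
in the question, and the helper name factor_to_numeric is hypothetical. It
uses as.numeric(levels(f))[f], the variant that ?factor recommends as
slightly more efficient than as.numeric(as.character(f)).

```r
# Stand-in for the poster's data frame (an assumption); only the factor
# levels are taken from the question.
df <- data.frame(
  Time = factor(c(0, 2, 7, 14, 0, 7)),
  Temp = factor(c(-20, -20, -20, -20, 4, 4), levels = c(-20, 4, 25, 45))
)

# as.numeric() on a factor returns the level codes, not the labels.
codes <- as.numeric(df$Temp)

# Hypothetical helper: convert the level labels once, then index them
# by the factor's codes.
factor_to_numeric <- function(f) as.numeric(levels(f))[f]

# Convert both columns in one step; the other columns are left untouched.
cols <- c("Time", "Temp")
df[cols] <- lapply(df[cols], factor_to_numeric)
str(df)
```

This only makes sense when every level label parses as a number; a label
such as "H" would become NA with a warning.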


HTH,
Dennis

On Sat, Jun 4, 2011 at 9:31 PM, Robert A. LaBudde <ral at lcfltd.com> wrote:
> I have a data frame:
>
>> head(df)
>  Time Temp Conc Repl    Log10
> 1    0  -20    H    1 6.406547
> 2    2  -20    H    1 5.738683
> 3    7  -20    H    1 5.796394
> 4   14  -20    H    1 4.413691
> 5    0    4    H    1 6.406547
> 7    7    4    H    1 5.705433
>> str(df)
> 'data.frame':   177 obs. of  5 variables:
>  $ Time : Factor w/ 4 levels "0","2","7","14": 1 2 3 4 1 3 4 1 3 4 ...
>  $ Temp : Factor w/ 4 levels "-20","4","25",..: 1 1 1 1 2 2 2 3 3 3 ...
>  $ Conc : Factor w/ 3 levels "H","L","M": 1 1 1 1 1 1 1 1 1 1 ...
>  $ Repl : Factor w/ 5 levels "1","2","3","4",..: 1 1 1 1 1 1 1 1 1 1 ...
>  $ Log10: num  6.41 5.74 5.8 4.41 6.41 ...
>> levels(df$Temp)
> [1] "-20" "4"   "25"  "45"
>> levels(df$Time)
> [1] "0"  "2"  "7"  "14"
>
> As you can see, "Time" and "Temp" are currently factors, not numeric.
>
> I would like to change these columns into numerical ones.
>
> df$Time<- as.numeric(df$Time)
>
> doesn't work, as it changes to the factor level indices (1,2,3,4) instead of
> the values (0,2,7,14).
>
> There must be a direct way of doing this in R.
>
> I tried recode() in 'car':
>
>> df$Temp<- recode(df$Temp, '1=-20;2=25;3=4;4=45',as.factor.result=FALSE)
>> head(df)
>  Time Temp Conc Repl     Freq
> 1    0  -20    H    1 6.406547
> 2    2  -20    H    1 5.738683
> 3    7  -20    H    1 5.796394
> 4   14  -20    H    1 4.413691
> 5    0   45    H    1 6.406547
> 7    7   45    H    1 5.705433
>
> but note that the values for 'Temp' in rows 5 and 7 are 45 and not 4, as
> expected, although the result is numeric. The same happens if I use the
> order given by levels(df$Temp) instead of the sort order in the recode() 2nd
> argument.
>
> Any hints?
> ================================================================
> Robert A. LaBudde, PhD, PAS, Dpl. ACAFS  e-mail: ral at lcfltd.com
> Least Cost Formulations, Ltd.            URL: http://lcfltd.com/
> 824 Timberlake Drive                     Tel: 757-467-0954
> Virginia Beach, VA 23464-3239            Fax: 757-467-2947
>
> "Vere scire est per causas scire"
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


