[R] Converting chr to num

Spencer Graves
Mon Aug 20 07:39:20 CEST 2018


       Have you considered "Ecfun::asNumericChar" (and 
"Ecfun::asNumericDF")?


DF <- data.frame(variable = c("12.6% ", "30.9%", "61.4%", "1"))
Ecfun::asNumericChar(DF$variable)
#[1] 0.126 0.309 0.614 1.000


       If you read the documentation, including the examples, you will 
see that many of these issues, and others, are handled automatically in 
the way that I thought was most sensible.  If you disagree, we can 
discuss other examples and perhaps modify the code for those functions.
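
       For readers without Ecfun installed, the percent handling in the 
example above can be approximated in base R.  This is only a minimal 
sketch, not the Ecfun implementation: the helper name "pct_to_num" is 
hypothetical, and it assumes each value is either a plain number or a 
number followed by a single "%", with optional surrounding whitespace.

```r
# Hypothetical helper (not part of Ecfun): convert strings such as
# "12.6% " to proportions, leaving plain numbers unchanged.
pct_to_num <- function(x) {
  x <- trimws(as.character(x))          # as.character() also handles factors
  is_pct <- grepl("%$", x)              # which entries end in a percent sign
  num <- as.numeric(sub("%$", "", x))   # strip the sign, then convert
  ifelse(is_pct, num / 100, num)        # rescale only the percent entries
}

DF <- data.frame(variable = c("12.6% ", "30.9%", "61.4%", "1"))
pct_to_num(DF$variable)
```

       Under those assumptions the call should give the same four values 
as the asNumericChar example; inputs outside them (thousands separators, 
currency symbols, and so on) are not covered by this sketch.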


       Spencer Graves


On 2018-08-20 00:26, Rui Barradas wrote:
> Hello,
>
> Inline.
>
> On 20/08/2018 01:08, Daniel Nordlund wrote:
>> See comment inline below:
>>
>> On 8/18/2018 10:06 PM, Rui Barradas wrote:
>>> Hello,
>>>
>>> It also works with class "factor":
>>>
>>> df <- data.frame(variable = c("12.6%", "30.9%", "61.4%"))
>>> class(df$variable)
>>> #[1] "factor"
>>>
>>> as.numeric(gsub(pattern = "%", "", df$variable))
>>> #[1] 12.6 30.9 61.4
>>>
>>>
>>> This is because sub() and gsub() return a character vector and the 
>>> instruction becomes an equivalent of what the help page ?factor 
>>> documents in section Warning:
>>>
>>> To transform a factor f to approximately its original numeric 
>>> values, as.numeric(levels(f))[f] is recommended and slightly more 
>>> efficient than as.numeric(as.character(f)).
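
A small illustration of the ?factor warning quoted above, as a sketch 
with made-up values.  The factor is built explicitly because, since 
R 4.0.0, data.frame() no longer converts strings to factors by default.

```r
# Build the factor explicitly so the result does not depend on the
# stringsAsFactors default of the R version in use.
f <- factor(c("30.9", "12.6", "61.4"))

as.numeric(f)                 # internal integer codes, not the values
as.numeric(levels(f))[f]      # original values, as recommended in ?factor
as.numeric(as.character(f))   # same values, converting every element
```

Only the first call returns the level codes; the other two recover the 
numbers that the labels represent.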
>>>
>>>
>>> Also, I would still prefer
>>>
>>> as.numeric(sub(pattern = "%$","",df$variable))
>>> #[1] 12.6 30.9 61.4
>>>
>>> The pattern is more strict and there is no need to search&replace 
>>> multiple occurrences of '%'.
>>
>> The pattern is more strict, and that could cause the conversion to 
>> fail if the process that created the strings resulted in trailing 
>> spaces. 
>
> That's true, and I had thought of that but it wasn't in the OP's 
> problem description.
> The '$' could still be used with something like "%\\s*$":
>
> as.numeric(sub('%\\s*$', '', df$variable))
> #[1] 12.6 30.9 61.4
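
To make the effect of the anchored patterns concrete, here is a short 
sketch comparing them on input with a trailing space.  The trimws() 
variant is an additional option not proposed in the thread.

```r
x <- c("12.6% ", "30.9%", "61.4%")   # first element has a trailing space

sub("%$", "", x)            # leaves "12.6% " as is: '%' is not the last character
sub("%\\s*$", "", x)        # removes '%' together with any trailing whitespace
sub("%$", "", trimws(x))    # alternative: trim first, then use the strict pattern

as.numeric(sub("%\\s*$", "", x))
```

With the strict pattern the first element keeps its "%" and would turn 
into NA under as.numeric(); the other two patterns convert all three 
values.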
>
>
> Rui Barradas
>
>
>> Without the '$' the conversion succeeds.
>>
>> df <- data.frame(variable = c("12.6% ", "30.9%", "61.4%"))
>> as.numeric(sub('%$', '', df$variable))
>> [1]   NA 30.9 61.4
>> Warning message:
>> NAs introduced by coercion
>>
>>
>> <<<snip>>>
>>
>>
>> Dan
>>
>