[R] transforming character categories

Gabor Grothendieck ggrothendieck at gmail.com
Sun Jun 28 13:20:17 CEST 2009


If you only want to convert them to unique numbers then

as.numeric(factor(x))

will do that.
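
As a minimal sketch (the category strings and the values in the lookup
vector below are made up for illustration, they are not from your data):
factor() sorts its levels alphabetically by default, so the codes follow
that order unless you pass levels yourself.  If you want a particular
number for each category, a named lookup vector is one possible way to
do it.

```r
# Hypothetical answer categories, including one with no usable digits
x <- c("from 1000$ to 2000$", "from 2000$ to 3000$",
       "more than one Mio. $", "from 1000$ to 2000$")

# Default: levels are sorted, so the codes follow the sorted order
as.numeric(factor(x))

# Explicit levels fix which category gets which code
lev <- c("from 1000$ to 2000$", "from 2000$ to 3000$",
         "more than one Mio. $")
codes <- as.numeric(factor(x, levels = lev))

# A named lookup vector maps each category to a value of your choosing
# (these values are illustrative assumptions)
lookup <- c("from 1000$ to 2000$"  = 1000,
            "from 2000$ to 3000$"  = 2000,
            "more than one Mio. $" = 1e6)
values <- unname(lookup[x])
```

Here codes holds the position of each string in lev, and values holds
whatever number you assigned to that category in lookup.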

On Sun, Jun 28, 2009 at 7:00 AM, Gregor Povh<gregorpovh at yahoo.de> wrote:
> Thanks Gabor, but in my case not every value is actually encoded within
> the character string.  For example, I have an answer category which is
> "more than one Mio. $".  (not in the column "income"...).
>
> I have the feeling that there must be another, straightforward way
> or function for transforming levels / categories, but I just cannot
> make it work.
>
> - greg
>
>> Try this.  It matches the first numeric string on
>> each line applying as.numeric to it and then using
>> c to simplify the resulting list to a numeric vector.
>>
>>
>> > x <- c("from 1000$ to 2000$", "from 2000$  to 3000$", "more than 3000$",
>> + "from 1000$ to 2000$", "from 1000$ to 2000$")
>>
>> > library(gsubfn)
>> > strapply(x, "([0-9]+).*", as.numeric, simplify = c)
>> [1] 1000 2000 3000 1000 1000
>>
>> See the gsubfn home page for more:
>> http://gsubfn.googlecode.com
>>
>> On Sun, Jun 28, 2009 at 4:25 AM, Gregor Povh<gregorpovh at yahoo.de> wrote:
>>
>>> Dear R users,
>>>
>>> apologies for this quite simple question.  I've tried several approaches
>>> but could not generate the desired result.
>>>
>>> I have a large data frame, which has several categories encoded as
>>> character strings, for example:
>>>
>>> Name, income, gender, ...
>>> ...  "from 1000$ to 2000$"  ...
>>> ...  "from 2000$  to 3000$" ...
>>> ...  "more than 3000$"        ...
>>> ...  "from 1000$ to 2000$"  ...
>>> ...  "from 1000$ to 2000$"  ...
>>>
>>>
>>> How can I transform this column into numeric values for the categories, for
>>> example into something like this:
>>> ... 1000 ...
>>> ... 2000 ...
>>> ... 3000 ...
>>> ... 1000 ...
>>> ... 1000 ...
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.