[R] Recoding categorical gender variable into numeric factors

David L Carlson dcarlson at tamu.edu
Wed Sep 5 22:20:15 CEST 2012


I can't replicate your problem. I created a data set with "Male" and
"Female" since that is what you describe, but your commands test for "M" and
"F", which will not match those values. When I use "Male" and "Female" in
the comparisons, the recoding works just as expected. But you don't even
need to do this. You probably already have a factor, since data.frame()
converts character columns to factors when stringsAsFactors = TRUE (the
default in R versions before 4.0.0):

> data <- data.frame(sex=c(rep("Male", 5), rep("Female", 5)))
> data
      sex
1    Male
2    Male
3    Male
4    Male
5    Male
6  Female
7  Female
8  Female
9  Female
10 Female
> str(data)
'data.frame':   10 obs. of  1 variable:
 $ sex: Factor w/ 2 levels "Female","Male": 2 2 2 2 2 1 1 1 1 1

So data$sex is a factor with two levels, Female=1 and Male=2. If the result
of str(data) looks like this instead, you have a character vector (chr):

> str(data)
'data.frame':   10 obs. of  1 variable:
 $ sex: chr  "Male" "Male" "Male" "Male" ...

If you want to convert a character vector to a factor, just use the command:

data$sex <- factor(data$sex)

By default, factor() sorts the distinct character strings alphabetically to
form the levels, so "Female" becomes 1 and "Male" becomes 2.
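
The points above can be sketched in a few lines. This is only an
illustration under stated assumptions: the data frame is the made-up one
from this message, the column name scode is borrowed from the original
question, and coding Male as 1 and Female as 0 is an assumption, since the
question does not say which value should get which number.

```r
# Hypothetical data mirroring the example above. stringsAsFactors = FALSE
# keeps sex as a character column regardless of the R version.
data <- data.frame(sex = c(rep("Male", 5), rep("Female", 5)),
                   stringsAsFactors = FALSE)

# Comparing against "M" matches nothing when the values are "Male"/"Female".
sum(data$sex == "M")       # 0 rows match
sum(data$sex == "Male")    # 5 rows match

# Default behaviour: levels are sorted, so Female = 1 and Male = 2.
f_default <- factor(data$sex)
levels(f_default)
as.integer(f_default)

# Giving the levels explicitly controls the codes: Male = 1, Female = 2.
f_explicit <- factor(data$sex, levels = c("Male", "Female"))
as.integer(f_explicit)

# A 0/1 indicator (assumed here: Male = 1, Female = 0) built directly from
# the comparison, without going through a factor.
data$scode <- as.integer(data$sex == "Male")
data$scode
```

If a factor is still wanted afterwards, factor(data$scode) works on the 0/1
column just as in the original question.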

----------------------------------------------
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77843-4352

> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-
> project.org] On Behalf Of Conradsb
> Sent: Wednesday, September 05, 2012 2:14 PM
> To: r-help at r-project.org
> Subject: [R] Recoding categorical gender variable into numeric factors
> 
> I currently have a data set in which gender is entered as "Male" and
> "Female", and I'm trying to convert this into "1" and "0".
> 
> I found a website which recommended using two commands:
> 
> data$scode[data$sex=="M"] <- "1"
> data$scode[data$sex=="F"] <- "2"
> 
> to convert to numbers, and:
> 
> data$scode <- factor(data$scode)
> 
> to convert this variable to a factor.
> 
> 
> 
> My issue is that, after I use the first command, *only* the female values
> get converted to a number. I am left with a column filled with 2's and
> blank spaces. Instead of typing both lines of the first command, I copied
> and pasted the first line and changed the letter representing gender. I
> also made sure that both letters were exactly as they appear in the
> dataset.
> 
> My questions are: is there any visible issue with my syntax, and are there
> any other methods to accomplish this?
> 
> I'm also very new to R, so complex syntax is beyond me.
> 
> Conrad Baldner



