[R] recoding responses in a numeric variable

Jeff Newmiller jdnewmil at dcn.davis.ca.us
Sun Jan 8 06:04:38 CET 2017


Please read the Posting Guide mentioned at the bottom of this and every 
message. In particular, send your email in plain text format so we get to 
see what you saw (the mailing list strips out HTML formatting in most 
cases). Also please work to make your examples reproducible... e.g. give 
all steps necessary to reproduce your output or error... otherwise 
we get to guess what you were doing wrong and if we guess wrong then our 
help is wasted. The code below should be reproducible for your benefit and 
for anyone else who reads this.

#### code follows
# make believe data as though it was in a file
inputdata <-
"vn35
no entry
no entry
don't know
don't know
don't know
a lot of fear
a lot of fear
a lot of fear
a lot of fear
big fear
big fear
big fear
big fear
big fear
medium fear
medium fear
medium fear
medium fear
medium fear
medium fear
little fear
little fear
little fear
little fear
little fear
little fear
little fear
no fear at all
no fear at all
no fear at all
no fear at all
no fear at all
no fear at all
no fear at all
no fear at all
"

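# I am going to guess that you did something kind of like
# (stringsAsFactors = TRUE was the default before R 4.0.0; it is spelled
# out here so the example gives a factor in newer versions too)

gles_reduced <- read.csv( text=inputdata, stringsAsFactors = TRUE )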

# before you did the steps you gave us:
fear <- gles_reduced$vn35
levels(fear)  # this doesn't change any of your data
table(fear, as.numeric(fear), exclude=NULL) # neither does this

# This now has a new level <NA> represented by a numeric value NA, and
# it is really not very useful to have both the value and the level be
# NA.

# In my opinion, the problem started when you let R automatically
# create a factor based on default settings. Let's try that again
# the right way:

# DON'T let R automatically create a factor column
gles_reduced <- read.csv( text=inputdata, stringsAsFactors = FALSE )

fear <- gles_reduced$vn35
# now fear is a vector of character strings
# replace semantic unknowns with NA
fear[ fear %in% c( "no entry", "don't know" ) ] <- NA
# define the levels in the order you want them from small to large
fearlvls <- c( "no fear at all"
              , "little fear"
              , "medium fear"
              , "big fear"
              , "a lot of fear"
              )
# explicitly create the factor with comparability
fear <- ordered( fear, levels=fearlvls )
table(fear, as.numeric( fear ) )
sum( is.na( fear ) )
which( "big fear" < fear ) # indexes of the ones that have
                            # a lot of fear
fear[ which( "big fear" < fear ) ] # see them
#### end of code
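
As for why the assignment in your original message "is not working": here is a minimal sketch using a small made-up factor (f, g and h are hypothetical names, not your data). Assigning a value that is not one of the existing levels does not recode anything; R warns about an invalid factor level and stores NA. Recoding has to go through the levels, or through a character vector as in the code above.

```r
# Small made-up factor standing in for gles_reduced$vn35
f <- factor( c( "big fear", "little fear", "big fear" ) )

# Assigning a value that is not an existing level does not recode;
# R warns "invalid factor level, NA generated" and stores NA
g <- f
g[ g == "big fear" ] <- 1
g            # the two "big fear" entries are now NA
levels( g )  # the levels themselves are unchanged

# Renaming a level changes every element that carries that level
h <- f
levels( h )[ levels( h ) == "big fear" ] <- "1"
h
```

Note that even after renaming the level, h is still a factor whose labels happen to look like numbers; it is not a numeric vector.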

Note that the codes go from 1 to 5, not 0 to 4, because the integer codes 
underlying a factor always start at 1.  Fortunately all the stats functions 
in R know this, so you are better off not fighting the convention. If you 
absolutely must have 0 to 4, then you need to do the arithmetic in an 
integer or numeric vector:

fearnums <- as.integer( fear ) - 1L
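
A small self-contained sketch of that conversion, using made-up responses (resp, lvls, f and codes0 are illustrative names, not part of your data). It shows that NA stays NA after the subtraction and that the 0-based codes can be mapped back to labels by adding 1 again:

```r
# Made-up responses, including one unknown
resp <- c( "no fear at all", "big fear", NA, "a lot of fear", "little fear" )
lvls <- c( "no fear at all", "little fear", "medium fear"
         , "big fear", "a lot of fear" )
f <- ordered( resp, levels = lvls )

as.integer( f )                  # factor codes: 1 4 NA 5 2
codes0 <- as.integer( f ) - 1L   # shifted codes: 0 3 NA 4 1
codes0

# going back from the 0-based codes to the labels
lvls[ codes0 + 1L ]
```

Keep in mind that codes0 is a plain integer vector, so the labels and the ordering information are no longer attached to it.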

On Sat, 7 Jan 2017, Licia Biotti wrote:

> Hello,
>
> I am working with a dataset in R studio, and I have created a numeric
> variable which I have called fear by using a factor variable (called vn35).
> Here is the piece of code:
> fear<-gles_reduced$vn35
> levels(fear)
> table(fear, as.numeric(fear), exclude=NULL)
>
> Then I have coded the levels "don't know" and "not specified" as NA
> fear[fear=="not specified"]<-NA
> fear[fear=="don't know"]<-NA
>
> This is how the table looks like:
>
> fear                3    4    5    6    7 <NA>
>   no entry          0    0    0    0    0    0
>   don't know        0    0    0    0    0    0
>   a lot of fear   412    0    0    0    0    0
>   big fear          0  883    0    0    0    0
>   medium fear       0    0 1350    0    0    0
>   little fear       0    0    0  920    0    0
>   no fear at all    0    0    0    0  305    0
>   <NA>              0    0    0    0    0   41
>
> I would like to code the remaining answers (a lot of fear, big fear, medium
> fear, little fear and no fear at all) with values from 0 to 4 (so that
> greater values indicate great concern)
> I tried this piece of code:
> fear[fear=="big fear"]<-1
> But it is not working,
> could you please help me?
> Thanks,
>
> 	[[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

---------------------------------------------------------------------------
Jeff Newmiller                        The     .....       .....  Go Live...
DCN:<jdnewmil at dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
                                       Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k


