[R] substituting level for NA in factor column

Bert Gunter gunter.berton at gene.com
Tue Jan 18 23:54:29 CET 2011


Well, (have you read "An Intro to R," which I think might have enabled
you to figure this out for yourself?)....

Convert the factor to character, use is.na() to substitute, convert
back to factor. e.g.

> z <- factor(c(1,2,3,NA))
> z <- as.character(z)
> z[is.na(z)] <- "U"
> factor(z)
[1] 1 2 3 U
Levels: 1 2 3 U
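
For a column in a data frame, as in the original question, the same
three steps might look roughly like the sketch below. The data frame
`crickets` and its column `size` are made-up names for illustration,
not anything from the poster's actual data.

```r
# Hypothetical data: size classes 1-4 with two missing values
crickets <- data.frame(size = factor(c(1, 2, NA, 4, 3, NA)))

# Step 1: factor -> character
size_chr <- as.character(crickets$size)

# Step 2: use is.na() to substitute "U" for the missing values
size_chr[is.na(size_chr)] <- "U"

# Step 3: character -> factor (levels are rebuilt from the sorted values)
crickets$size <- factor(size_chr)

levels(crickets$size)
table(crickets$size)
```

Working on a copy of the column (size_chr) keeps the original factor
untouched until the last assignment.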

HOWEVER, this is almost certainly a BAD IDEA, as you have now
essentially lost the ordering of the levels, which is important
information for modeling: "U" becomes just another level, and an
ordered factor comes back unordered after the round trip through
character. In general, R handles NAs (perhaps not always gracefully),
and you should leave them as such and let R do its job. A lot of
effort by wise folks has been expended to make this possible.
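
To make the point about ordering concrete, here is a small sketch. It
assumes the size classes are stored as an ordered factor, which the
original question does not state. It also shows one possible
alternative that avoids the character round trip by appending "U" to
the existing levels; note that this places "U" above the largest size
class, which has no real meaning, so it is an illustration rather than
a recommendation.

```r
# Assumed setup: ordered size classes with one missing value
z <- factor(c(1, 2, 3, NA), levels = c(1, 2, 3), ordered = TRUE)

# Round trip through character: the result is a plain, unordered factor
z_chr <- as.character(z)
z_chr[is.na(z_chr)] <- "U"
is.ordered(factor(z_chr))

# Alternative: append "U" to the existing levels, then fill in the NAs.
# The factor stays ordered, but "U" now sorts above "3" (arbitrary).
z2 <- z
levels(z2) <- c(levels(z2), "U")
z2[is.na(z2)] <- "U"
is.ordered(z2)
levels(z2)
```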

Cheers,
Bert

On Tue, Jan 18, 2011 at 2:25 PM,  <Kurt_Helf at nps.gov> wrote:
> Greetings
>     I have a bunch of NAs in a column of categorical variables designating
> the size classes (e.g., smallest to largest: 1,2,3,4) of cave crickets.
> I'd like to substitute "U" (for unknown) for the NAs.  Can anyone give me
> an idea how to do this?  Thanks in advance.
> Cheers
> Kurt



-- 
Bert Gunter
Genentech Nonclinical Biostatistics


