[R] How to recode a factor level (within the list)?

Prof Brian Ripley ripley at stats.ox.ac.uk
Sun Dec 2 19:59:36 CET 2007


On Sun, 2 Dec 2007, Lauri Nikkinen wrote:

> #Dear R-users,
> #I have a data.frame like this:
>
> y1 <- rnorm(10) + 6.8
> y2 <- rnorm(10) + (1:10*1.7 + 1)
> y3 <- rnorm(10) + (1:10*6.7 + 3.7)
> y <- c(y1,y2,y3)
> x <- rep(1:3,10)
> f <- gl(2,15, labels=paste("lev", 1:2, sep=""))
> g <- seq(as.Date("2000/1/1"), by="day", length=30)
> DF <- data.frame(x=x,y=y, f=f, g=g)
> DF$g[DF$x == 1] <- NA
> DF$x[3:6] <- NA
> DF$wdays <- weekdays(DF$g)
>
> DF
>
> #Frequencies
> g <- lapply(DF, function(x) as.data.frame(table(format(x))))
> g
>
> #NAs are now part of factor levels. How to recode NAs into e.g. "missing"?

Not so:

> sapply(DF, class)
           x           y           f           g       wdays
   "integer"   "numeric"    "factor"      "Date" "character"

and DF$f does not have any NA levels.
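
One way to check this kind of thing directly is sketched below. The factor
`f` here is a small hypothetical stand-in built the same way as the
poster's, not DF$f itself:

```r
# Hypothetical stand-in for a factor column with no missing values
f <- gl(2, 3, labels = paste("lev", 1:2, sep = ""))

levels(f)          # the levels as stored in the factor
anyNA(levels(f))   # is NA among the levels?
anyNA(f)           # are any of the values missing?
```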

The place you may think you have got NAs is in format(wdays): there
they are neither NA nor "NA" but "NA     ", because format() pads the
strings to a common width.  I am not sure what exactly you want (NA does
not appear in the tables: see the 'exclude' argument of table()), but
perhaps

lapply(DF, function(x) {
   if(is.character(x)) x[is.na(x)] <- "missing"
   as.data.frame(table(format(x)))
})

or

lapply(DF, function(x) {
   z <- table(format(x))
   names(z)[grep("^NA", names(z))] <- "missing"
   as.data.frame(z)
})

or

lapply(DF, function(x) {
   z <- table(x, exclude=character(0))
   names(z)[is.na(names(z))] <- "missing"
   as.data.frame(z)
})

?
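
To see where the padded string comes from, and to recode missing values in
an actual factor (the question in the subject line), here is a minimal
sketch. The vectors `w` and `f` are made-up examples, and addNA() followed
by a levels replacement is only one possible approach, which may or may not
be what is wanted here:

```r
# Made-up character vector with one missing value
w <- c("Monday", NA, "Wednesday")
fw <- format(w)    # format() pads every element to a common width
nchar(fw)          # so all elements have the same number of characters
is.na(fw)          # and the NA has become an ordinary padded string

# Made-up factor with one missing value
f <- factor(c("lev1", NA, "lev2", "lev1"))
f2 <- addNA(f)                               # make NA an explicit level
levels(f2)[is.na(levels(f2))] <- "missing"   # rename that level
table(f2)
```

Recoding at the factor level keeps the change in the data itself, so any
later table() call picks up "missing" without further renaming.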

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


