[R] cannot turn some columns in a data frame into factors

Sam Steingold sds at podval.org
Thu May 11 21:15:33 CEST 2006


Thanks to everyone who took the time to respond, both here on the list and
via private e-mail (I do read the list on gmane, so there is no reason
to CC me).

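it turned out that R passes _structured_ arguments by value, and likewise
an assignment to df inside the function given to sapply() only changes a
copy local to that function, so the df outside stays untouched.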
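
A minimal sketch of that behaviour, on a toy data frame (the column
names and the helper make_factor are made up for illustration, they are
not from my real data):

```r
# toy data frame; the column names are hypothetical
df <- data.frame(Month = c(1, 2, 1), Sales = c(10, 20, 30))

make_factor <- function(name) {
  # this assignment creates a copy of df that is local to the call
  df[[name]] <- factor(df[[name]])
  is.factor(df[[name]])
}

inside  <- make_factor("Month")  # TRUE: the local copy was changed
outside <- is.factor(df$Month)   # FALSE: the outer df is untouched
```

So the conversion has to happen in an assignment made at the level where
df itself lives, not inside the helper function.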

the solution I use now is:

  df[factors] <- lapply(df[factors], factor)

  # check that exactly the requested columns are factors now
  actual <- names(df)[sapply(df, is.factor)]
  if (!setequal(actual, factors))
    stop("bad factors: ", paste(sort(actual), collapse=" "), " != ",
         paste(sort(factors), collapse=" "))

it is based on a private e-mail reply by Phil Spector.
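
For the archives, here is the same idea as a self-contained example that
can be pasted into R. The data frame and the column names are assumptions
made up for the illustration:

```r
# toy data; column names and values are hypothetical
df <- data.frame(Month  = c(1, 2, 1),
                 Region = c("N", "S", "N"),
                 Sales  = c(10, 20, 30),
                 stringsAsFactors = FALSE)
factors <- c("Month", "Region")

# replace the selected columns by their factor versions in one assignment
df[factors] <- lapply(df[factors], factor)

# named logical vector: which columns are factors now
is_factor <- sapply(df, is.factor)
```

The assignment to df[factors] is made directly on df, so the change
sticks; lapply() only computes the new columns and does not try to modify
df itself.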

> * Sam Steingold <fqf at cbqiny.bet> [2006-05-11 12:09:26 -0400]:
>
> I have a data frame df and a list of names of columns that I want to
> turn into factors:
>
>   df.names <- attr(df,"names")
>   sapply(factors, function (name) {
>     pos <- match(name,df.names)
>     if (is.na(pos)) stop(paste(name,": no such column\n"))
>     df[[pos]] <- factor(df[[pos]])
>     cat(name,"(",pos,"):",is.factor(df[[pos]]),"\n")
>   })
>   cat("factors:",sapply(df,is.factor),"\n")
>
> the output is:
>
>
> Month ( 1 ): TRUE 
> factors: FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE 
>
>
> i.e., there is a column named "Month" (the 1st column), and it is indeed
> turned into a factor inside sapply(), but after that it is numerical
> again!
>
> what am I doing wrong?

-- 
Sam Steingold (http://www.podval.org/~sds) on Fedora Core release 5 (Bordeaux)
http://camera.org http://iris.org.il http://dhimmi.com
http://memri.org http://ffii.org http://jihadwatch.org http://pmw.org.il
PI seconds is a nanocentury



