[R] Why can't "apply" be used with "as.factor" on a data.frame ?

Don MacQueen macq at llnl.gov
Sun Mar 7 21:20:12 CET 2010


And just a small followup. To find out what class each column is, you want

>  lapply(a,class)
$x1
[1] "numeric"

$x2
[1] "factor"

$x3
[1] "factor"

With regard to your solution, and why it works, it is my 
understanding that data frames are in some sense actually lists, each 
column corresponding to one element in a list.

Hence, lapply() works column-wise on data frames.
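
As a rough sketch of that point (not part of the original exchange), the code below contrasts walking the columns as list elements with what apply() does. It assumes the example data frame from the quoted question. The seed and the names col_classes, m, mat_classes and a3 are made up for illustration. stringsAsFactors = TRUE is spelled out because R 4.0.0 and later no longer convert strings to factors by default.

```r
set.seed(1)  # arbitrary seed, only so the sketch is reproducible
a <- data.frame(x1 = rnorm(100),
                x2 = sample(c("a", "b"), 100, replace = TRUE),
                x3 = factor(rep(c("a", "b"), each = 50)),
                stringsAsFactors = TRUE)

# A data frame is a list with one element per column.
is.list(a)

# sapply()/lapply() visit each list element (column) separately,
# so every column keeps its own class.
col_classes <- sapply(a, class)

# apply() first converts the data frame with as.matrix(). A matrix can
# hold only one type, so with factor columns present everything
# becomes character before the function is ever called.
m <- as.matrix(a)
mat_classes <- apply(a, 2, class)

# Converting every column to a factor while keeping a data frame:
# assigning into a3[] replaces the columns but preserves the container.
a3 <- a
a3[] <- lapply(a, as.factor)
```

Assigning into a3[] is one common way to skip the separate as.data.frame() step; the lapply() plus as.data.frame() route from the original question should give an equivalent result.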

Also for this reason it's pretty easy to convert back and forth 
between data frames and lists, provided, of course, that the 
elements of the list have compatible lengths (each element becomes 
a column, so they must agree on the number of rows); see this example:

>  data.frame( list(a=1:2, b=3:4) )
   a b
1 1 3
2 2 4

>  data.frame( list(a=1:2, b=3:7) )
Error in data.frame(a = 1:2, b = 3:7, check.names = FALSE, 
stringsAsFactors = TRUE) :
   arguments imply differing number of rows: 2, 5
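
A small sketch of the round trip, under the same assumptions as the example above (the names df, l, df2, ok and bad are hypothetical). It also shows a detail that is easy to miss: data.frame() recycles shorter atomic vectors when the longest length is a whole multiple of theirs, so only lengths that do not divide evenly trigger the error.

```r
df <- data.frame(a = 1:2, b = 3:4)

# Data frame to list: a plain named list, one element per column.
l <- as.list(df)
class(l)

# List back to data frame.
df2 <- as.data.frame(l)

# Lengths 2 and 4: 'a' is recycled twice, giving four rows.
ok <- data.frame(list(a = 1:2, b = 3:6))
nrow(ok)

# Lengths 2 and 5: 5 is not a multiple of 2, so this fails.
# tryCatch() captures the error message instead of stopping the script.
bad <- tryCatch(data.frame(list(a = 1:2, b = 3:7)),
                error = function(e) conditionMessage(e))
```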


No doubt there are subtle details, but don't ask me to provide 
details on what exactly the "some sense" is!

-Don

At 12:07 PM +0200 3/7/10, Tal Galili wrote:
>Hi all,
>
>Let's say I have a data.frame and want to turn each of its columns into a
>factor.
>My instinct would be to use as.factor with apply. But this won't work, and
>results in a data.frame of characters.
>I found another solution for how to achieve this, but I would also like to
>understand - *WHY* does it work this way?
>
>Here is an example script:
>a <- data.frame(x1 = rnorm(100), x2 = sample(c("a","b"), 100, replace = TRUE),
>x3 = factor(c(rep("a",50) , rep("b",50))))
>apply(a, 2, class) # why is column 3 not a factor ?
>a[,3]  # since it IS a factor.
>a2 <- apply(a, 2, as.factor) # won't work - why not ?
>a2[,3]  # Why was this just turned into a character ???
># A solution
>a2 <- lapply(a, as.factor)
>a3 <- as.data.frame(a2)
>str(a3)
>
>
>Thanks,
>Tal
>
>
>


-- 
---------------------------------
Don MacQueen
Lawrence Livermore National Laboratory
Livermore, CA, USA
925-423-1062
macq at llnl.gov


