[R] convert data.frame to matrix -> NA factor problem

Joshua Wiley jwiley.psych at gmail.com
Wed Jul 14 18:00:22 CEST 2010


More important than the fact that they are factors is that NA is actually a
level of X3.  If X3 were a factor column in which NA was not a level, then
the conversion to numeric would leave it as NA.  Because it is a level (in
fact the 4th level), data.matrix() replaces it with its integer code, 4.
?factor gives the recommended way of converting a factor to its numeric
values; the example below uses it to convert the column first, so that the
conversion to a matrix then works properly.

samp.frame <- data.frame(a = 1:10,
  b = factor(c(rep(1:3, each = 3), NA), exclude = NULL))
str(samp.frame)
# Here NA is the 4th level of b, so data.matrix() turns it into 4
samp.matrix <- data.matrix(samp.frame)
samp.matrix
# Convert column b in samp.frame to its numeric values first
samp.frame$b <- as.numeric(levels(samp.frame$b))[samp.frame$b]
str(samp.frame)
# Now convert to a matrix; the NA is preserved
samp.matrix <- data.matrix(samp.frame)
samp.matrix
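
For comparison, here is a small sketch of the two cases side by side: a
factor where NA is a level and one where it is not.  The names vals,
f.level and f.nolevel are made up for illustration and are not from the
original data.

```r
# Hypothetical example values, not taken from the poster's data set
vals <- c(0, 1, 2, NA)

# NA kept as a level (exclude = NULL): it gets an integer code of its own
f.level <- factor(vals, exclude = NULL)
# NA not a level (the default): it stays a missing value
f.nolevel <- factor(vals)

nlevels(f.level)
nlevels(f.nolevel)

# data.matrix() uses the internal integer codes of a factor
m <- data.matrix(data.frame(with.level = f.level,
                            without.level = f.nolevel))
m

# The conversion from ?factor recovers the numeric values in both cases
as.numeric(levels(f.level))[f.level]
as.numeric(levels(f.nolevel))[f.nolevel]
```

In the matrix m, only the column built from f.level should show a 4 in
the last row; the other column keeps the NA.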

I've never used the xlsx package, but an alternative to this process
would be to save the file from Excel as a text file and then read it
into R.  That way you could control whether things were read in as
factors or not.
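
A rough sketch of that text-file route, assuming the sheet was saved as
comma-separated text.  The file and its contents here are invented (a
temporary file stands in for the real export), so adjust the name and
the separator to whatever Excel actually wrote.

```r
# Hypothetical file standing in for a sheet saved from Excel as CSV
csv.file <- tempfile(fileext = ".csv")
writeLines(c("X1,X2,X3", "1,0,1", "1,1,0", "0,0,NA", "1,2,2"), csv.file)

# stringsAsFactors = FALSE keeps any text columns as character;
# na.strings = "NA" (the default) reads the text NA as a missing value
dat <- read.csv(csv.file, stringsAsFactors = FALSE, na.strings = "NA")
str(dat)

# No factor columns are involved, so the NA survives the conversion
dat.matrix <- data.matrix(dat)
dat.matrix
```

If the missing values are written some other way in the file (an empty
cell, for instance), na.strings can be given a vector of those strings.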

Cheers,

Josh

On Wed, Jul 14, 2010 at 7:58 AM, syrvn <mentor_ at gmx.net> wrote:
>
> Hi,
>
> I used str() on my data set:
>
> $ X1            : num  1 1 0 1 1 1 1 1 1 1 ...
> $ X2            : num  0 1 0 2 1 2 0 2 2 0 ...
> $ X3            : Factor w/ 4 levels "0","1","2","NA": 2 1 3 1 1 1 1 1 1 3
> ...
> ....
>
>
> The difference to your str() output is that in your case NA columns are
> "num" columns and in my case they are Factors. That's prob. why it
> replaces the NAs with 4 after applying data.matrix.
>
> I use the package xlsx to read the data in as an excel file.
>
> Cheers



-- 
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/


