[R] reshape2's dcast() Adds NAs to Data Frame

arun smartpink111 at yahoo.com
Wed Aug 8 05:51:39 CEST 2012


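Hi,

It is hard to tell without the data.  But, a wild guess is that not every combination of your id variables (the left-hand side of the formula) occurs with every level of the variable that is spread into columns. dcast() makes one cell for each row/column combination, so any combination that is absent from the long data ends up as NA.
For example, try these two example datasets.
library(reshape2)
##### This will end up with NAs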
md2  <-  structure(list(group = structure(c(1L, 1L, 1L, 2L, 2L, 3L, 3L, 
4L, 4L, 4L, 5L, 5L, 6L, 6L, 7L, 7L, 7L, 8L, 8L), .Label = c("X1", 
"X2", "X3", "X4", "X5", "X6", "X7", "X8"), class = "factor"), 
    tps = structure(c(7L, 12L, 14L, 4L, 8L, 9L, 16L, 6L, 7L, 
    11L, 6L, 15L, 10L, 13L, 3L, 4L, 5L, 1L, 2L), .Label = c("A", 
    "C", "D", "E", "G", "I", "L", "M", "N", "P", "Q", "R", "S", 
    "T", "V", "Y"), class = "factor"), sum = c(0.914913196595112, 
    0.0367565080432513, 0.0483302953616366, 0.982727803634948, 
    0.0172721963650521, 0.0483302953616366, 0.951669704638363, 
    0.89764100023006, 0.0850868034048879, 0.0172721963650521, 
    0.951669704638363, 0.0483302953616366, 0.963243491956749, 
    0.0367565080432513, 0.89764100023006, 0.0540287044083034, 
    0.0483302953616366, 0.982727803634948, 0.0172721963650521
    )), .Names = c("group", "tps", "sum"), row.names = c(NA, 
-19L), class = "data.frame")
dcast(md2, group ~ tps, value.var = "sum")
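
To see in advance which cells will come out as NA, one option is to cross-tabulate the id column against the column being spread. This is only a minimal sketch on a small made-up data frame; `toy`, its values, and the name `empty_cells` are hypothetical and not taken from the poster's data.

```r
# Hypothetical toy data: each group has only some of the tps values.
toy <- data.frame(
  group = c("X1", "X1", "X2", "X2"),
  tps   = c("L", "R", "L", "P"),
  sum   = c(0.9, 0.1, 0.8, 0.2),
  stringsAsFactors = FALSE
)

# Cross-tabulate the two columns; a count of 0 marks a combination
# with no row in the long data, which dcast() would show as NA.
counts <- with(toy, table(group, tps))
print(counts)

# List the absent combinations explicitly.
empty_cells <- as.data.frame(counts, stringsAsFactors = FALSE)
empty_cells <- empty_cells[empty_cells$Freq == 0, c("group", "tps")]
print(empty_cells)
```

The same check should work on md2 above with `with(md2, table(group, tps))`.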


##### This will have no NAs, because every group has every tps value.

md4 <- data.frame(group = c(rep("X1", 3), rep("X2", 3)),
                  tps = c("L", "R", "P", "L", "R", "P"),
                  sum = rnorm(6, 15))

dd  <-  dcast(md4, group~tps, value.var="sum")
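
If the empty cells should hold some other value than NA, dcast() has a `fill` argument for these structural missings. The sketch below assumes reshape2 is installed and again uses a made-up data frame (`toy`, `wide_na`, and `wide_0` are hypothetical names).

```r
library(reshape2)

# Hypothetical toy data: X1 has no "P" row and X2 has no "R" row.
toy <- data.frame(
  group = c("X1", "X1", "X2", "X2"),
  tps   = c("L", "R", "L", "P"),
  sum   = c(0.9, 0.1, 0.8, 0.2)
)

# Default behaviour: combinations absent from the long data become NA.
wide_na <- dcast(toy, group ~ tps, value.var = "sum")
print(wide_na)

# fill = 0 puts 0 in those cells instead of NA.
wide_0 <- dcast(toy, group ~ tps, value.var = "sum", fill = 0)
print(wide_0)
```

Whether 0 is a sensible stand-in depends on the data. For measurements that were simply never taken, NA may be the more honest value, and the NAs can be handled later (for example with na.rm = TRUE).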
  A.K.



----- Original Message -----
From: Rich Shepard <rshepard at appl-ecosys.com>
To: r-help at r-project.org
Cc: 
Sent: Tuesday, August 7, 2012 5:45 PM
Subject: [R] reshape2's dcast() Adds NAs to Data Frame

  I need to understand how and why dcast() adds NAs to a data frame that
contained no missing values.

  The database table of chemical concentrations has all missing values
removed because they cannot contribute to data analyses. The R data frame
holding these data has no NA values, and neither does the data frame
resulting from applying the reshape2 melt() function to it. However,
the data frame produced by the dcast() function does contain NAs for all
chemicals. I assume this is because of the syntax I used:

chem.cast <- dcast(chem.melt, site + sampdate + era + ceneq1 + floor +
ceiling ~ param)

  How should I reshape the data frame from long to wide without adding these
spurious NAs?

Rich

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



