[R] Omitting NA's using dcast (reshape2 package)

David L Carlson dcarlson at tamu.edu
Mon Jun 22 19:25:05 CEST 2015


You could apply na.omit() to just the columns you are using, so that rows with an NA only in v3 are kept:

>  dcast(na.omit(df[,1:2]), v1 ~ v2, length, margins = TRUE)
Using v2 as value column: use value.var to override.
     v1 X Y Z (all)
1     A 1 2 2     5
2     B 1 2 1     4
3 (all) 2 4 3     9
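
The same idea can be written with named columns, which may be easier to read than positional indexing. The following is only a sketch under a few assumptions: it rebuilds the example data from the original post, uses complete.cases() in place of na.omit() (a swapped-in base R function that here selects the same rows), and passes value.var explicitly so that dcast() does not have to guess the value column. The object names "keep" and "res" are illustrative and not from the original thread.

```r
library(reshape2)

# Example data from the original post. stringsAsFactors = TRUE mirrors the
# data.frame() default of R versions before 4.0.0 (an assumption about the
# poster's setup).
df <- data.frame(
  v1 = c(rep("A", 5), rep("B", 5), NA),
  v2 = c("X", "Y", "Y", "Z", "Z", "X", "Y", "Y", "Z", NA, "Z"),
  v3 = c(rep("a", 4), "c", "a", "b", NA, "c", "b", "c"),
  stringsAsFactors = TRUE
)

# Keep rows that are complete in v1 and v2 only; NAs in v3 are ignored.
keep <- complete.cases(df[, c("v1", "v2")])

# Naming value.var avoids the "Using v2 as value column" message.
res <- dcast(df[keep, ], v1 ~ v2, fun.aggregate = length,
             value.var = "v2", margins = TRUE)
print(res)
```

If this runs as expected, the printed table should match the one above, with 9 records in total.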

-------------------------------------
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77840-4352

-----Original Message-----
From: R-help [mailto:r-help-bounces at r-project.org] On Behalf Of Michael.Laviolette at dhhs.state.nh.us
Sent: Monday, June 22, 2015 8:47 AM
To: r-help at r-project.org
Subject: [R] Omitting NA's using dcast (reshape2 package)


I'm using the "dcast" function from Hadley's "reshape2" package to do some
tabulations. I can't get it to exclude NA's in the variables being
tabulated. Here's a simple example.

v1 <- c(rep("A", 5), rep("B", 5), NA)
v2 <- c("X", "Y", "Y", "Z", "Z", "X", "Y", "Y", "Z", NA, "Z")
v3 <- c(rep("a", 4), "c", "a", "b", NA, "c", "b", "c")
df <- data.frame(v1, v2, v3)
rm(v1, v2, v3)

library(reshape2)
dcast(df, v1 ~ v2, length, margins = TRUE)

#      v1 X Y Z NA (all)
# 1     A 1 2 2  0     5
# 2     B 1 2 1  1     5
# 3  <NA> 0 0 1  0     1
# 4 (all) 2 4 4  1    11
# The "drop" argument has no effect.
# na.omit(df) would skip every record with any missing value, including NAs in v3.

What I want is this:

#      v1 X Y Z (all)
# 1     A 1 2 2     5
# 2     B 1 2 1     4
# 3 (all) 2 4 3     9

Does anyone have any ideas?
Thanks,
Mike L.



