[R] Counting observations split by a factor when there are NAs in the data

Jenifer Larson-Hall jenifer at unt.edu
Mon Jul 10 21:24:12 CEST 2006


I am a very novice R user, a social scientist (linguist) who is trying
to learn to use R after being very familiar with SPSS. Please be kind!

My concern:
I cannot figure out a way to get an accurate count of observations of
one column of data split by a factor when there are NAs in the data.

I know how to use commands like tapply and summaryBy to obtain other
summary statistics I am interested in, such as the following:
tapply(RLWTEST, list(STATUS), mean, na.rm=TRUE)
summaryBy(RLWTEST~STATUS, data=lh.forgotten, FUN=c(mean, sd, min, max),
          na.rm=TRUE)

However, I know that with tapply I cannot use length to get a count
when there are NAs, because length counts the missing values too.
summaryBy appears to work the same way. I do know how to get a count of
the non-missing values in the entire column using sum:
sum(!is.na(lh.forgotten$RLWTEST))

However, this does not give me a count split up by my factor (STATUS). I
have looked through Dalgaard (2002) and Verzani (2005), and have
searched the help files, but with no luck.
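
To make the problem concrete, here is a minimal sketch on made-up data.
The data frame toy and its values are hypothetical and only stand in
for lh.forgotten. It shows length counting the NAs within each group,
and one possible base R way to get the per-group count: apply the same
sum(!is.na(...)) idea used above inside tapply. This is an illustration
under those assumptions, not necessarily the best or only approach.

```r
# Hypothetical toy data standing in for lh.forgotten (values are made up)
toy <- data.frame(
  STATUS  = factor(c("A", "A", "A", "B", "B", "C")),
  RLWTEST = c(10, NA, 12, 7, NA, NA)
)

# length() counts every observation in each group, NAs included
n_all <- tapply(toy$RLWTEST, toy$STATUS, length)

# Summing the logical vector !is.na(...) counts only non-missing values
n_valid <- tapply(!is.na(toy$RLWTEST), toy$STATUS, sum)

# Same count, written with an anonymous function on the original column
n_valid2 <- tapply(toy$RLWTEST, toy$STATUS, function(x) sum(!is.na(x)))

print(n_all)
print(n_valid)
```

With the real data, the toy columns would be replaced by
lh.forgotten$RLWTEST and lh.forgotten$STATUS.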

Thank you in advance for your help. I love R and am interested in making
it more accessible to social scientist types like me. I know it can do
everything SPSS can and more, but sometimes the very simplest things
seem to be a lot harder in R.

Jenifer

Dr. Jenifer Larson-Hall
Assistant Professor of Linguistics
University of North Texas
(940)369-8950


