[R] trouble calculating rates--sometimes the denominator is missing

Christopher W. Ryan cryan at binghamton.edu
Wed Mar 10 16:30:19 CET 2010


Every day I get a csv file containing the names of the 64 schools in our 
county, the number of students sent home ill, and the number of students 
absent (plus lots of other variables). The file is cumulative since fall 
of 2009. It is in "long" format: one line per school per day.

Each line is also supposed to contain the total number of students 
enrolled in the school. That number doesn't change often or much, so the 
same value is usually repeated on each line for each school. Thus 
calculating the proportion of students absent or sent home ill is easy 
(see the lines between the #### markers); here is the beginning of my 
code (my apologies for the word-wrapping, I use some long variable names):

setwd("C:/data/bchd/schoolsurveillance")
library(ggplot2)
library(doBy)
library(reshape)
data <- read.csv("C:/DATA/BCHD/schoolsurveillance/Broome_02MAR10.csv", 
header=TRUE, sep=",", fill=TRUE)
data$date <- as.character(data$ReportingDate)
data$date <- as.Date(data$date, format="%d%b%y")
####
data$PercentStudentsAbsent <- 
data$StudentsAbsentTotal/data$TotalStudentsEnrolled
data$PercentSentHome <- data$SentHomeTotal/data$TotalStudentsEnrolled
####
attach(data)

The problem is that sometimes, in some of the daily files, the 
TotalStudentsEnrolled field is left entirely blank--in every record. 
Unfortunately the data collection system is out of my hands, and still a 
little rough around the edges. The powers-that-be can put those numbers 
back in on the subsequent day, then my code runs fine. But if possible, 
I want to make my code less susceptible to this external "threat."

What would be a good way to "store up" the names of the 64 schools and 
their total enrollments (which are basically static), and then use those 
values for the denominators for the rates as calculated above (####), 
rather than relying on always having a complete, rectangular data file, 
with every line containing the necessary value for a denominator?
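
To make the idea concrete, here is a rough sketch of one way such a lookup could work, not a tested solution. It assumes the school name lives in a column called SchoolName (a hypothetical name; the real file may call it something else) and that a small reference table, here called enrollment with a made-up column EnrolledRef, was saved earlier from a file known to be complete. The toy numbers are invented for illustration only. A column that is blank in every record is read by read.csv as all NA, which is what the sketch tests for.

```r
# Toy stand-in for one daily file in which enrollment was left blank.
# SchoolName is a hypothetical column name; the counts are made up.
today <- data.frame(
  SchoolName            = c("School A", "School B", "School A", "School B"),
  StudentsAbsentTotal   = c(10, 20, 12, 18),
  SentHomeTotal         = c(1, 2, 0, 3),
  TotalStudentsEnrolled = NA_real_,
  stringsAsFactors      = FALSE
)

# Reference table of (basically static) enrollments, assumed to have been
# built once from a complete file. EnrolledRef is a hypothetical name.
enrollment <- data.frame(
  SchoolName       = c("School A", "School B"),
  EnrolledRef      = c(200, 400),
  stringsAsFactors = FALSE
)

# match() returns, for each daily row, the position of its school in the
# reference table, so the row order of the daily data is left untouched.
idx <- match(today$SchoolName, enrollment$SchoolName)

# Use the value from the file when it is present, otherwise fall back to
# the stored reference value.
denom <- ifelse(is.na(today$TotalStudentsEnrolled),
                enrollment$EnrolledRef[idx],
                today$TotalStudentsEnrolled)

today$PercentStudentsAbsent <- today$StudentsAbsentTotal / denom
today$PercentSentHome       <- today$SentHomeTotal / denom
```

Using match() rather than merge() is a deliberate choice in this sketch, since merge() may reorder rows. A school missing from the reference table would simply get NA as its rate, which seems safer than a silently wrong number.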

Thanks.
-- 
Christopher W. Ryan, MD
SUNY Upstate Medical University Clinical Campus at Binghamton
425 Robinson Street, Binghamton, NY  13904
cryanatbinghamtondotedu

"If you want to build a ship, don't drum up the men to gather wood, 
divide the work and give orders. Instead, teach them to yearn for the 
vast and endless sea."  [Antoine de St. Exupery]


