[R] Dropping "trailing zeroes" in longitudinal data

David Atkins datkins at u.washington.edu
Mon Apr 26 21:23:15 CEST 2010


Background: Our research group collected data from students via the web 
about their drinking habits (alcohol) over the last 90 days.  As you 
might guess, some students seem to have lost interest and completed some 
information but not all.  Unfortunately, the survey was programmed to 
"pre-populate" the fields with zeroes (to make it easier for students to 
complete).

Obviously, when we see a stretch of zeroes, we have no idea whether this 
is "true" data or not, but we'd like to at least do some sensitivity 
analyses by dropping "trailing zeroes" (i.e., when there are non-zero 
responses for some duration of the data that then "flat line" into all 
zeroes to the end of the time period).

I've included a toy dataset below.

Basically, we have the data in the "long" format, and what I'd like to 
do is subset the data.frame by deleting rows that occur at the end of a 
person's data that are all zeroes.  In a nutshell: starting at the end of 
a person's data (which, below, would be time = 10) and working backwards, 
select the rows that are continuously zero, stopping at the first 
non-zero response.

With the toy data, this would be the last 6 rows of ids #10 and #8 (for 
example).  I can begin to think about how I might do this via 
grep/regexp but am a bit stumped about how to translate that to this 
type of data.
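
For what it's worth, the regexp idea might translate roughly as follows 
for a single person's responses.  This is only a sketch under 
assumptions: the values in dv are made up for illustration, there are no 
missing values, and the responses are already sorted by time.

```r
## Sketch of the regexp idea on one person's responses (made-up values)
dv <- c(0, 2, 0, 1, 0, 0)

# Encode each time point as "0" (zero) or "1" (non-zero) in one string
s <- paste(as.integer(dv != 0), collapse = "")

# "0+$" matches a run of zeroes anchored at the end of the string;
# regexpr() returns the start of that run, or -1 if there is none
start <- as.integer(regexpr("0+$", s))

# Keep everything before the trailing run (or everything if no run)
n_keep <- if (start > 0) start - 1 else length(dv)
kept <- dv[seq_len(n_keep)]
kept
```

Zeroes at the start or in the middle of the series are kept; only the 
final run is dropped.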

Any thoughts appreciated.

cheers, Dave

### toy dataset
set.seed(123)
toy.df <- data.frame(id = factor(rep(1:10, each = 10)),
                     time = rep(1:10, 10),
                     dv = rnbinom(100, mu = 0.5, size = 100))
toy.df

library(lattice)

xyplot(dv ~ time | id, data = toy.df, type = c("g","l"))
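
One possible way to do the subsetting on the long-format data, offered 
as a sketch rather than a tested solution: within each id, reverse dv, 
take a running count of non-zero values, and keep only the rows where 
that count is positive.  The function name drop_trailing_zeroes, its 
arguments, and the small data.frame ex are made up for illustration, and 
the code assumes there are no missing values in dv.  Note that a person 
whose responses are all zero is dropped entirely under this rule.

```r
## Sketch: drop each person's trailing run of zeroes.
## drop_trailing_zeroes and its arguments are hypothetical names.
drop_trailing_zeroes <- function(df, id = "id", time = "time", dv = "dv") {
  # Make sure rows are in time order within each person
  df <- df[order(df[[id]], df[[time]]), ]
  # Reversed running count of non-zero values: it is 0 across the
  # trailing zeroes and positive from the last non-zero row backwards
  n_after <- ave(df[[dv]], df[[id]],
                 FUN = function(x) rev(cumsum(rev(x != 0))))
  df[n_after > 0, , drop = FALSE]
}

## Small made-up example with a known answer
ex <- data.frame(id   = factor(rep(c("a", "b"), each = 5)),
                 time = rep(1:5, 2),
                 dv   = c(0, 2, 0, 1, 0,  3, 0, 0, 0, 0))
trimmed <- drop_trailing_zeroes(ex)
trimmed
```

In ex, person "a" keeps times 1 to 4 and person "b" keeps only time 1. 
The same call on the toy data would be drop_trailing_zeroes(toy.df).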

-- 
Dave Atkins, PhD
Research Associate Professor
Department of Psychiatry and Behavioral Science
University of Washington
datkins at u.washington.edu

Center for the Study of Health and Risk Behaviors (CSHRB)		
1100 NE 45th Street, Suite 300 	
Seattle, WA  98105 	
206-616-3879 	
http://depts.washington.edu/cshrb/
(Mon-Wed)	

Center for Healthcare Improvement, for Addictions, Mental Illness,
   Medically Vulnerable Populations (CHAMMP)
325 9th Avenue, 2HH-15
Box 359911
Seattle, WA 98104
206-897-4210
http://www.chammp.org
(Thurs)


