[R] Manipulation of longitudinal data by row

marcel curlin marcelcurlin at gmail.com
Fri Dec 14 17:37:26 CET 2012


I have a dataset of the form below, consisting of one unique ID per
row, followed by a series of visit dates.  At each visit there are
values for 3 dichotomous variables. Of the 8 possible combinations of
the three variables, 4 are "abnormal" and the remaining 4 are
"normal". Everyone starts out abnormal, and then either continues to
be abnormal at subsequent visits, or resolves to a normal pattern at a
later visit. (I ignore reversion back to abnormal: once they are
normal, they are normal.)

I have to end up with 4 new columns indicating 1) the date of the last
completed visit (regardless of intervening NAs), 2) whether an ID
resolved or stayed abnormal, 3) if resolved, what the resolution
pattern was, and 4) what the date of resolution was. NAs always come in
groups of 4 (i.e. no visit date and no values for the 3 variables) and
are ignored.

Eventually I have to determine mean time to resolution, mean follow-up
time, etc., and I think I can do that, but the first part is a bit
beyond my coding skill. Suggestions appreciated.

tC <- textConnection("
ID V1Date V1a V1b V1c V2date V2a V2b V2c V3date V3a V3b V3c
001 4/5/12 Yes Yes No 6/18/12 Yes No Yes NA NA NA NA
002 1/22/12 No No Yes 7/5/12 Yes No Yes NA NA NA NA
003 4/5/12 Yes No No 9/4/12 Yes No Yes 11/1/12 Yes No Yes
004 8/18/12 Yes Yes Yes 9/22/12 Yes No Yes NA NA NA NA
005 9/6/12 Yes No No NA NA NA NA 12/4/12 Yes No Yes
")
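# Read every column as character so the IDs keep their leading zeros
# and the dates and Yes/No values are not converted to factors.
data1 <- read.table(tC, header=TRUE, colClasses="character")
close(tC)
rm(tC)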
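
For reference, here is a rough sketch of one way the four columns
described above might be derived. It rests on several assumptions that
are not stated in the post: the four "abnormal" patterns are taken to
be the four distinct first-visit patterns in the sample (since everyone
starts out abnormal), dates are month/day/two-digit-year, and each
visit occupies four consecutive columns after ID. The names pattern,
abnormal, last_visit, resolved, resolution_pattern and resolution_date
are illustrative only.

```r
# Sample data from the post, read as character to keep IDs such as "001".
txt <- "
ID V1Date V1a V1b V1c V2date V2a V2b V2c V3date V3a V3b V3c
001 4/5/12 Yes Yes No 6/18/12 Yes No Yes NA NA NA NA
002 1/22/12 No No Yes 7/5/12 Yes No Yes NA NA NA NA
003 4/5/12 Yes No No 9/4/12 Yes No Yes 11/1/12 Yes No Yes
004 8/18/12 Yes Yes Yes 9/22/12 Yes No Yes NA NA NA NA
005 9/6/12 Yes No No NA NA NA NA 12/4/12 Yes No Yes
"
data1 <- read.table(text = txt, header = TRUE, colClasses = "character")

n <- nrow(data1)
n_visits <- (ncol(data1) - 1) / 4
# Assumed layout: date column of visit k sits at position 2 + 4 * (k - 1).
date_cols <- 2 + 4 * (seq_len(n_visits) - 1)

# Character matrix of visit dates, one row per ID, one column per visit.
dates <- as.matrix(data1[, date_cols])

# Matrix of three-variable patterns per visit; NA where the visit is missing.
pattern <- sapply(date_cols, function(j) {
  p <- paste(data1[[j + 1]], data1[[j + 2]], data1[[j + 3]], sep = "/")
  p[is.na(data1[[j]])] <- NA
  p
})

# ASSUMPTION: abnormal patterns inferred from the first visits in the sample.
abnormal <- c("Yes/Yes/No", "No/No/Yes", "Yes/No/No", "Yes/Yes/Yes")
is_normal <- matrix(!is.na(pattern) & !(pattern %in% abnormal), nrow = n)

# Index of the last completed visit and of the first normal visit (NA if none).
last_idx   <- apply(!is.na(dates), 1, function(x) max(which(x)))
first_norm <- apply(is_normal, 1, function(x) which(x)[1])

to_date <- function(x) as.Date(x, format = "%m/%d/%y")
rows <- seq_len(n)

data1$last_visit         <- to_date(dates[cbind(rows, last_idx)])
data1$resolved           <- !is.na(first_norm)
data1$resolution_pattern <- pattern[cbind(rows, first_norm)]
data1$resolution_date    <- to_date(dates[cbind(rows, first_norm)])

# Follow-up summaries, measured in days from the first visit.
first_visit <- to_date(data1[[2]])
mean_days_to_resolution <- mean(as.numeric(data1$resolution_date - first_visit),
                                na.rm = TRUE)
mean_days_follow_up <- mean(as.numeric(data1$last_visit - first_visit))
```

Because the first normal visit is located with which(x)[1], any later
reversion to an abnormal pattern is ignored, as required. IDs that
never resolve get NA in resolution_pattern and resolution_date. If the
real abnormal patterns differ from the ones inferred here, only the
abnormal vector should need changing.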


