[R] data recoding problem

Williams Scott Scott.Williams at petermac.org
Mon Apr 23 10:14:20 CEST 2007


Hi R experts,

I have a data recoding problem I can't get my head around - I am not that
great at the subsetting syntax. I have a dataset of longitudinal
toxicity data (for multistate modelling), and I also want to fit a
simple Kaplan-Meier curve of the time to first toxic event.

The data for 2 cases presently look like this (one with an event, the
other without), where id identifies each person on study, t is the
follow-up time and event is the status:


> tox
 id      t       event
 PMC011  0.000     0
 PMC011  3.154     0
 PMC011  5.914     0
 PMC011 12.353     0
 PMC011 18.103     1
 PMC011 24.312     0
 PMC011 30.029     0
 PMC011 47.967     0
 PMC011 96.953     0
 PMC016  0.000     0
 PMC016  3.943     0
 PMC016  5.782     0
 PMC016 11.762     0
 PMC016 17.741     0
 PMC016 23.951     0
 PMC016 28.353     0
 PMC016 44.747     0
 PMC016 89.692     0 

So what I need is output in the same column format, with one row for
each unique value of id:

PMC011 18.103     1
PMC016 89.692     0

In my head, I would do this by taking each unique value of id (each
unique case) and looking down the event column for that case. If there
is no event (every event==0), I would go to the time column (t), find
the max value and paste this time along with a 0 for event. If there
is an event, I would instead find the minimum time associated with an
event and paste that across with the event marker. I am sure someone
out there can point me in the right direction to do this without tedious
and slow loops. Any help greatly appreciated.
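
The rule described above can be sketched in base R roughly as follows.
This is only one possible approach, not necessarily the best one: it
assumes the data frame is called tox with columns id, t and event as
shown, that event is coded 0/1, and first_event() is a hypothetical
helper name made up for illustration (it is not from any package).

```r
# Rebuild the two example cases from the post as a data frame.
tox <- data.frame(
  id = rep(c("PMC011", "PMC016"), each = 9),
  t = c(0.000, 3.154, 5.914, 12.353, 18.103, 24.312, 30.029, 47.967, 96.953,
        0.000, 3.943, 5.782, 11.762, 17.741, 23.951, 28.353, 44.747, 89.692),
  event = c(0, 0, 0, 0, 1, 0, 0, 0, 0, rep(0, 9)),
  stringsAsFactors = FALSE
)

# first_event() is a hypothetical helper (an assumption, not a package function).
# Given the rows for one subject, it returns a single row:
#   - the earliest time with event == 1 if the subject had any event,
#   - otherwise the latest follow-up time with event = 0.
first_event <- function(d) {
  had_event <- any(d$event == 1)
  data.frame(
    id = d$id[1],
    t = if (had_event) min(d$t[d$event == 1]) else max(d$t),
    event = as.numeric(had_event),
    stringsAsFactors = FALSE
  )
}

# Split the data by id, summarise each piece, and stack the results.
# No explicit for-loop over rows is needed.
first_tox <- do.call(rbind, lapply(split(tox, tox$id), first_event))
rownames(first_tox) <- NULL
print(first_tox)
```

For the two cases above this should give one row per id, with t = 18.103
and event = 1 for PMC011, and t = 89.692 and event = 0 for PMC016. If the
survival package is installed, a result in this shape could then be
passed to something like survfit(Surv(t, event) ~ 1, data = first_tox)
for the Kaplan-Meier curve.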

Cheers

Scott
_____________________________
Dr. Scott Williams
MBBS BScMed FRANZCR
Radiation Oncologist
Peter MacCallum Cancer Centre
Melbourne, Australia
ph +61 3 9656 1111
fax +61 3 9656 1424
scott.williams at petermac.org


