[R] cohort sampling

Terry Therneau therneau at mayo.edu
Tue Jul 1 14:55:00 CEST 2008


> Now that we have case cohort model , we have 1000 people and 50 cases

>   Let the first       10 cases occur at the same time
>        second     10         "
>        third       10         "
>        fourth       10        "
>        fifth        10        "
> How easy is it to randomly sample 50 different
> cohort controls for each group?

>That is:

>randomly sample 50 cohort controls for the first 10 cases from all 1,000

>randomly sample 50 new cohort controls for the second 10 cases from the
> surviving 990
...

---------------
 
 Your message actually describes a nested case-control design; a case-cohort
design would instead sample its subcohort from all subjects at the start of the
study.  Note that it is important in these designs not to look into the future:
someone who becomes a case at time t+s is still eligible to be a control at
time t.
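  For contrast, a case-cohort sample could be drawn roughly as follows.  This
is only a sketch: the data are simulated, the subcohort size of 100 is
arbitrary, and the names 'subcohort' and 'in.sample' are made up for the
illustration.

```r
set.seed(1)
n <- 1000
time <- rexp(n)                     # simulated follow-up times (assumption)
status <- rep(0, n)
status[sample(n, 50)] <- 1          # 50 cases, as in the question

# The subcohort is drawn once, at baseline, without regard to who
# later becomes a case.
subcohort <- sample(n, 100)         # subcohort size 100 is arbitrary

# The analysis data are the subcohort plus every case in the full cohort.
in.sample <- (1:n) %in% subcohort | status == 1
table(in.subcohort = (1:n) %in% subcohort, case = status)
```

The sampling here happens once and ignores the event times, which is the main
difference from the per-event-time sampling in the question.  The survival
package has a cch() function intended for fitting data sampled this way.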
  Here is some sample code; I am sure that others can do better.  Assume
'time' is the follow-up time for each subject, 'status' is 1 if there was an
event at the last follow-up and 0 otherwise, and x1, x2 are covariates.  Assume
time>0 for all subjects, since the code uses chosen==0 to mark subjects who
have not yet been sampled.
  
  library(survival)                     # provides coxph, Surv and strata
  n <- length(time)
  casetime <- sort(unique(time[status==1]))   # event times, earliest first
  chosen <- rep(0,n)                    # marks the case and control groups
  for (i in casetime) {
      cases <- (time==i & status==1)
      potential <- which(!cases & chosen==0 & time >= i)  # potential controls
      # sample 50 (or all, if fewer remain); indexing into 'potential'
      # avoids sample() treating a single candidate x as 1:x
      new.control <- potential[sample.int(length(potential),
                                          min(50, length(potential)))]
      chosen[new.control] <- i                # remember who was chosen
      chosen[cases] <- i                      # link them to the right case
      }
  fit <- coxph(Surv(time, status) ~ x1 + x2 + strata(chosen),
               subset= (chosen > 0))
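
  To try the sampling step without real data, something like the following
could be used.  It is a sketch under stated assumptions: the data are simulated
to mimic the question (1000 subjects, 10 tied cases at each of 5 event times),
and 'nested.sample' is a hypothetical helper that wraps a variant of the loop
above, with the event times sorted and a guard for fewer than m candidates.

```r
set.seed(42)
n <- 1000
status <- rep(0, n)
status[1:50] <- 1
time <- runif(n, 0.5, 6)                 # censoring times for the non-cases
time[1:50] <- rep(1:5, each = 10)        # 10 tied cases at each of 5 times

# Hypothetical helper; m is the number of controls drawn per event time.
nested.sample <- function(time, status, m = 50) {
    n <- length(time)
    chosen <- rep(0, n)                  # 0 = not (yet) in any sampled set
    for (i in sort(unique(time[status == 1]))) {
        cases <- (time == i & status == 1)
        potential <- which(!cases & chosen == 0 & time >= i)
        k <- min(m, length(potential))
        chosen[potential[sample.int(length(potential), k)]] <- i
        chosen[cases] <- i
    }
    chosen
}

chosen <- nested.sample(time, status)
table(chosen, status)                    # rows: event time (0 = never sampled)
# Every sampled subject should still be under observation at its set's time.
all(time[chosen > 0] >= chosen[chosen > 0])
```

Note that a subject drawn as a control who later becomes a case is relabelled
to its own event time by the chosen[cases] line, so an earlier set can end up
with slightly fewer than m controls.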
  	        
  	       
  	       
  	Terry Therneau


